-
2
-
-
0025543907
-
Speech recognition in noisy environments with the aid of microphone arrays
-
D Van Compernolle, W Ma, F Xie, and M Van Diest, "Speech recognition in noisy environments with the aid of microphone arrays," Speech Commun., vol. 9, pp. 433-442, 1990.
-
(1990)
Speech Commun.
, vol.9
, pp. 433-442
-
-
Van Compernolle, D.1
Ma, W.2
Xie, F.3
Van Diest, M.4
-
3
-
-
0029725933
-
Microphone-array speech recognition via incremental MAP training
-
JE Adcock, Y Gotoh, DJ Mashao, and HF Silverman, "Microphone-array speech recognition via incremental MAP training," in Proc IEEE ICASSP, 1996, pp. 897-900.
-
(1996)
Proc IEEE ICASSP
, pp. 897-900
-
-
Adcock, J.E.1
Gotoh, Y.2
Mashao, D.J.3
Silverman, H.F.4
-
4
-
-
0030676367
-
Microphone array based speech recognition with different talker-array positions
-
M Omologo, M Matassoni, P Svaizer, and D Giuliani, "Microphone array based speech recognition with different talker-array positions," in Proc IEEE ICASSP, 1997, pp. 227-230.
-
(1997)
Proc IEEE ICASSP
, pp. 227-230
-
-
Omologo, M.1
Matassoni, M.2
Svaizer, P.3
Giuliani, D.4
-
5
-
-
33846217002
-
The multi-channel Wall Street Journal audio visual corpus (MC-WSJ-AV): Specification and initial experiments
-
M Lincoln, I McCowan, J Vepa, and HK Maganti, "The multi-channel Wall Street Journal audio visual corpus (MC-WSJ-AV): Specification and initial experiments," in Proc IEEE ASRU, 2005.
-
(2005)
Proc IEEE ASRU
-
-
Lincoln, M.1
McCowan, I.2
Vepa, J.3
Maganti, H.K.4
-
6
-
-
84890443591
-
Recognition of overlapping speech using digital MEMS microphone arrays
-
E Zwyssig, F Faubel, S Renals, and M Lincoln, "Recognition of overlapping speech using digital MEMS microphone arrays," in Proc IEEE ICASSP, 2013.
-
(2013)
Proc IEEE ICASSP
-
-
Zwyssig, E.1
Faubel, F.2
Renals, S.3
Lincoln, M.4
-
7
-
-
85032751613
-
Making machines understand us in reverberant rooms: Robustness against reverberation for automatic speech recognition
-
T Yoshioka, A Sehr, M Delcroix, K Kinoshita, R Maas, T Nakatani, and W Kellermann, "Making machines understand us in reverberant rooms: Robustness against reverberation for automatic speech recognition," IEEE Signal Process. Mag., vol. 29, no. 6, pp. 114-126, 2012.
-
(2012)
IEEE Signal Process. Mag.
, vol.29
, Issue.6
, pp. 114-126
-
-
Yoshioka, T.1
Sehr, A.2
Delcroix, M.3
Kinoshita, K.4
Maas, R.5
Nakatani, T.6
Kellermann, W.7
-
8
-
-
85032750883
-
Microphone array processing for distant speech recognition: From close-talking microphones to farfield sensors
-
K Kumatani, J McDonough, and B Raj, "Microphone array processing for distant speech recognition: From close-talking microphones to farfield sensors," IEEE Signal Process. Mag., vol. 29, no. 6, pp. 127-140, 2012.
-
(2012)
IEEE Signal Process. Mag.
, vol.29
, Issue.6
, pp. 127-140
-
-
Kumatani, K.1
McDonough, J.2
Raj, B.3
-
9
-
-
0141814662
-
The ICSI meeting corpus
-
A Janin, D Baron, J Edwards, D Ellis, D Gelbart, N Morgan, B Peskin, T Pfau, E Shriberg, A Stolcke, and C Wooters, "The ICSI meeting corpus," in Proc IEEE ICASSP, 2003, pp. I364-I367.
-
(2003)
Proc IEEE ICASSP
-
-
Janin, A.1
Baron, D.2
Edwards, J.3
Ellis, D.4
Gelbart, D.5
Morgan, N.6
Peskin, B.7
Pfau, T.8
Shriberg, E.9
Stolcke, A.10
Wooters, C.11
-
10
-
-
35948981862
-
Unleashing the killer corpus: Experiences in creating the multi-everything AMI Meeting Corpus
-
J Carletta, "Unleashing the killer corpus: experiences in creating the multi-everything AMI Meeting Corpus," Language Resources & Evaluation, vol. 41, pp. 181-190, 2007.
-
(2007)
Language Resources & Evaluation
, vol.41
, pp. 181-190
-
-
Carletta, J.1
-
11
-
-
84893665400
-
The SRI-ICSI Spring 2007 meeting and lecture recognition system
-
A Stolcke, X Anguera, K Boakye, O Cetin, A Janin, M Magimai-Doss, C Wooters, and J Zheng, "The SRI-ICSI Spring 2007 meeting and lecture recognition system," in Multimodal Technologies for Perception of Humans, R Stiefelhagen, R Bowers, and J Fiscus, Eds., number 4625 in LNCS, pp. 373-389. Springer, 2008.
-
(2008)
Multimodal Technologies for Perception of Humans
, Issue.4625
, pp. 373-389
-
-
Stolcke, A.1
Anguera, X.2
Boakye, K.3
Cetin, O.4
Janin, A.5
Magimai-Doss, M.6
Wooters, C.7
Zheng, J.8
-
12
-
-
85008520364
-
Transcribing meetings with the AMIDA systems
-
T Hain, L Burget, J Dines, PN Garner, F Grezl, AE Hannani, M Huijbregts, M Karafiat, M Lincoln, and V Wan, "Transcribing meetings with the AMIDA systems," IEEE Trans. Audio, Speech, & Language Process., vol. 20, pp. 486-498, 2012.
-
(2012)
IEEE Trans. Audio, Speech, & Language Process.
, vol.20
, pp. 486-498
-
-
Hain, T.1
Burget, L.2
Dines, J.3
Garner, P.N.4
Grezl, F.5
Hannani, A.E.6
Huijbregts, M.7
Karafiat, M.8
Lincoln, M.9
Wan, V.10
-
13
-
-
0036296863
-
Minimum phone error and I-smoothing for improved discriminative training
-
D Povey and PC Woodland, "Minimum phone error and I-smoothing for improved discriminative training," in Proc IEEE ICASSP, 2002, pp. 105-108.
-
(2002)
Proc IEEE ICASSP
, pp. 105-108
-
-
Povey, D.1
Woodland, P.C.2
-
14
-
-
0030362995
-
A compact model for speaker-adaptive training
-
T Anastasakos, J McDonough, R Schwartz, and J Makhoul, "A compact model for speaker-adaptive training," in Proc ICSLP, 1996, pp. 1137-1140.
-
(1996)
Proc ICSLP
, pp. 1137-1140
-
-
Anastasakos, T.1
McDonough, J.2
Schwartz, R.3
Makhoul, J.4
-
15
-
-
34547548235
-
Probabilistic and bottle-neck features for LVCSR of meetings
-
F Grézl, M Karafiát, S Kontár, and J Černocký, "Probabilistic and bottle-neck features for LVCSR of meetings," in Proc IEEE ICASSP, 2007, vol. 4, pp. IV-757-IV-760.
-
(2007)
Proc IEEE ICASSP
, vol.4
-
-
Grezl, F.1
Karafiat, M.2
Kontar, S.3
Cernocky, J.4
-
16
-
-
50449092852
-
Bridging the gap: Towards a unified framework for handsfree speech recognition using microphone arrays
-
ML Seltzer, "Bridging the gap: Towards a unified framework for handsfree speech recognition using microphone arrays," in Proc HSCMA, 2008.
-
(2008)
Proc HSCMA
-
-
Seltzer, M.L.1
-
17
-
-
4344607755
-
Likelihood-maximizing beamforming for robust hands-free speech recognition
-
M Seltzer, B Raj, and R Stern, "Likelihood-maximizing beamforming for robust hands-free speech recognition," IEEE Trans. Speech & Audio Process., vol. 12, pp. 489-498, 2004.
-
(2004)
IEEE Trans. Speech & Audio Process.
, vol.12
, pp. 489-498
-
-
Seltzer, M.1
Raj, B.2
Stern, R.3
-
18
-
-
50449096811
-
Subband likelihood-maximizing beamforming for speech recognition in reverberant environments
-
M Seltzer and R Stern, "Subband likelihood-maximizing beamforming for speech recognition in reverberant environments," IEEE Trans. Audio, Speech, & Lang. Process., vol. 14, pp. 2109-2121, 2006.
-
(2006)
IEEE Trans. Audio, Speech, & Lang. Process.
, vol.14
, pp. 2109-2121
-
-
Seltzer, M.1
Stern, R.2
-
19
-
-
84865729496
-
An analysis of automatic speech recognition with multiple microphones
-
D Marino and T Hain, "An analysis of automatic speech recognition with multiple microphones," in Proc Interspeech, 2011, pp. 1281-1284.
-
(2011)
Proc Interspeech
, pp. 1281-1284
-
-
Marino, D.1
Hain, T.2
-
20
-
-
85032751458
-
Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups
-
G Hinton, L Deng, D Yu, GE Dahl, A-R Mohamed, N Jaitly, A Senior, V Vanhoucke, P Nguyen, TN Sainath, and B Kingsbury, "Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups," IEEE Signal Process. Mag., vol. 29, no. 6, pp. 82-97, 2012.
-
(2012)
IEEE Signal Process. Mag.
, vol.29
, Issue.6
, pp. 82-97
-
-
Hinton, G.1
Deng, L.2
Yu, D.3
Dahl, G.E.4
Mohamed, A.-R.5
Jaitly, N.6
Senior, A.7
Vanhoucke, V.8
Nguyen, P.9
Sainath, T.N.10
Kingsbury, B.11
-
22
-
-
0028194709
-
Connectionist probability estimators in HMM speech recognition
-
S Renals, N Morgan, H Bourlard, M Cohen, and H Franco, "Connectionist probability estimators in HMM speech recognition," IEEE Trans Speech & Audio Process., vol. 2, pp. 161-174, 1994.
-
(1994)
IEEE Trans Speech & Audio Process
, vol.2
, pp. 161-174
-
-
Renals, S.1
Morgan, N.2
Bourlard, H.3
Cohen, M.4
Franco, H.5
-
23
-
-
0029308753
-
Neural networks for statistical recognition of continuous speech
-
N Morgan and H Bourlard, "Neural networks for statistical recognition of continuous speech," Proc IEEE, vol. 83, pp. 742-772, 1995.
-
(1995)
Proc IEEE
, vol.83
, pp. 742-772
-
-
Morgan, N.1
Bourlard, H.2
-
24
-
-
0036567797
-
Connectionist speech recognition of broadcast news
-
AJ Robinson, GD Cook, DPW Ellis, E Fosler-Lussier, SJ Renals, and DAG Williams, "Connectionist speech recognition of broadcast news," Speech Commun., vol. 37, pp. 27-45, 2002.
-
(2002)
Speech Commun.
, vol.37
, pp. 27-45
-
-
Robinson, A.J.1
Cook, G.D.2
Ellis, D.P.W.3
Fosler-Lussier, E.4
Renals, S.J.5
Williams, D.A.G.6
-
25
-
-
84858972572
-
Making deep belief networks effective for large vocabulary continuous speech recognition
-
TN Sainath, B Kingsbury, B Ramabhadran, P Fousek, P Novak, and A Mohamed, "Making deep belief networks effective for large vocabulary continuous speech recognition," in Proc IEEE ASRU, 2011.
-
(2011)
Proc IEEE ASRU
-
-
Sainath, T.N.1
Kingsbury, B.2
Ramabhadran, B.3
Fousek, P.4
Novak, P.5
Mohamed, A.6
-
26
-
-
84055222005
-
Context-dependent pre-trained deep neural networks for large-vocabulary speech recognition
-
GE Dahl, D Yu, L Deng, and A Acero, "Context-dependent pre-trained deep neural networks for large-vocabulary speech recognition," IEEE Trans Audio, Speech & Lang. Process., vol. 20, pp. 30-42, 2012.
-
(2012)
IEEE Trans Audio, Speech & Lang Process
, vol.20
, pp. 30-42
-
-
Dahl, G.E.1
Yu, D.2
Deng, L.3
Acero, A.4
-
27
-
-
84893704659
-
Hybrid acoustic models for distant and multichannel large vocabulary speech recognition
-
P Swietojanski, A Ghoshal, and S Renals, "Hybrid acoustic models for distant and multichannel large vocabulary speech recognition," in Proc IEEE ASRU, 2013.
-
(2013)
Proc IEEE ASRU
-
-
Swietojanski, P.1
Ghoshal, A.2
Renals, S.3
-
28
-
-
84874282188
-
Improving wideband speech recognition using mixed-bandwidth training data in CD-DNN-HMM
-
J Li, D Yu, J-T Huang, and Y Gong, "Improving wideband speech recognition using mixed-bandwidth training data in CD-DNN-HMM," in Proc IEEE SLT, 2012, pp. 131-136.
-
(2012)
Proc IEEE SLT
, pp. 131-136
-
-
Li, J.1
Yu, D.2
Huang, J.-T.3
Gong, Y.4
-
29
-
-
84890471125
-
On rectified linear units for speech processing
-
MD Zeiler, M Ranzato, R Monga, M Mao, K Yang, QV Le, P Nguyen, A Senior, V Vanhoucke, J Dean, and GE Hinton, "On rectified linear units for speech processing," in Proc IEEE ICASSP, 2013.
-
(2013)
Proc IEEE ICASSP
-
-
Zeiler, M.D.1
Ranzato, M.2
Monga, R.3
Mao, M.4
Yang, K.5
Le, Q.V.6
Nguyen, P.7
Senior, A.8
Vanhoucke, V.9
Dean, J.10
Hinton, G.E.11
-
30
-
-
84893651518
-
Deep maxout neural networks for speech recognition
-
M Cai, Y Shi, and J Liu, "Deep maxout neural networks for speech recognition," in Proc ASRU, 2013.
-
(2013)
Proc ASRU
-
-
Cai, M.1
Shi, Y.2
Liu, J.3
-
31
-
-
84893701756
-
Deep maxout networks for low-resource speech recognition
-
Y Miao, F Metze, and S Rawat, "Deep maxout networks for low-resource speech recognition," in Proc. IEEE ASRU, 2013.
-
(2013)
Proc. IEEE ASRU
-
-
Miao, Y.1
Metze, F.2
Rawat, S.3
-
34
-
-
84867605836
-
Applying convolutional neural networks concepts to hybrid NN-HMM model for speech recognition
-
O Abdel-Hamid, A-R Mohamed, J Hui, and G Penn, "Applying convolutional neural networks concepts to hybrid NN-HMM model for speech recognition," in Proc IEEE ICASSP, 2012, pp. 4277-4280.
-
(2012)
Proc IEEE ICASSP
, pp. 4277-4280
-
-
Abdel-Hamid, O.1
Mohamed, A.-R.2
Hui, J.3
Penn, G.4
-
35
-
-
0032203257
-
Gradient-based learning applied to document recognition
-
Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, "Gradient-based learning applied to document recognition," Proc IEEE, vol. 86, pp. 2278-2324, 1998.
-
(1998)
Proc IEEE
, vol.86
, pp. 2278-2324
-
-
Lecun, Y.1
Bottou, L.2
Bengio, Y.3
Haffner, P.4
-
36
-
-
84893654379
-
Improvements to deep convolutional neural networks for LVCSR
-
TN Sainath, B Kingsbury, A Mohamed, GE Dahl, G Saon, H Soltau, T Beran, AY Aravkin, and B Ramabhadran, "Improvements to deep convolutional neural networks for LVCSR," in Proc IEEE ASRU, 2013.
-
(2013)
Proc IEEE ASRU
-
-
Sainath, T.N.1
Kingsbury, B.2
Mohamed, A.3
Dahl, G.E.4
Saon, G.5
Soltau, H.6
Beran, T.7
Aravkin, A.Y.8
Ramabhadran, B.9
-
37
-
-
0025254722
-
A time-delay neural network architecture for isolated word recognition
-
KJ Lang, AH Waibel, and GE Hinton, "A time-delay neural network architecture for isolated word recognition," Neural Networks, vol. 3, pp. 23-43, 1990.
-
(1990)
Neural Networks
, vol.3
, pp. 23-43
-
-
Lang, K.J.1
Waibel, A.H.2
Hinton, G.E.3
-
38
-
-
84990059834
-
Rectified linear units improve restricted Boltzmann machines
-
V Nair and G Hinton, "Rectified linear units improve restricted Boltzmann machines," in Proc ICML, 2010, pp. 131-136.
-
(2010)
Proc ICML
, pp. 131-136
-
-
Nair, V.1
Hinton, G.2
-
39
-
-
84897543523
-
Maxout networks
-
IJ Goodfellow, D Warde-Farley, M Mirza, A Courville, and Y Bengio, "Maxout networks," in Proc ICML, 2013.
-
(2013)
Proc ICML
-
-
Goodfellow, I.J.1
Warde-Farley, D.2
Mirza, M.3
Courville, A.4
Bengio, Y.5
-
41
-
-
84903707061
-
Multiple dimension Levenshtein edit distance calculations for evaluating ASR systems during simultaneous speech
-
JG Fiscus, J Ajot, N Radde, and C Laprun, "Multiple dimension Levenshtein edit distance calculations for evaluating ASR systems during simultaneous speech," in Proc LREC, 2006.
-
(2006)
Proc LREC
-
-
Fiscus, J.G.1
Ajot, J.2
Radde, N.3
Laprun, C.4
-
42
-
-
84874276847
-
The Kaldi speech recognition toolkit
-
D Povey, A Ghoshal, G Boulianne, L Burget, O Glembek, N Goel, M Hannemann, P Motlicek, Y Qian, P Schwarz, J Silovský, G Stemmer, and K Veselý, "The Kaldi speech recognition toolkit," in Proc IEEE ASRU, 2011.
-
(2011)
Proc IEEE ASRU
-
-
Povey, D.1
Ghoshal, A.2
Boulianne, G.3
Burget, L.4
Glembek, O.5
Goel, N.6
Hannemann, M.7
Motlicek, P.8
Qian, Y.9
Schwarz, P.10
Silovsky, J.11
Stemmer, G.12
Vesely, K.13
-
43
-
-
84893401626
-
-
ArXiv: 1308.4214
-
IJ Goodfellow, D Warde-Farley, P Lamblin, V Dumoulin, M Mirza, R Pascanu, J Bergstra, F Bastien, and Y Bengio, "Pylearn2: a machine learning research library," arXiv:1308.4214, 2013.
-
(2013)
Pylearn2: A Machine Learning Research Library
-
-
Goodfellow, I.J.1
Warde-Farley, D.2
Lamblin, P.3
Dumoulin, V.4
Mirza, M.5
Pascanu, R.6
Bergstra, J.7
Bastien, F.8
Bengio, Y.9
-
44
-
-
79951563340
-
Understanding the difficulty of training deep feedforward neural networks
-
X Glorot and Y Bengio, "Understanding the difficulty of training deep feedforward neural networks," in Proc AISTATS, 2010.
-
(2010)
Proc AISTATS
-
-
Glorot, X.1
Bengio, Y.2
-
45
-
-
33745805403
-
A fast learning algorithm for deep belief nets
-
G Hinton, S Osindero, and Y Teh, "A fast learning algorithm for deep belief nets," Neural Computation, vol. 18, pp. 1527-1554, 2006.
-
(2006)
Neural Computation
, vol.18
, pp. 1527-1554
-
-
Hinton, G.1
Osindero, S.2
Teh, Y.3
-
46
-
-
84863380535
-
Unsupervised feature learning for audio classification using convolutional deep belief networks
-
H Lee, P Pham, Y Largman, and A Ng, "Unsupervised feature learning for audio classification using convolutional deep belief networks," in Proc NIPS 22, 2009, pp. 1096-1104.
-
(2009)
Proc NIPS
, vol.22
, pp. 1096-1104
-
-
Lee, H.1
Pham, P.2
Largman, Y.3
Ng, A.4
-
47
-
-
84864073449
-
Greedy layer-wise training of deep networks
-
Y Bengio, P Lamblin, D Popovici, and H Larochelle, "Greedy layer-wise training of deep networks," in Proc NIPS 19, 2007, pp. 153-160.
-
(2007)
Proc NIPS
, vol.19
, pp. 153-160
-
-
Bengio, Y.1
Lamblin, P.2
Popovici, D.3
Larochelle, H.4
-
48
-
-
51449120120
-
Boosted MMI for model and feature-space discriminative training
-
D Povey, D Kanevsky, B Kingsbury, B Ramabhadran, G Saon, and K Visweswariah, "Boosted MMI for model and feature-space discriminative training," in Proc IEEE ICASSP, 2008, pp. 4057-4060.
-
(2008)
Proc IEEE ICASSP
, pp. 4057-4060
-
-
Povey, D.1
Kanevsky, D.2
Kingsbury, B.3
Ramabhadran, B.4
Saon, G.5
Visweswariah, K.6
-
49
-
-
0032638856
-
Semi-tied covariance matrices for hidden Markov models
-
MJF Gales, "Semi-tied covariance matrices for hidden Markov models," IEEE Trans Speech and Audio Process., vol. 7, pp. 272-281, 1999.
-
(1999)
IEEE Trans Speech and Audio Process
, vol.7
, pp. 272-281
-
-
Gales, M.J.F.1
-
50
-
-
50449086237
-
Acoustic beamforming for speaker diarization of meetings
-
X Anguera, C Wooters, and J Hernando, "Acoustic beamforming for speaker diarization of meetings," IEEE Trans. Audio, Speech, & Lang. Process., vol. 15, pp. 2011-2021, 2007.
-
(2007)
IEEE Trans. Audio, Speech, & Lang. Process.
, vol.15
, pp. 2011-2021
-
-
Anguera, X.1
Wooters, C.2
Hernando, J.3