-
2
-
-
80051654520
-
Making the most from multiple microphones in meeting recognition
-
A. Stolcke, "Making the most from multiple microphones in meeting recognition," in Proc. IEEE ICASSP, 2011
-
(2011)
Proc. IEEE ICASSP
-
-
Stolcke, A.1
-
3
-
-
85032750883
-
Microphone array processing for distant speech recognition: From close-talking microphones to far-field sensors
-
K. Kumatani, J. McDonough, and B. Raj, "Microphone array processing for distant speech recognition: From close-talking microphones to far-field sensors," IEEE Signal Process. Mag., vol. 29, no. 6, pp. 127-140, 2012
-
(2012)
IEEE Signal Process. Mag.
, vol.29
, Issue.6
, pp. 127-140
-
-
Kumatani, K.1
McDonough, J.2
Raj, B.3
-
4
-
-
85008520364
-
Transcribing meetings with the AMIDA systems
-
T. Hain, L. Burget, J. Dines, P. N. Garner, F. Grezl, A. E. Hannani, M. Huijbregts, M. Karafiat, M. Lincoln, and V. Wan, "Transcribing meetings with the AMIDA systems," in IEEE Trans. Audio, Speech, Lang. Process., 2012, vol. 20, no. 2, pp. 486-498
-
(2012)
IEEE Trans. Audio, Speech, Lang. Process.
, vol.20
, Issue.2
, pp. 486-498
-
-
Hain, T.1
Burget, L.2
Dines, J.3
Garner, P.N.4
Grezl, F.5
Hannani, A.E.6
Huijbregts, M.7
Karafiat, M.8
Lincoln, M.9
Wan, V.10
-
5
-
-
85032751458
-
Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups
-
G. Hinton, L. Deng, D. Yu, G. E. Dahl, A.-R. Mohamed, N. Jaitly, A. Senior, V. Vanhoucke, P. Nguyen, T. N. Sainath, and B. Kingsbury, "Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups," IEEE Signal Process. Mag., vol. 29, no. 6, pp. 82-97, 2012
-
(2012)
IEEE Signal Process. Mag.
, vol.29
, Issue.6
, pp. 82-97
-
-
Hinton, G.1
Deng, L.2
Yu, D.3
Dahl, G.E.4
Mohamed, A.-R.5
Jaitly, N.6
Senior, A.7
Vanhoucke, V.8
Nguyen, P.9
Sainath, T.N.10
Kingsbury, B.11
-
7
-
-
0028194709
-
Connectionist probability estimators in HMM speech recognition
-
S. Renals, N. Morgan, H. Bourlard, M. Cohen, and H. Franco, "Connectionist probability estimators in HMM speech recognition," IEEE Trans. Speech Audio Process., vol. 2, no. 1, pp. 161-174, 1994
-
(1994)
IEEE Trans. Speech Audio Process.
, vol.2
, Issue.1
, pp. 161-174
-
-
Renals, S.1
Morgan, N.2
Bourlard, H.3
Cohen, M.4
Franco, H.5
-
8
-
-
0029308753
-
Neural networks for statistical recognition of continuous speech
-
N. Morgan and H. Bourlard, "Neural networks for statistical recognition of continuous speech," in Proc. IEEE, 1995, vol. 83, no. 5, pp. 742-772
-
(1995)
Proc. IEEE
, vol.83
, Issue.5
, pp. 742-772
-
-
Morgan, N.1
Bourlard, H.2
-
9
-
-
0036567797
-
Connectionist speech recognition of Broadcast News
-
DOI 10.1016/S0167-6393(01)00058-9, PII S0167639301000589
-
A. J. Robinson, G. D. Cook, D. P. W. Ellis, E. Fosler-Lussier, S. J. Renals, and D. A. G. Williams, "Connectionist speech recognition of Broadcast News," Speech Commun., vol. 37, no. 1-2, pp. 27-45, 2002 (Pubitemid 34222536)
-
(2002)
Speech Communication
, vol.37
, Issue.1-2
, pp. 27-45
-
-
Robinson, A.J.1
Cook, G.D.2
Ellis, D.P.W.3
Fosler-Lussier, E.4
Renals, S.J.5
Williams, D.A.G.6
-
10
-
-
84858972572
-
Making deep belief networks effective for large vocabulary continuous speech recognition
-
T. N. Sainath, B. Kingsbury, B. Ramabhadran, P. Fousek, P. Novak, and A. Mohamed, "Making deep belief networks effective for large vocabulary continuous speech recognition," in Proc. IEEE ASRU, 2011
-
(2011)
Proc. IEEE ASRU
-
-
Sainath, T.N.1
Kingsbury, B.2
Ramabhadran, B.3
Fousek, P.4
Novak, P.5
Mohamed, A.6
-
11
-
-
84055222005
-
Context-dependent pretrained deep neural networks for large-vocabulary speech recognition
-
G. E. Dahl, D. Yu, L. Deng, and A. Acero, "Context-dependent pretrained deep neural networks for large-vocabulary speech recognition," IEEE Trans. Audio, Speech Lang. Process., vol. 20, no. 1, pp. 30-42, 2012
-
(2012)
IEEE Trans. Audio, Speech Lang. Process.
, vol.20
, Issue.1
, pp. 30-42
-
-
Dahl, G.E.1
Yu, D.2
Deng, L.3
Acero, A.4
-
12
-
-
0033709098
-
Tandem connectionist feature extraction for conventional HMM systems
-
H. Hermansky, D. P. W. Ellis, and S. Sharma, "Tandem connectionist feature extraction for conventional HMM systems," in Proc. IEEE ICASSP, 2000, pp. 1635-1638
-
(2000)
Proc. IEEE ICASSP
, pp. 1635-1638
-
-
Hermansky, H.1
Ellis, D.P.W.2
Sharma, S.3
-
13
-
-
33745528628
-
Using MLP features in SRI's conversational speech recognition system
-
Q. Zhu, A. Stolcke, B. Y. Chen, and N. Morgan, "Using MLP features in SRI's conversational speech recognition system," in Proc. Eurospeech, 2005
-
(2005)
Proc. Eurospeech
-
-
Zhu, Q.1
Stolcke, A.2
Chen, B.Y.3
Morgan, N.4
-
14
-
-
34547548235
-
Probabilistic and bottle-neck features for LVCSR of meetings
-
DOI 10.1109/ICASSP.2007.367023, 4218211, 2007 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP '07
-
F. Grézl, M. Karafiát, S. Kontár, and J. Černocký, "Probabilistic and bottle-neck features for LVCSR of meetings," Proc. IEEE ICASSP, vol. 4, pp. IV-757-IV-760, 2007 (Pubitemid 47178482)
-
(2007)
ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
, vol.4
-
-
Grezl, F.1
Karafiat, M.2
Kontar, S.3
Cernocky, J.4
-
16
-
-
84893704659
-
Hybrid acoustic models for distant and multichannel large vocabulary speech recognition
-
Dec.
-
P. Swietojanski, A. Ghoshal, and S. Renals, "Hybrid acoustic models for distant and multichannel large vocabulary speech recognition," in Proc. IEEE ASRU, Dec. 2013
-
(2013)
Proc. IEEE ASRU
-
-
Swietojanski, P.1
Ghoshal, A.2
Renals, S.3
-
17
-
-
84874282188
-
Improving wideband speech recognition using mixed-bandwidth training data in CD-DNN-HMM
-
J. Li, D. Yu, J.-T. Huang, and Y. Gong, "Improving wideband speech recognition using mixed-bandwidth training data in CD-DNN-HMM," in Proc. IEEE SLT, 2012, pp. 131-136
-
(2012)
Proc. IEEE SLT
, pp. 131-136
-
-
Li, J.1
Yu, D.2
Huang, J.-T.3
Gong, Y.4
-
18
-
-
0002263996
-
Convolutional networks for images, speech and time series
-
Cambridge, MA, USA: MIT Press
-
Y. LeCun and Y. Bengio, "Convolutional networks for images, speech and time series," in The Handbook of Brain Theory and Neural Networks. Cambridge, MA, USA: MIT Press, 1995, pp. 255-258
-
(1995)
The Handbook of Brain Theory and Neural Networks
, pp. 255-258
-
-
Lecun, Y.1
Bengio, Y.2
-
19
-
-
0032203257
-
Gradient-based learning applied to document recognition
-
Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, "Gradient-based learning applied to document recognition," Proc. IEEE, vol. 86, no. 11, pp. 2278-2324, 1998
-
(1998)
Proc. IEEE
, vol.86
, Issue.11
, pp. 2278-2324
-
-
Lecun, Y.1
Bottou, L.2
Bengio, Y.3
Haffner, P.4
-
20
-
-
0024634603
-
Phoneme recognition using time-delay neural networks
-
DOI 10.1109/29.21701
-
A. Waibel, T. Hanazawa, G. Hinton, K. Shikano, and K. J. Lang, "Phoneme recognition using time-delay neural networks," IEEE Trans. Acoust., Speech, Signal Process., vol. 37, no. 3, pp. 328-339, 1989 (Pubitemid 19065785)
-
(1989)
IEEE Transactions on Acoustics, Speech, and Signal Processing
, vol.37
, Issue.3
, pp. 328-339
-
-
Waibel, A.1
Hanazawa, T.2
Hinton, G.3
Shikano, K.4
Lang, K.J.5
-
21
-
-
0025254722
-
A time-delay neural network architecture for isolated word recognition
-
K. J. Lang, A. H. Waibel, and G. E. Hinton, "A time-delay neural network architecture for isolated word recognition," Neural Netw., vol. 3, no. 1, pp. 23-43, 1990
-
(1990)
Neural Netw.
, vol.3
, Issue.1
, pp. 23-43
-
-
Lang, K.J.1
Waibel, A.H.2
Hinton, G.E.3
-
22
-
-
0027151530
-
Improving the MS-TDNN for word spotting
-
T. Zeppenfeld, R. Houghton, and A. Waibel, "Improving the MS-TDNN for word spotting," Proc. IEEE ICASSP, vol. 2, pp. 475-478, 1993
-
(1993)
Proc. IEEE ICASSP
, vol.2
, pp. 475-478
-
-
Zeppenfeld, T.1
Houghton, R.2
Waibel, A.3
-
23
-
-
79551521906
-
Convolutional networks for speech detection
-
S. Sukittanon, A. C. Surendran, J. C. Platt, and C. J. C. Burges, "Convolutional networks for speech detection," in Proc. ICSLP, 2004
-
(2004)
Proc. ICSLP
-
-
Sukittanon, S.1
Surendran, A.C.2
Platt, J.C.3
Burges, C.J.C.4
-
24
-
-
84906273908
-
Estimating phoneme class conditional probabilities from raw speech signal using convolutional neural networks
-
D. Palaz, R. Collobert, and M. Magimai-Doss, "Estimating phoneme class conditional probabilities from raw speech signal using convolutional neural networks," in Proc. Interspeech, 2013
-
(2013)
Proc. Interspeech
-
-
Palaz, D.1
Collobert, R.2
Magimai-Doss, M.3
-
25
-
-
84867605836
-
Applying convolutional neural networks concepts to hybrid NN-HMM model for speech recognition
-
O. Abdel-Hamid, A.-R. Mohamed, J. Hui, and G. Penn, "Applying convolutional neural networks concepts to hybrid NN-HMM model for speech recognition," in Proc. IEEE ICASSP, 2012, pp. 4277-4280
-
(2012)
Proc. IEEE ICASSP
, pp. 4277-4280
-
-
Abdel-Hamid, O.1
Mohamed, A.-R.2
Hui, J.3
Penn, G.4
-
26
-
-
84890525984
-
Deep convolutional neural networks for LVCSR
-
T. N. Sainath, A. Mohamed, B. Kingsbury, and B. Ramabhadran, "Deep convolutional neural networks for LVCSR," in Proc. IEEE ICASSP, 2013
-
(2013)
Proc. IEEE ICASSP
-
-
Sainath, T.N.1
Mohamed, A.2
Kingsbury, B.3
Ramabhadran, B.4
-
27
-
-
84893654379
-
Improvements to deep convolutional neural networks for LVCSR
-
T. N. Sainath, B. Kingsbury, A. Mohamed, G. E. Dahl, G. Saon, H. Soltau, T. Beran, A. Y. Aravkin, and B. Ramabhadran, "Improvements to deep convolutional neural networks for LVCSR," in Proc. IEEE ASRU, 2013
-
(2013)
Proc. IEEE ASRU
-
-
Sainath, T.N.1
Kingsbury, B.2
Mohamed, A.3
Dahl, G.E.4
Saon, G.5
Soltau, H.6
Beran, T.7
Aravkin, A.Y.8
Ramabhadran, B.9
-
28
-
-
35948981862
-
Unleashing the killer corpus: Experiences in creating the multi-everything AMI Meeting Corpus
-
J. Carletta, "Unleashing the killer corpus: Experiences in creating the multi-everything AMI Meeting Corpus," Lang. Res. Eval. J., vol. 41, no. 2, pp. 181-190, 2007
-
(2007)
Lang. Res. Eval. J.
, vol.41
, Issue.2
, pp. 181-190
-
-
Carletta, J.1
-
29
-
-
0001595997
-
Neural network classifiers estimate Bayesian a posteriori probabilities
-
M. D. Richard and R. P. Lippmann, "Neural network classifiers estimate Bayesian a posteriori probabilities," Neural Comput., vol. 3, no. 4, pp. 461-483, 1991
-
(1991)
Neural Comput.
, vol.3
, Issue.4
, pp. 461-483
-
-
Richard, M.D.1
Lippmann, R.P.2
-
30
-
-
51249118803
-
Unsupervised learning of invariant feature hierarchies with applications to object recognition
-
M. A. Ranzato, F. J. Huang, Y.-L. Boureau, and Y. LeCun, "Unsupervised learning of invariant feature hierarchies with applications to object recognition," in IEEE CVPR, 2007
-
(2007)
IEEE CVPR
-
-
Ranzato, M.A.1
Huang, F.J.2
Boureau, Y.-L.3
Lecun, Y.4
-
31
-
-
84902000293
-
-
Mar., [Online; accessed 27-March-2014]
-
"NumPy Reference," Mar. 2014 [Online]. Available: http://docs.scipy.org/doc/numpy/numpy-ref-1.8.1.pdf, [Online; accessed 27-March-2014]
-
(2014)
NumPy Reference
-
-
-
32
-
-
84906214784
-
Exploring convolutional neural network structures and optimisation techniques for speech recognition
-
ISCA
-
O. Abdel-Hamid, L. Deng, and D. Yu, "Exploring convolutional neural network structures and optimisation techniques for speech recognition," in Proc. Interspeech, 2013, ISCA
-
(2013)
Proc. Interspeech
-
-
Abdel-Hamid, O.1
Deng, L.2
Yu, D.3
-
33
-
-
0033316361
-
Hierarchical models of object recognition in cortex
-
DOI 10.1038/14819
-
M. Riesenhuber and T. Poggio, "Hierarchical models of object recognition in cortex," Nature Neurosci., vol. 2, no. 11, pp. 1019-1025, 1999 (Pubitemid 30599567)
-
(1999)
Nature Neuroscience
, vol.2
, Issue.11
, pp. 1019-1025
-
-
Riesenhuber, M.1
Poggio, T.2
-
34
-
-
84903707061
-
Multiple dimension Levenshtein edit distance calculations for evaluating ASR systems during simultaneous speech
-
J. G. Fiscus, J. Ajot, N. Radde, and C. Laprun, "Multiple dimension Levenshtein edit distance calculations for evaluating ASR systems during simultaneous speech," in Proc. LREC, 2006
-
(2006)
Proc. LREC
-
-
Fiscus, J.G.1
Ajot, J.2
Radde, N.3
Laprun, C.4
-
35
-
-
84874276847
-
The Kaldi speech recognition toolkit
-
Dec.
-
D. Povey, A. Ghoshal, G. Boulianne, L. Burget, O. Glembek, N. Goel, M. Hannemann, P. Motlíček, Y. Qian, P. Schwarz, J. Silovský, G. Stemmer, and K. Veselý, "The Kaldi speech recognition toolkit," in Proc. IEEE ASRU, Dec. 2011
-
(2011)
Proc. IEEE ASRU
-
-
Povey, D.1
Ghoshal, A.2
Boulianne, G.3
Burget, L.4
Glembek, O.5
Goel, N.6
Hannemann, M.7
Motlíček, P.8
Qian, Y.9
Schwarz, P.10
Silovský, J.11
Stemmer, G.12
Veselý, K.13
-
36
-
-
84893401626
-
-
arXiv preprint arXiv:1308.4214
-
I. J. Goodfellow, D. Warde-Farley, P. Lamblin, V. Dumoulin, M. Mirza, R. Pascanu, J. Bergstra, F. Bastien, and Y. Bengio, "Pylearn2: A machine learning research library," arXiv preprint arXiv:1308.4214, 2013
-
(2013)
Pylearn2: A Machine Learning Research Library
-
-
Goodfellow, I.J.1
Warde-Farley, D.2
Lamblin, P.3
Dumoulin, V.4
Mirza, M.5
Pascanu, R.6
Bergstra, J.7
Bastien, F.8
Bengio, Y.9
-
37
-
-
85009224911
-
From Switchboard to Fisher: Telephone collection protocols, their uses and yields
-
C. Cieri, D. Miller, and K. Walker, "From Switchboard to Fisher: Telephone collection protocols, their uses and yields," in Proc. Eurospeech, 2003
-
(2003)
Proc. Eurospeech
-
-
Cieri, C.1
Miller, D.2
Walker, K.3
-
38
-
-
0033329799
-
An empirical study of smoothing techniques for language modeling
-
S. F. Chen and J. Goodman, "An empirical study of smoothing techniques for language modeling," Comput. Speech Lang., vol. 13, no. 4, pp. 359-393, 1999
-
(1999)
Comput. Speech Lang.
, vol.13
, Issue.4
, pp. 359-393
-
-
Chen, S.F.1
Goodman, J.2
-
39
-
-
51449120120
-
Boosted MMI for model and feature-space discriminative training
-
D. Povey, D. Kanevsky, B. Kingsbury, B. Ramabhadran, G. Saon, and K. Visweswariah, "Boosted MMI for model and feature-space discriminative training," in Proc. IEEE ICASSP, 2008, pp. 4057-4060
-
(2008)
Proc. IEEE ICASSP
, pp. 4057-4060
-
-
Povey, D.1
Kanevsky, D.2
Kingsbury, B.3
Ramabhadran, B.4
Saon, G.5
Visweswariah, K.6
-
40
-
-
0032638856
-
Semi-tied covariance matrices for hidden Markov models
-
M. J. F. Gales, "Semi-tied covariance matrices for hidden Markov models," IEEE Trans. Speech Audio Process., vol. 7, no. 3, pp. 272-281, 1999
-
(1999)
IEEE Trans. Speech Audio Process.
, vol.7
, Issue.3
, pp. 272-281
-
-
Gales, M.1
-
41
-
-
33745805403
-
A fast learning algorithm for deep belief nets
-
DOI 10.1162/neco.2006.18.7.1527
-
G. Hinton, S. Osindero, and Y. Teh, "A fast learning algorithm for deep belief nets," Neural Comput., vol. 18, no. 7, pp. 1527-1554, 2006 (Pubitemid 44024729)
-
(2006)
Neural Computation
, vol.18
, Issue.7
, pp. 1527-1554
-
-
Hinton, G.E.1
Osindero, S.2
Teh, Y.-W.3
-
42
-
-
50449086237
-
Acoustic beamforming for speaker diarization of meetings
-
X. Anguera, C. Wooters, and J. Hernando, "Acoustic beamforming for speaker diarization of meetings," IEEE Trans. Audio, Speech, Lang. Process., vol. 15, no. 7, pp. 2011-2021, 2007
-
(2007)
IEEE Trans. Audio, Speech, Lang. Process.
, vol.15
, Issue.7
, pp. 2011-2021
-
-
Anguera, X.1
Wooters, C.2
Hernando, J.3
-
43
-
-
84863380535
-
Unsupervised feature learning for audio classification using convolutional deep belief networks
-
H. Lee, P. Pham, Y. Largman, and A. Ng, "Unsupervised feature learning for audio classification using convolutional deep belief networks," Adv. Neural Inf. Process. Syst., vol. 22, pp. 1096-1104, 2009
-
(2009)
Adv. Neural Inf. Process. Syst.
, vol.22
, pp. 1096-1104
-
-
Lee, H.1
Pham, P.2
Largman, Y.3
Ng, A.4