-
1
-
-
85032751458
-
Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups
-
Nov
-
G. Hinton, L. Deng, D. Yu, G. Dahl, A. Mohamed, N. Jaitly, A. Senior, V. Vanhoucke, P. Nguyen, T. Sainath, and B. Kingsbury, "Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups," Signal Processing Magazine, IEEE, vol. 29, no. 6, pp. 82-97, Nov 2012.
-
(2012)
Signal Processing Magazine, IEEE
, vol.29
, Issue.6
, pp. 82-97
-
-
Hinton, G.1
Deng, L.2
Yu, D.3
Dahl, G.4
Mohamed, A.5
Jaitly, N.6
Senior, A.7
Vanhoucke, V.8
Nguyen, P.9
Sainath, T.10
Kingsbury, B.11
-
2
-
-
84906928729
-
Report on the 10th iwslt evaluation campaign
-
Heidelberg; Germany
-
M. Cettolo, J. Niehues, S. Stüker, L. Bentivogli, and M. Federico, "Report on the 10th iwslt evaluation campaign," in Proc. IWSLT, Heidelberg; Germany, 2013, http://www.eubridge.eu/87282.php.
-
(2013)
Proc. IWSLT
-
-
Cettolo, M.1
Niehues, J.2
Stüker, S.3
Bentivogli, L.4
Federico, M.5
-
3
-
-
56149084455
-
Recent progress in the MIT spoken lecture processing project
-
J. Glass, T. J. Hazen, S. Cyphers, I. Malioutov, D. Huynh, and R. Barzilay, "Recent Progress in the MIT Spoken Lecture Processing Project," in Proc. Interspeech, 2007. [Online]. Available: http://groups.csail.mit.edu/sls/publications/2007/Interspeech07-glass-lecture.pdf
-
(2007)
Proc. Interspeech
-
-
Glass, J.1
Hazen, T.J.2
Cyphers, S.3
Malioutov, I.4
Huynh, D.5
Barzilay, R.6
-
4
-
-
0030266571
-
Closed-captioned television presentation speed and vocabulary
-
C. Jensema, R. McCann, and S. Ramsey, "Closed-captioned television presentation speed and vocabulary," American Annals of the Deaf, vol. 141, no. 4, pp. 284-292, 1996.
-
(1996)
American Annals of the Deaf
, vol.141
, Issue.4
, pp. 284-292
-
-
Jensema, C.1
McCann, R.2
Ramsey, S.3
-
5
-
-
51449091001
-
Dynamic language model adaptation using presentation slides for lecture speech recognition
-
H. Yamazaki, K. Iwano, K. Shinoda, S. Furui, and H. Yokota, "Dynamic language model adaptation using presentation slides for lecture speech recognition," in Proc. INTERSPEECH, 2007, pp. 2349-2352.
-
(2007)
Proc. INTERSPEECH
, pp. 2349-2352
-
-
Yamazaki, H.1
Iwano, K.2
Shinoda, K.3
Furui, S.4
Yokota, H.5
-
6
-
-
56149116530
-
Web-based language modelling for automatic lecture transcription
-
C. Munteanu, G. Penn, and R. Baecker, "Web-based language modelling for automatic lecture transcription," in Proc. INTERSPEECH, 2007.
-
(2007)
Proc. INTERSPEECH
-
-
Munteanu, C.1
Penn, G.2
Baecker, R.3
-
7
-
-
56149107305
-
Automatic transcription for a web 2.0 service to search podcasts
-
J. Ogata, M. Goto, and K. Eto, "Automatic transcription for a web 2.0 service to search podcasts," in INTERSPEECH 2007, 8th Annual Conference of the International Speech Communication Association, Antwerp, Belgium, August 27-31, 2007, pp. 2617-2620.
-
(2007)
INTERSPEECH 2007, 8th Annual Conference of the International Speech Communication Association, Antwerp, Belgium, August 27-31, 2007
, pp. 2617-2620
-
-
Ogata, J.1
Goto, M.2
Eto, K.3
-
8
-
-
79951777091
-
Toward better crowdsourced transcription: Transcription of a year of the let's go bus information system data
-
G. Parent and M. Eskenazi, "Toward better crowdsourced transcription: Transcription of a year of the let's go bus information system data," in Spoken Language Technology Workshop (SLT), 2010 IEEE. IEEE, 2010, pp. 312-317.
-
(2010)
Spoken Language Technology Workshop (SLT), 2010 IEEE. IEEE
, pp. 312-317
-
-
Parent, G.1
Eskenazi, M.2
-
9
-
-
78049407752
-
Using the amazon mechanical turk for transcription of spoken language
-
IEEE
-
M. Marge, S. Banerjee, and A. I. Rudnicky, "Using the amazon mechanical turk for transcription of spoken language," in Acoustics Speech and Signal Processing (ICASSP), 2010 IEEE International Conference on. IEEE, 2010, pp. 5270-5273.
-
(2010)
Acoustics Speech and Signal Processing (ICASSP), 2010 IEEE International Conference on
, pp. 5270-5273
-
-
Marge, M.1
Banerjee, S.2
Rudnicky, A.I.3
-
10
-
-
79958275518
-
Cheap, fast and good enough: Automatic speech recognition with non-expert transcription
-
Los Angeles, California: Association for Computational Linguistics, June 2010
-
S. Novotney and C. Callison-Burch, "Cheap, fast and good enough: Automatic speech recognition with non-expert transcription," in Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics. Los Angeles, California: Association for Computational Linguistics, June 2010, pp. 207-215. [Online]. Available: http://www.aclweb.org/anthology/N10-1024
-
Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics
, pp. 207-215
-
-
Novotney, S.1
Callison-Burch, C.2
-
11
-
-
84943243018
-
Evaluation of interactive user corrections for lecture transcription
-
Hong Kong, December 6-7, 2012
-
H. Kolkhorst, K. Kilgour, S. Stüker, and A. Waibel, "Evaluation of interactive user corrections for lecture transcription," in 2012 International Workshop on Spoken Language Translation, IWSLT 2012, Hong Kong, December 6-7, 2012, pp. 217-221.
-
(2012)
2012 International Workshop on Spoken Language Translation, IWSLT 2012
, pp. 217-221
-
-
Kolkhorst, H.1
Kilgour, K.2
Stüker, S.3
Waibel, A.4
-
12
-
-
84869046812
-
Real-time captioning by groups of non-experts
-
New York, USA: ACM Press, Oct
-
W. Lasecki, C. Miller, A. Sadilek, A. Abumoussa, D. Borrello, R. S. Kushalnagar, and J. Bigham, "Real-time captioning by groups of non-experts," in Proceedings of the 25th annual ACM symposium on User interface software and technology - UIST '12. New York, New York, USA: ACM Press, Oct. 2012, pp. 23-34. [Online]. Available: http://dl.acm.org/citation.cfm?doid=2380116.2380122
-
(2012)
Proceedings of the 25th Annual ACM Symposium on User Interface Software and Technology - UIST '12
, pp. 23-34
-
-
Lasecki, W.1
Miller, C.2
Sadilek, A.3
Abumoussa, A.4
Borrello, D.5
Kushalnagar, R.S.6
Bigham, J.7
-
13
-
-
84959147559
-
Using keyword spotting to help humans correct captioning faster
-
Y. Gaur, F. Metze, Y. Miao, and J. P. Bigham, "Using keyword spotting to help humans correct captioning faster," in Sixteenth Annual Conference of the International Speech Communication Association, 2015.
-
(2015)
Sixteenth Annual Conference of the International Speech Communication Association
-
-
Gaur, Y.1
Metze, F.2
Miao, Y.3
Bigham, J.P.4
-
14
-
-
84938721908
-
A keyword search system using open source software
-
South Lake Tahoe, NV; USA: IEEE, Dec to appear
-
J. Trmal, G. Chen, D. Povey, S. Khudanpur, P. Ghahremani, X. Zhang, V. Manohar, C. Liu, A. Jansen, D. Klakow, D. Yarowsky, and F. Metze, "A keyword search system using open source software," in Proc. IEEE Workshop on Spoken Language Technology. South Lake Tahoe, NV; USA: IEEE, Dec. 2014, to appear.
-
(2014)
Proc. IEEE Workshop on Spoken Language Technology
-
-
Trmal, J.1
Chen, G.2
Povey, D.3
Khudanpur, S.4
Ghahremani, P.5
Zhang, X.6
Manohar, V.7
Liu, C.8
Jansen, A.9
Klakow, D.10
Yarowsky, D.11
Metze, F.12
-
15
-
-
84946076428
-
Ted-lium: An automatic speech recognition dedicated corpus
-
A. Rousseau, P. Deléglise, and Y. Estève, "Ted-lium: an automatic speech recognition dedicated corpus," in LREC, 2012, pp. 125-129.
-
(2012)
LREC
, pp. 125-129
-
-
Rousseau, A.1
Deléglise, P.2
Estève, Y.3
-
16
-
-
84893696682
-
The kaldi speech recognition toolkit
-
Dec.
-
D. Povey, A. Ghoshal, G. Boulianne, L. Burget, O. Glembek, N. Goel, M. Hannemann, P. Motlicek, Y. Qian, P. Schwarz, J. Silovsky, G. Stemmer, and K. Vesely, "The kaldi speech recognition toolkit," in IEEE 2011 Workshop on Automatic Speech Recognition and Understanding. IEEE Signal Processing Society, Dec. 2011.
-
(2011)
IEEE 2011 Workshop on Automatic Speech Recognition and Understanding. IEEE Signal Processing Society
-
-
Povey, D.1
Ghoshal, A.2
Boulianne, G.3
Burget, L.4
Glembek, O.5
Goel, N.6
Hannemann, M.7
Motlicek, P.8
Qian, Y.9
Schwarz, P.10
Silovsky, J.11
Stemmer, G.12
Vesely, K.13
-
18
-
-
84964489732
-
EESEN: End-to-end speech recognition using deep rnn models and wfst-based decoding
-
Scottsdale, AZ; U.S.A.: IEEE, Dec
-
Y. Miao, M. Gowayyed, and F. Metze, "EESEN: End-to-End Speech Recognition using Deep RNN Models and WFST-based Decoding," in Proc. Automatic Speech Recognition and Understanding Workshop (ASRU). Scottsdale, AZ; U.S.A.: IEEE, Dec. 2015, https://github.com/srvk/eesen.
-
(2015)
Proc. Automatic Speech Recognition and Understanding Workshop (ASRU)
-
-
Miao, Y.1
Gowayyed, M.2
Metze, F.3
-
19
-
-
84946091011
-
Scaling recurrent neural network language models
-
Brisbane; Australia: IEEE, May
-
W. Williams, N. Prasad, D. Mrva, T. Ash, and T. Robinson, "Scaling recurrent neural network language models," in Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference on. Brisbane; Australia: IEEE, May 2015.
-
(2015)
Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference on
-
-
Williams, W.1
Prasad, N.2
Mrva, D.3
Ash, T.4
Robinson, T.5
-
20
-
-
33749259827
-
Connectionist temporal classification: Labelling unsegmented sequence data with recurrent neural networks
-
ACM
-
A. Graves, S. Fernández, F. Gomez, and J. Schmidhuber, "Connectionist temporal classification: labelling unsegmented sequence data with recurrent neural networks," in Proceedings of the 23rd international conference on Machine Learning. ACM, 2006, pp. 369-376.
-
(2006)
Proceedings of the 23rd International Conference on Machine Learning
, pp. 369-376
-
-
Graves, A.1
Fernández, S.2
Gomez, F.3
Schmidhuber, J.4