-
2
-
-
47749152568
-
The rich transcription 2007 meeting recognition evaluation
-
R Stiefelhagen, R Bowers, and J Fiscus, Eds. number 4625 in Lecture Notes in Computer Science Volume
-
J Fiscus, J Ajot, and J Garofolo, "The rich transcription 2007 meeting recognition evaluation, " in Multimodal Technologies for Perception of Humans, R Stiefelhagen, R Bowers, and J Fiscus, Eds., number 4625 in Lecture Notes in Computer Science Volume, pp. 373-389. 2008.
-
(2008)
Multimodal Technologies for Perception of Humans
, pp. 373-389
-
-
Fiscus, J.1
Ajot, J.2
Garofolo, J.3
-
3
-
-
85008520364
-
Transcribing meetings with the AMIDA systems
-
T Hain, L Burget, J Dines, PN Garner, F Grezl, AE Hannani, M Huijbregts, M Karafiat, M Lincoln, and V Wan, "Transcribing meetings with the AMIDA systems, " IEEE Trans. Audio, Speech, Language Process., vol. 20, no. 2, pp. 486-498, 2012.
-
(2012)
IEEE Trans. Audio, Speech, Language Process
, vol.20
, Issue.2
, pp. 486-498
-
-
Hain, T.1
Burget, L.2
Dines, J.3
Garner, P.N.4
Grezl, F.5
Hannani, A.E.6
Huijbregts, M.7
Karafiat, M.8
Lincoln, M.9
Wan, V.10
-
4
-
-
84893665400
-
The SRI-ICSI spring 2007 meeting and lecture recognition system
-
R Stiefelhagen, R Bowers, and J Fiscus, Eds. number 4625 in Lecture Notes in Computer Science Volume
-
A Stolcke, X Anguera, K Boakye, O Cetin, A Janin, M Magimai-Doss, C Wooters, and J Zheng, "The SRI-ICSI Spring 2007 meeting and lecture recognition system, " in Multimodal Technologies for Perception of Humans, R Stiefelhagen, R Bowers, and J Fiscus, Eds., number 4625 in Lecture Notes in Computer Science Volume, pp. 373-389. 2008.
-
(2008)
Multimodal Technologies for Perception of Humans
, pp. 373-389
-
-
Stolcke, A.1
Anguera, X.2
Boakye, K.3
Cetin, O.4
Janin, A.5
Magimai-Doss, M.6
Wooters, C.7
Zheng, J.8
-
5
-
-
0016990291
-
The generalized correlation method for estimation of time delay
-
CH Knapp and GC Carter, "The generalized correlation method for estimation of time delay, " IEEE Trans. Acoust., Speech, Signal Process., vol. 24, no. 4, pp. 320-327, 1976.
-
(1976)
IEEE Trans. Acoust., Speech, Signal Process
, vol.24
, Issue.4
, pp. 320-327
-
-
Knapp, C.H.1
Carter, G.C.2
-
6
-
-
50449086237
-
Acoustic beamforming for speaker diarization of meetings
-
X Anguera, C Wooters, and J Hernando, "Acoustic beamforming for speaker diarization of meetings, " IEEE Trans. Audio, Speech, Language Process., vol. 15, no. 7, pp. 2011-2021, 2007.
-
(2007)
IEEE Trans. Audio, Speech, Language Process
, vol.15
, Issue.7
, pp. 2011-2021
-
-
Anguera, X.1
Wooters, C.2
Hernando, J.3
-
7
-
-
0141603901
-
Superdirective microphone arrays
-
M Brandstein and D Ward, Eds., Springer
-
J Bitzer and KU Simmer, "Superdirective microphone arrays, " in Microphone Arrays, M Brandstein and D Ward, Eds., pp. 19-38. Springer, 2001.
-
(2001)
Microphone Arrays
, pp. 19-38
-
-
Bitzer, J.1
Simmer, K.U.2
-
8
-
-
67651154520
-
Beamforming with a maximum negentropy criterion
-
K Kumatani, J McDonough, B Rauch, D Klakow, PN Garner, and W Li, "Beamforming with a maximum negentropy criterion, " IEEE Trans. Audio, Speech, Language Process., vol. 17, no. 5, pp. 994-1008, 2009.
-
(2009)
IEEE Trans. Audio, Speech, Language Process
, vol.17
, Issue.5
, pp. 994-1008
-
-
Kumatani, K.1
McDonough, J.2
Rauch, B.3
Klakow, D.4
Garner, P.N.5
Li, W.6
-
9
-
-
50449096811
-
Subband likelihood-maximizing beamforming for speech recognition in reverberant environments
-
M Seltzer and R Stern, "Subband likelihood-maximizing beamforming for speech recognition in reverberant environments, " IEEE Trans. Audio, Speech, Language Process., vol. 14, pp. 2109-2121, 2006.
-
(2006)
IEEE Trans. Audio, Speech, Language Process
, vol.14
, pp. 2109-2121
-
-
Seltzer, M.1
Stern, R.2
-
10
-
-
85008590333
-
Low-latency real-time meeting recognition and understanding using distant microphones and omni-directional camera
-
T Hori, S Araki, T Yoshioka, M Fujimoto, S Watanabe, T Oba, A Ogawa, K Otsuka, D Mikami, K Kinoshita, T Nakatani, A Nakamura, and J Yamoto, "Low-latency real-time meeting recognition and understanding using distant microphones and omni-directional camera, " IEEE Trans. Audio, Speech, Language Process., vol. 20, no. 2, pp. 499-513, 2012.
-
(2012)
IEEE Trans. Audio, Speech, Language Process
, vol.20
, Issue.2
, pp. 499-513
-
-
Hori, T.1
Araki, S.2
Yoshioka, T.3
Fujimoto, M.4
Watanabe, S.5
Oba, T.6
Ogawa, A.7
Otsuka, K.8
Mikami, D.9
Kinoshita, K.10
Nakatani, T.11
Nakamura, A.12
Yamoto, J.13
-
11
-
-
84867195294
-
Multi-source far-distance microphone selection and combination for automatic transcription of lectures
-
M Wölfel, C Fügen, S Ikbal, and J McDonough, "Multi-source far-distance microphone selection and combination for automatic transcription of lectures, " in Proc ICSLP, 2006.
-
(2006)
Proc ICSLP
-
-
Wölfel, M.1
Fügen, C.2
Ikbal, S.3
McDonough, J.4
-
12
-
-
80051654520
-
Making the most from multiple microphones in meeting recognition
-
A Stolcke, "Making the most from multiple microphones in meeting recognition, " in Proc IEEE ICASSP, 2011.
-
(2011)
Proc IEEE ICASSP
-
-
Stolcke, A.1
-
13
-
-
84865729496
-
An analysis of automatic speech recognition with multiple microphones
-
D Marino and T Hain, "An analysis of automatic speech recognition with multiple microphones, " in INTERSPEECH, 2011, pp. 1281-1284.
-
(2011)
Interspeech
, pp. 1281-1284
-
-
Marino, D.1
Hain, T.2
-
14
-
-
84924139705
-
-
Cambridge University Press
-
S Renals, H Bourlard, J Carletta, and A Popescu-Belis, Multimodal Signal Processing, Cambridge University Press, 2012.
-
(2012)
Multimodal Signal Processing
-
-
Renals, S.1
Bourlard, H.2
Carletta, J.3
Popescu-Belis, A.4
-
15
-
-
84879854889
-
Representation learning: A review and new perspectives
-
Y Bengio, A Courville, and P Vincent, "Representation learning: A review and new perspectives, " IEEE Trans. Pattern Anal. Mach. Intell., vol. 35, no. 8, pp. 1798-1828, 2013.
-
(2013)
IEEE Trans. Pattern Anal. Mach. Intell.
, vol.35
, Issue.8
, pp. 1798-1828
-
-
Bengio, Y.1
Courville, A.2
Vincent, P.3
-
16
-
-
85032751458
-
Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups
-
G Hinton, L Deng, D Yu, GE Dahl, A-R Mohamed, N Jaitly, A Senior, V Vanhoucke, P Nguyen, TN Sainath, and B Kingsbury, "Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups, " IEEE Signal Process. Mag., vol. 29, no. 6, pp. 82-97, 2012.
-
(2012)
IEEE Signal Process. Mag.
, vol.29
, Issue.6
, pp. 82-97
-
-
Hinton, G.1
Deng, L.2
Yu, D.3
Dahl, G.E.4
Mohamed, A.-R.5
Jaitly, N.6
Senior, A.7
Vanhoucke, V.8
Nguyen, P.9
Sainath, T.N.10
Kingsbury, B.11
-
17
-
-
84858976070
-
Feature engineering in context-dependent deep neural networks for conversational speech transcription
-
F Seide, G Li, X Chen, and D Yu, "Feature engineering in context-dependent deep neural networks for conversational speech transcription, " in Proc. IEEE ASRU, 2011.
-
(2011)
Proc. IEEE ASRU
-
-
Seide, F.1
Li, G.2
Chen, X.3
Yu, D.4
-
18
-
-
85032750883
-
Microphone array processing for distant speech recognition: From close-talking microphones to far-field sensors
-
IEEE
-
K Kumatani, J McDonough, and B Raj, "Microphone array processing for distant speech recognition: From close-talking microphones to far-field sensors, " Signal Processing Magazine, IEEE, vol. 29, no. 6, pp. 127-140, 2012.
-
(2012)
Signal Processing Magazine
, vol.29
, Issue.6
, pp. 127-140
-
-
Kumatani, K.1
McDonough, J.2
Raj, B.3
-
20
-
-
0028194709
-
Connectionist probability estimators in HMM speech recognition
-
S Renals, N Morgan, H Bourlard, M Cohen, and H Franco, "Connectionist probability estimators in HMM speech recognition, " IEEE Trans. Speech Audio Process., vol. 2, no. 1, pp. 161-174, 1994.
-
(1994)
IEEE Trans. Speech Audio Process
, vol.2
, Issue.1
, pp. 161-174
-
-
Renals, S.1
Morgan, N.2
Bourlard, H.3
Cohen, M.4
Franco, H.5
-
21
-
-
33745805403
-
A fast learning algorithm for deep belief nets
-
DOI 10.1162/neco.2006.18.7.1527
-
G Hinton, S Osindero, and Y Teh, "A fast learning algorithm for deep belief nets, " Neural Computation, vol. 18, pp. 1527-1554, 2006. (Pubitemid 44024729)
-
(2006)
Neural Computation
, vol.18
, Issue.7
, pp. 1527-1554
-
-
Hinton, G.E.1
Osindero, S.2
Teh, Y.-W.3
-
22
-
-
34547548235
-
Probabilistic and bottle-neck features for LVCSR of meetings
-
DOI 10.1109/ICASSP.2007.367023, 4218211, 2007 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP '07
-
F Grézl, M Karafiát, S Kontár, and J Černocký, "Probabilistic and bottle-neck features for LVCSR of meetings, " in Proc. ICASSP, 2007, vol. 4, pp. IV-757-IV-760. (Pubitemid 47178482)
-
(2007)
ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
, vol.4
-
-
Grezl, F.1
Karafiat, M.2
Kontar, S.3
Cernocky, J.4
-
23
-
-
85037535397
-
Multiple dimension Levenshtein edit distance calculations for evaluating automatic speech recognition systems during simultaneous speech
-
JG Fiscus, J Ajot, N Radde, and C Laprun, "Multiple dimension Levenshtein edit distance calculations for evaluating automatic speech recognition systems during simultaneous speech, " in Proc. LREC, 2006.
-
(2006)
Proc. LREC
-
-
Fiscus, J.G.1
Ajot, J.2
Radde, N.3
Laprun, C.4
-
24
-
-
0032638856
-
Semi-tied covariance matrices for hidden Markov models
-
MJF Gales, "Semi-tied covariance matrices for hidden Markov models, " IEEE Trans. Speech, Audio Process., vol. 7, no. 3, pp. 272-281, 1999.
-
(1999)
IEEE Trans. Speech, Audio Process
, vol.7
, Issue.3
, pp. 272-281
-
-
Gales, M.J.F.1
-
25
-
-
85009231870
-
Qualcomm-ICSI-OGI features for ASR
-
A Adami, L Burget, S Dupont, H Garudadri, F Grezl, H Hermansky, P Jain, S Kajarekar, N Morgan, and S Sivadas, "Qualcomm-ICSI-OGI features for ASR, " in Proc. ICSLP, 2002, pp. 21-24.
-
(2002)
Proc. ICSLP
, pp. 21-24
-
-
Adami, A.1
Burget, L.2
Dupont, S.3
Garudadri, H.4
Grezl, F.5
Hermansky, H.6
Jain, P.7
Kajarekar, S.8
Morgan, N.9
Sivadas, S.10
-
26
-
-
51449120120
-
Boosted MMI for model and feature-space discriminative training
-
D Povey, D Kanevsky, B Kingsbury, B Ramabhadran, G Saon, and K Visweswariah, "Boosted MMI for model and feature-space discriminative training, " in Proc. IEEE ICASSP, 2008, pp. 4057-4060.
-
(2008)
Proc. IEEE ICASSP
, pp. 4057-4060
-
-
Povey, D.1
Kanevsky, D.2
Kingsbury, B.3
Ramabhadran, B.4
Saon, G.5
Visweswariah, K.6
-
27
-
-
84874276847
-
The Kaldi speech recognition toolkit
-
D Povey, A Ghoshal, G Boulianne, L Burget, O Glembek, N Goel, M Hannemann, P Motlíček, Y Qian, P Schwarz, J Silovský, G Stemmer, and K Veselý, "The Kaldi speech recognition toolkit, " in Proc. IEEE ASRU, 2011.
-
(2011)
Proc. IEEE ASRU
-
-
Povey, D.1
Ghoshal, A.2
Boulianne, G.3
Burget, L.4
Glembek, O.5
Goel, N.6
Hannemann, M.7
Motlíček, P.8
Qian, Y.9
Schwarz, P.10
Silovský, J.11
Stemmer, G.12
Veselý, K.13
-
28
-
-
84873443879
-
Theano: A CPU and GPU math expression compiler
-
J Bergstra, O Breuleux, F Bastien, P Lamblin, R Pascanu, G Desjardins, J Turian, D Warde-Farley, and Y Bengio, "Theano: A CPU and GPU math expression compiler, " in Proc. SciPy, 2010.
-
(2010)
Proc. SciPy
-
-
Bergstra, J.1
Breuleux, O.2
Bastien, F.3
Lamblin, P.4
Pascanu, R.5
Desjardins, G.6
Turian, J.7
Warde-Farley, D.8
Bengio, Y.9
-
29
-
-
84055222005
-
Context-dependent pre-trained deep neural networks for large-vocabulary speech recognition
-
GE Dahl, D Yu, L Deng, and A Acero, "Context-dependent pre-trained deep neural networks for large-vocabulary speech recognition, " IEEE Trans. Audio, Speech, Language Process., vol. 20, no. 1, pp. 30-42, 2012.
-
(2012)
IEEE Trans. Audio, Speech, Language Process
, vol.20
, Issue.1
, pp. 30-42
-
-
Dahl, G.E.1
Yu, D.2
Deng, L.3
Acero, A.4
-
30
-
-
84874278045
-
Unsupervised cross-lingual knowledge transfer in DNN-based LVCSR
-
Miami, Florida, USA, Dec
-
P Swietojanski, A Ghoshal, and S Renals, "Unsupervised cross-lingual knowledge transfer in DNN-based LVCSR, " in Proc. IEEE Workshop on Spoken Language Technology, Miami, Florida, USA, Dec. 2012.
-
(2012)
Proc. IEEE Workshop on Spoken Language Technology
-
-
Swietojanski, P.1
Ghoshal, A.2
Renals, S.3
-
31
-
-
84906274730
-
Sequence-discriminative training of deep neural networks
-
K Veselý, A Ghoshal, L Burget, and D Povey, "Sequence-discriminative training of deep neural networks, " in Proc. INTERSPEECH, 2013.
-
(2013)
Proc. INTERSPEECH
-
-
Veselý, K.1
Ghoshal, A.2
Burget, L.3
Povey, D.4
-
32
-
-
84890492030
-
An investigation of deep neural networks for noise robust speech recognition
-
M Seltzer, D Yu, and Y Wang, "An investigation of deep neural networks for noise robust speech recognition, " in Proc. ICASSP, 2013.
-
(2013)
Proc. ICASSP
-
-
Seltzer, M.1
Yu, D.2
Wang, Y.3
-
33
-
-
84890461500
-
Multilingual training of deep neural networks
-
A Ghoshal, P Swietojanski, and S Renals, "Multilingual training of deep neural networks, " in Proc. IEEE ICASSP, 2013, pp. 7319-7323.
-
(2013)
Proc. IEEE ICASSP
, pp. 7319-7323
-
-
Ghoshal, A.1
Swietojanski, P.2
Renals, S.3