-
1
-
-
0011990786
-
The meeting project at ICSI
-
N. Morgan, D. Baron, J. Edwards, D. Ellis, D. Gelbart, A. Janin, T. Pfau, E. Shriberg, and A. Stolcke, "The Meeting Project at ICSI," in Proceedings of the First International Conference on Human Language Technology Research, 2001, pp. 1-7.
-
(2001)
Proceedings of the First International Conference on Human Language Technology Research
, pp. 1-7
-
-
Morgan, N.1
Baron, D.2
Edwards, J.3
Ellis, D.4
Gelbart, D.5
Janin, A.6
Pfau, T.7
Shriberg, E.8
Stolcke, A.9
-
2
-
-
0141469852
-
Multispeaker speech activity detection for the ICSI meeting recorder
-
T. Pfau, D. P. Ellis, and A. Stolcke, "Multispeaker Speech Activity Detection for the ICSI Meeting Recorder," in Proceedings of Automatic Speech Recognition and Understanding, 2001, pp. 107-110.
-
(2001)
Proceedings of Automatic Speech Recognition and Understanding
, pp. 107-110
-
-
Pfau, T.1
Ellis, D.P.2
Stolcke, A.3
-
3
-
-
0036293830
-
An overview of automatic speaker recognition technology
-
D. A. Reynolds, "An overview of automatic speaker recognition technology," in Proceedings of ICASSP, vol. 4, 2002, pp. 4072-4075.
-
(2002)
Proceedings of ICASSP
, vol.4
, pp. 4072-4075
-
-
Reynolds, D.A.1
-
4
-
-
84873315510
-
Unsupervised speech activity detection using voicing measures and perceptual spectral flux
-
IEEE
-
S. Sadjadi and J. Hansen, "Unsupervised speech activity detection using voicing measures and perceptual spectral flux," Signal Processing Letters, IEEE, vol. 20, pp. 197-200, 2013.
-
(2013)
Signal Processing Letters
, vol.20
, pp. 197-200
-
-
Sadjadi, S.1
Hansen, J.2
-
5
-
-
34047272330
-
Discrimination of speech from nonspeech based on multiscale spectro-temporal modulations
-
N. Mesgarani, M. Slaney, and S. A. Shamma, "Discrimination of speech from nonspeech based on multiscale spectro-temporal modulations," Audio, Speech, and Language Processing, IEEE Transactions on, vol. 14, no. 3, pp. 920-930, 2006.
-
(2006)
Audio, Speech, and Language Processing, IEEE Transactions on
, vol.14
, Issue.3
, pp. 920-930
-
-
Mesgarani, N.1
Slaney, M.2
Shamma, S.A.3
-
6
-
-
84878535284
-
Developing a speech activity detection system for the DARPA RATS program
-
T. Ng, B. Zhang, L. Nguyen, S. Matsoukas, K. Vesely, P. Matejka, X. Zhu, and N. Mesgarani, "Developing a speech activity detection system for the DARPA RATS program," in Proceedings of InterSpeech, 2012.
-
(2012)
Proceedings of InterSpeech
-
-
Ng, T.1
Zhang, B.2
Nguyen, L.3
Matsoukas, S.4
Vesely, K.5
Matejka, P.6
Zhu, X.7
Mesgarani, N.8
-
7
-
-
79959838316
-
Voice activity detection based on conditional random fields using multiple features
-
A. Saito, Y. Nankaku, A. Lee, and K. Tokuda, "Voice activity detection based on conditional random fields using multiple features," in Proceedings of InterSpeech, 2010, pp. 2086-2089.
-
(2010)
Proceedings of InterSpeech
, pp. 2086-2089
-
-
Saito, A.1
Nankaku, Y.2
Lee, A.3
Tokuda, K.4
-
8
-
-
80051623447
-
Speaker diarization of heterogeneous web video files: A preliminary study
-
P. Clement, T. Bazillon, and C. Fredouille, "Speaker diarization of heterogeneous web video files: A preliminary study," in Proceedings of ICASSP, 2011, pp. 4432-4435.
-
(2011)
Proceedings of ICASSP
, pp. 4432-4435
-
-
Clement, P.1
Bazillon, T.2
Fredouille, C.3
-
9
-
-
84878610785
-
Speech/nonspeech segmentation in web videos
-
A. Misra, "Speech/nonspeech segmentation in web videos," in Proceedings of InterSpeech, 2012.
-
(2012)
Proceedings of InterSpeech
-
-
Misra, A.1
-
10
-
-
33745805403
-
A fast learning algorithm for deep belief nets
-
G. E. Hinton, S. Osindero, and Y.-W. Teh, "A fast learning algorithm for deep belief nets," Neural Computation, vol. 18, no. 7, pp. 1527-1554, 2006.
-
(2006)
Neural Computation
, vol.18
, Issue.7
, pp. 1527-1554
-
-
Hinton, G.E.1
Osindero, S.2
Teh, Y.-W.3
-
11
-
-
84937454179
-
Creating HAVIC: Heterogeneous audio visual internet collection
-
S. Strassel, A. Morris, J. Fiscus, C. Caruso, H. Lee, P. Over, J. Fiumara, B. Shaw, B. Antonishek, and M. Michel, "Creating HAVIC: Heterogeneous Audio Visual Internet Collection," in Proceedings of the Eighth International Conference on Language Resources and Evaluation, 2012.
-
(2012)
Proceedings of the Eighth International Conference on Language Resources and Evaluation
-
-
Strassel, S.1
Morris, A.2
Fiscus, J.3
Caruso, C.4
Lee, H.5
Over, P.6
Fiumara, J.7
Shaw, B.8
Antonishek, B.9
Michel, M.10
-
13
-
-
33745577702
-
The rich transcription 2005 spring meeting recognition evaluation
-
J. Fiscus, N. Radde, J. Garofolo, A. Le, J. Ajot, and C. Laprun, "The Rich Transcription 2005 Spring Meeting Recognition Evaluation," Machine Learning for Multimodal Interaction, pp. 369-389, 2006.
-
(2006)
Machine Learning for Multimodal Interaction
, pp. 369-389
-
-
Fiscus, J.1
Radde, N.2
Garofolo, J.3
Le, A.4
Ajot, J.5
Laprun, C.6
-
14
-
-
23344452899
-
Statistical voice activity detection using a multiple observation likelihood ratio test
-
IEEE
-
J. Ramirez, J. C. Segura, C. Benitez, L. Garcia, and A. Rubio, "Statistical voice activity detection using a multiple observation likelihood ratio test," Signal Processing Letters, IEEE, vol. 12, no. 10, pp. 689-692, 2005.
-
(2005)
Signal Processing Letters
, vol.12
, Issue.10
, pp. 689-692
-
-
Ramirez, J.1
Segura, J.C.2
Benitez, C.3
Garcia, L.4
Rubio, A.5
-
16
-
-
84055222005
-
Context-dependent pre-trained deep neural networks for large-vocabulary speech recognition
-
G. E. Dahl, D. Yu, L. Deng, and A. Acero, "Context-dependent pre-trained deep neural networks for large-vocabulary speech recognition," Audio, Speech, and Language Processing, IEEE Transactions on, vol. 20, no. 1, pp. 30-42, 2012.
-
(2012)
Audio, Speech, and Language Processing, IEEE Transactions on
, vol.20
, Issue.1
, pp. 30-42
-
-
Dahl, G.E.1
Yu, D.2
Deng, L.3
Acero, A.4