SCOPUS Information Search Platform

ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings

Volume , Issue , 2014, Pages 5632-5636

Single-channel mixed speech recognition using deep neural networks

(4) Weng, Chao a; Yu, Dong b; Seltzer, Michael L. b; Droppo, Jasha b

a GEORGIA INSTITUTE OF TECHNOLOGY (United States)

b MICROSOFT RESEARCH (United States)

Author keywords

DNN; multi talker ASR; WFST

Indexed keywords

SPEECH RECOGNITION;

DEEP NEURAL NETWORKS; DNN; MULTI-TALKER ASR; NOISE ROBUSTNESS; SIMILAR PATTERN; SPEECH SEPARATION; TRAINING STRATEGY; WFST;

SIGNAL PROCESSING;

EID: 84905269210 PISSN: 15206149 EISSN: None Source Type: Conference Proceeding
DOI: 10.1109/ICASSP.2014.6854681 Document Type: Conference Paper

Times cited : (25)

References (19)

1
- 69249202377
- Monaural speech separation and recognition challenge
- Martin Cooke, John R. Hershey, and Steven J. Rennie, "Monaural speech separation and recognition challenge," Computer Speech and Language, vol. 24, no. 1, pp. 1-15, 2010.
- (2010) Computer Speech and Language , vol.24 , Issue.1 , pp. 1-15
- Cooke, M.¹ Hershey, J.R.² Rennie, S.J.³

2
- 44949258898
- Super-human multi-talker speech recognition: The IBM 2006 speech separation challenge system
- ISCA
- Trausti T. Kristjansson, John R. Hershey, Peder A. Olsen, Steven J. Rennie, and Ramesh A. Gopinath, "Super-human multi-talker speech recognition: the IBM 2006 speech separation challenge system," in INTERSPEECH. 2006, ISCA.
- (2006) INTERSPEECH
- Kristjansson, T.T.¹ Hershey, J.R.² Olsen, P.A.³ Rennie, S.J.⁴ Gopinath, R.A.⁵

3
- 44849140301
- Speech recognition using factorial hidden markov models for separation in the feature space
- ISCA
- Tuomas Virtanen, "Speech recognition using factorial hidden markov models for separation in the feature space," in INTERSPEECH. 2006, ISCA.
- (2006) INTERSPEECH
- Virtanen, T.¹

4
- 50249086925
- Monaural speech separation using source-adapted models
- R. J. Weiss and D. P. W. Ellis, "Monaural Speech Separation Using Source-Adapted Models," in Proc. IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), 2007, pp. 114-117.
- (2007) Proc. IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA) , pp. 114-117
- Weiss, R.J.¹ Ellis, D.P.W.²

5
- 0031268341
- Factorial hidden markov models
- Nov
- Zoubin Ghahramani and Michael I. Jordan, "Factorial hidden markov models," Mach. Learn., vol. 29, no. 2-3, pp. 245-273, Nov. 1997.
- (1997) Mach. Learn , vol.29 , Issue.2-3 , pp. 245-273
- Ghahramani, Z.¹ Jordan, M.I.²

6
- 69249231059
- Speech fragment decoding techniques for simultaneous speaker identification and speech recognition
- Jan
- Jon Barker, Ning Ma, Andre Coy, and Martin Cooke, "Speech fragment decoding techniques for simultaneous speaker identification and speech recognition," Comput. Speech Lang., vol. 24, no. 1, pp. 94-111, Jan. 2010.
- (2010) Comput. Speech Lang , vol.24 , Issue.1 , pp. 94-111
- Barker, J.¹ Ma, N.² Coy, A.³ Cooke, M.⁴

7
- 44949179273
- Combining missing-feature theory, speech enhancement and speaker-dependent/-independent modeling for speech separation
- ISCA
- Ji Ming, Timothy J. Hazen, and James R. Glass, "Combining missing-feature theory, speech enhancement and speaker-dependent/-independent modeling for speech separation," in INTERSPEECH. 2006, ISCA.
- (2006) INTERSPEECH
- Ming, J.¹ Hazen, T.J.² Glass, J.R.³

8
- 69249159165
- A computational auditory scene analysis system for speech segregation and robust speech recognition
- Yang Shao, Soundararajan Srinivasan, Zhaozhang Jin, and DeLiang Wang, "A computational auditory scene analysis system for speech segregation and robust speech recognition," Computer Speech and Language, vol. 24, no. 1, pp. 77-93, 2010.
- (2010) Computer Speech and Language , vol.24 , Issue.1 , pp. 77-93
- Shao, Y.¹ Srinivasan, S.² Jin, Z.³ Wang, D.⁴

9
- 44949110218
- Single-channel speech separation using sparse non-negative matrix factorization
- sep
- M. N. Schmidt and R. K. Olsson, "Single-channel speech separation using sparse non-negative matrix factorization," in Interspeech, sep 2006.
- (2006) Interspeech
- Schmidt, M.N.¹ Olsson, R.K.²

10
- 44949138160
- Enhancement of harmonic content of speech based on a dynamic programming pitch tracking algorithm
- Mark R. Every and Philip J. B. Jackson, "Enhancement of harmonic content of speech based on a dynamic programming pitch tracking algorithm," in INTERSPEECH, 2006.
- (2006) INTERSPEECH
- Every, M.R.¹ Jackson, P.J.B.²

11
- 85032751458
- Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups
- Geoffrey E. Hinton, Li Deng, Dong Yu, George E. Dahl, Abdel-rahman Mohamed, Navdeep Jaitly, Andrew Senior, Vincent Vanhoucke, Patrick Nguyen, Tara N. Sainath, and Brian Kingsbury, "Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups," IEEE Signal Process. Mag., vol. 29, no. 6, pp. 82-97, 2012.
- (2012) IEEE Signal Process. Mag , vol.29 , Issue.6 , pp. 82-97
- Hinton, G.E.¹ Deng, L.² Yu, D.³ Dahl, G.E.⁴ Mohamed, A.R.⁵ Jaitly, N.⁶ Senior, A.⁷ Vanhoucke, V.⁸ Nguyen, P.⁹ Sainath, T.N.¹⁰ Kingsbury, B.¹¹

12
- 84055222005
- Context-dependent pre-trained deep neural networks for large-vocabulary speech recognition
- jan
- G.E. Dahl, Dong Yu, Li Deng, and A. Acero, "Context-dependent pre-trained deep neural networks for large-vocabulary speech recognition," Audio, Speech, and Language Processing, IEEE Transactions on, vol. 20, no. 1, pp. 30-42, jan. 2012.
- (2012) Audio, Speech, and Language Processing, IEEE Transactions on , vol.20 , Issue.1 , pp. 30-42
- Dahl, G.E.¹ Yu, D.² Deng, L.³ Acero, A.⁴

13
- 84874282188
- Improving wideband speech recognition using mixed-bandwidth training data in CD-DNN-HMM
- IEEE
- Jinyu Li, Dong Yu, Jui-Ting Huang, and Yifan Gong, "Improving wideband speech recognition using mixed-bandwidth training data in CD-DNN-HMM," in SLT. 2012, pp. 131-136, IEEE.
- (2012) SLT , pp. 131-136
- Li, J.¹ Yu, D.² Huang, J.-T.³ Gong, Y.⁴

14
- 85083953021
- Feature learning in deep neural networks-A study on speech recognition tasks
- abs/1301.3605
- Dong Yu, Michael L. Seltzer, Jinyu Li, Jui-Ting Huang, and Frank Seide, "Feature learning in deep neural networks-A study on speech recognition tasks," CoRR, vol. abs/1301.3605, 2013.
- (2013) CoRR
- Yu, D.¹ Seltzer, M.L.² Li, J.³ Huang, J.-T.⁴ Seide, F.⁵

15
- 84890492030
- An investigation of deep neural networks for noise robust speech recognition
- M. L. Seltzer, D. Yu, and Y.-Q. Wang, "An investigation of deep neural networks for noise robust speech recognition," in Proc. ICASSP2013, 2013.
- (2013) Proc. ICASSP2013
- Seltzer, M.L.¹ Yu, D.² Wang, Y.-Q.³

16
- 0023263708
- Multi-style training for robust isolated-word speech recognition
- R. Lippmann, E. Martin, and D.B. Paul, "Multi-style training for robust isolated-word speech recognition," in Proc. ICASSP1987, 1987.
- (1987) Proc. ICASSP1987
- Lippmann, R.¹ Martin, E.² Paul, D.B.³

17
- 33750368310
- An audio-visual corpus for speech perception and automatic speech recognition
- November
- Martin Cooke, Jon Barker, Stuart Cunningham, and Xu Shao, "An audio-visual corpus for speech perception and automatic speech recognition," The Journal of the Acoustical Society of America, vol. 120, no. 5, pp. 2421-2424, November 2006.
- (2006) The Journal of the Acoustical Society of America , vol.120 , Issue.5 , pp. 2421-2424
- Cooke, M.¹ Barker, J.² Cunningham, S.³ Shao, X.⁴

18
- 84055211743
- Acoustic modeling using deep belief networks
- jan
- A. Mohamed, G.E. Dahl, and G. Hinton, "Acoustic modeling using deep belief networks," Audio, Speech, and Language Processing, IEEE Transactions on, vol. 20, no. 1, pp. 14-22, jan. 2012.
- (2012) Audio, Speech, and Language Processing, IEEE Transactions on , vol.20 , Issue.1 , pp. 14-22
- Mohamed, A.¹ Dahl, G.E.² Hinton, G.³

19
- 84858976070
- Feature engineering in context-dependent deep neural networks for conversational speech transcription
- Frank Seide, Gang Li, Xie Chen, and Dong Yu, "Feature engineering in context-dependent deep neural networks for conversational speech transcription," in ASRU, 2011, pp. 24-29.
- (2011) ASRU , pp. 24-29
- Seide, F.¹ Li, G.² Chen, X.³ Yu, D.⁴

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.