



Volume , Issue , 2014, Pages 5632-5636

Single-channel mixed speech recognition using deep neural networks

Author keywords

DNN; multi talker ASR; WFST

Indexed keywords

SPEECH RECOGNITION;

EID: 84905269210     PISSN: 15206149     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1109/ICASSP.2014.6854681     Document Type: Conference Paper
Times cited: 25

References (19)
  • 1
    • 69249202377
    • Monaural speech separation and recognition challenge
    • Martin Cooke, John R. Hershey, and Steven J. Rennie, "Monaural speech separation and recognition challenge," Computer Speech and Language, vol. 24, no. 1, pp. 1-15, 2010.
    • (2010) Computer Speech and Language , vol.24 , Issue.1 , pp. 1-15
    • Cooke, M.1    Hershey, J.R.2    Rennie, S.J.3
  • 2
    • 44949258898
    • Super-human multi-talker speech recognition: The IBM 2006 speech separation challenge system
    • ISCA
    • Trausti T. Kristjansson, John R. Hershey, Peder A. Olsen, Steven J. Rennie, and Ramesh A. Gopinath, "Super-human multi-talker speech recognition: The IBM 2006 speech separation challenge system," in INTERSPEECH, 2006, ISCA.
    • (2006) INTERSPEECH
    • Kristjansson, T.T.1    Hershey, J.R.2    Olsen, P.A.3    Rennie, S.J.4    Gopinath, R.A.5
  • 3
    • 44849140301
    • Speech recognition using factorial hidden Markov models for separation in the feature space
    • ISCA
    • Tuomas Virtanen, "Speech recognition using factorial hidden Markov models for separation in the feature space," in INTERSPEECH, 2006, ISCA.
    • (2006) INTERSPEECH
    • Virtanen, T.1
  • 5
    • 0031268341
    • Factorial hidden Markov models
    • Nov
    • Zoubin Ghahramani and Michael I. Jordan, "Factorial hidden Markov models," Mach. Learn., vol. 29, no. 2-3, pp. 245-273, Nov. 1997.
    • (1997) Mach. Learn , vol.29 , Issue.2-3 , pp. 245-273
    • Ghahramani, Z.1    Jordan, M.I.2
  • 6
    • 69249231059
    • Speech fragment decoding techniques for simultaneous speaker identification and speech recognition
    • Jan
    • Jon Barker, Ning Ma, Andre Coy, and Martin Cooke, "Speech fragment decoding techniques for simultaneous speaker identification and speech recognition," Comput. Speech Lang., vol. 24, no. 1, pp. 94-111, Jan. 2010.
    • (2010) Comput. Speech Lang , vol.24 , Issue.1 , pp. 94-111
    • Barker, J.1    Ma, N.2    Coy, A.3    Cooke, M.4
  • 7
    • 44949179273
    • Combining missing-feature theory, speech enhancement and speaker-dependent/-independent modeling for speech separation
    • ISCA
    • Ji Ming, Timothy J. Hazen, and James R. Glass, "Combining missing-feature theory, speech enhancement and speaker-dependent/-independent modeling for speech separation," in INTERSPEECH, 2006, ISCA.
    • (2006) INTERSPEECH
    • Ming, J.1    Hazen, T.J.2    Glass, J.R.3
  • 8
    • 69249159165
    • A computational auditory scene analysis system for speech segregation and robust speech recognition
    • Yang Shao, Soundararajan Srinivasan, Zhaozhang Jin, and DeLiang Wang, "A computational auditory scene analysis system for speech segregation and robust speech recognition," Computer Speech and Language, vol. 24, no. 1, pp. 77-93, 2010.
    • (2010) Computer Speech and Language , vol.24 , Issue.1 , pp. 77-93
    • Shao, Y.1    Srinivasan, S.2    Jin, Z.3    Wang, D.4
  • 9
    • 44949110218
    • Single-channel speech separation using sparse non-negative matrix factorization
    • Sep
    • M. N. Schmidt and R. K. Olsson, "Single-channel speech separation using sparse non-negative matrix factorization," in Interspeech, Sep. 2006.
    • (2006) Interspeech
    • Schmidt, M.N.1    Olsson, R.K.2
  • 10
    • 44949138160
    • Enhancement of harmonic content of speech based on a dynamic programming pitch tracking algorithm
    • Mark R. Every and Philip J. B. Jackson, "Enhancement of harmonic content of speech based on a dynamic programming pitch tracking algorithm," in INTERSPEECH, 2006.
    • (2006) INTERSPEECH
    • Every, M.R.1    Jackson, P.J.B.2
  • 12
    • 84055222005
    • Context-dependent pre-trained deep neural networks for large-vocabulary speech recognition
    • Jan
    • G.E. Dahl, Dong Yu, Li Deng, and A. Acero, "Context-dependent pre-trained deep neural networks for large-vocabulary speech recognition," Audio, Speech, and Language Processing, IEEE Transactions on, vol. 20, no. 1, pp. 30-42, Jan. 2012.
    • (2012) Audio, Speech, and Language Processing, IEEE Transactions on , vol.20 , Issue.1 , pp. 30-42
    • Dahl, G.E.1    Yu, D.2    Deng, L.3    Acero, A.4
  • 13
    • 84874282188
    • Improving wideband speech recognition using mixed-bandwidth training data in CD-DNN-HMM
    • IEEE
    • Jinyu Li, Dong Yu, Jui-Ting Huang, and Yifan Gong, "Improving wideband speech recognition using mixed-bandwidth training data in CD-DNN-HMM," in SLT, 2012, pp. 131-136, IEEE.
    • (2012) SLT , pp. 131-136
    • Li, J.1    Yu, D.2    Huang, J.-T.3    Gong, Y.4
  • 14
    • 85083953021
    • Feature learning in deep neural networks - A study on speech recognition tasks
    • abs/1301.3605
    • Dong Yu, Michael L. Seltzer, Jinyu Li, Jui-Ting Huang, and Frank Seide, "Feature learning in deep neural networks - A study on speech recognition tasks," CoRR, vol. abs/1301.3605, 2013.
    • (2013) CoRR
    • Yu, D.1    Seltzer, M.L.2    Li, J.3    Huang, J.-T.4    Seide, F.5
  • 15
    • 84890492030
    • An investigation of deep neural networks for noise robust speech recognition
    • M. L. Seltzer, D. Yu, and Y.-Q. Wang, "An investigation of deep neural networks for noise robust speech recognition," in Proc. ICASSP 2013, 2013.
    • (2013) Proc. ICASSP 2013
    • Seltzer, M.L.1    Yu, D.2    Wang, Y.-Q.3
  • 16
    • 0023263708
    • Multi-style training for robust isolated-word speech recognition
    • R. Lippmann, E. Martin, and D.B. Paul, "Multi-style training for robust isolated-word speech recognition," in Proc. ICASSP 1987, 1987.
    • (1987) Proc. ICASSP 1987
    • Lippmann, R.1    Martin, E.2    Paul, D.B.3
  • 17
    • 33750368310
    • An audio-visual corpus for speech perception and automatic speech recognition
    • November
    • Martin Cooke, Jon Barker, Stuart Cunningham, and Xu Shao, "An audio-visual corpus for speech perception and automatic speech recognition," The Journal of the Acoustical Society of America, vol. 120, no. 5, pp. 2421-2424, November 2006.
    • (2006) The Journal of the Acoustical Society of America , vol.120 , Issue.5 , pp. 2421-2424
    • Cooke, M.1    Barker, J.2    Cunningham, S.3    Shao, X.4
  • 19
    • 84858976070
    • Feature engineering in context-dependent deep neural networks for conversational speech transcription
    • Frank Seide, Gang Li, Xie Chen, and Dong Yu, "Feature engineering in context-dependent deep neural networks for conversational speech transcription," in ASRU, 2011, pp. 24-29.
    • (2011) ASRU , pp. 24-29
    • Seide, F.1    Li, G.2    Chen, X.3    Yu, D.4


* This information was extracted and analyzed by KISTI from Elsevier's SCOPUS database.