Volume , Issue , 2013, Pages 7596-7599

Audio-visual deep learning for noise robust speech recognition

Author keywords

Audio visual speech recognition; Deep belief networks; Noise robustness

Indexed keywords

AUDIO VISUAL SPEECH RECOGNITION; AUTOMATIC SPEECH RECOGNITION; DECISION FUSION METHODS; DEEP BELIEF NETWORK (DBN); DEEP BELIEF NETWORKS; GAUSSIAN MIXTURE MODEL; NOISE ROBUST SPEECH RECOGNITION; NOISE ROBUSTNESS

EID: 84890465549     PISSN: 15206149     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1109/ICASSP.2013.6639140     Document Type: Conference Paper
Times cited: 196

References (24)
  • 1
    • 0001048664
    • Visual contribution to speech intelligibility in noise
    • W.H. Sumby and I. Pollack (1954), "Visual contribution to speech intelligibility in noise," in J. Acoustical Society America, 26: 212-215.
    • (1954) J. Acoustical Society America , vol.26 , pp. 212-215
    • Sumby, W.H.1    Pollack, I.2
  • 2
    • 0032074310
    • Audio-visual integration in multi-modal communication
    • T. Chen and R.R. Rao (1998), "Audio-visual integration in multi-modal communication," in Proc. IEEE, 86(5): 837-852.
    • (1998) Proc. IEEE , vol.86 , Issue.5 , pp. 837-852
    • Chen, T.1    Rao, R.R.2
  • 4
    • 0034270644
    • Audio-visual speech modeling for continuous speech recognition
    • S. Dupont and J. Luettin (2000), "Audio-visual speech modeling for continuous speech recognition," in IEEE Trans. Multimedia, 2(3): 141-151.
    • (2000) IEEE Trans. Multimedia , vol.2 , Issue.3 , pp. 141-151
    • Dupont, S.1    Luettin, J.2
  • 7
    • 4544290191
    • Recent advances in the automatic recognition of audio-visual speech
    • G. Potamianos, C. Neti, G. Gravier, A. Garg, and A.W. Senior (2003), "Recent advances in the automatic recognition of audio-visual speech," in Proc. IEEE, 91(9): 1306-1326.
    • (2003) Proc. IEEE , vol.91 , Issue.9 , pp. 1306-1326
    • Potamianos, G.1    Neti, C.2    Gravier, G.3    Garg, A.4    Senior, A.W.5
  • 9
    • 10444261199
    • Audio-visual speech recognition using an infrared headset
    • J. Huang, G. Potamianos, J. Connell and C. Neti (2004), "Audio-visual speech recognition using an infrared headset," in Speech Communication 44(4), 83-96.
    • (2004) Speech Communication , vol.44 , Issue.4 , pp. 83-96
    • Huang, J.1    Potamianos, G.2    Connell, J.3    Neti, C.4
  • 11
    • 0036296863
    • Minimum phone error and i-smoothing for improved discriminative training
    • D. Povey and P. C. Woodland, "Minimum Phone Error and I-smoothing for Improved Discriminative Training," in Proceedings of ICASSP, 2002.
    • (2002) Proceedings of ICASSP
    • Povey, D.1    Woodland, P.C.2
  • 13
    • 34047244134
    • Discriminatively trained features using fmpe for multi-stream audio-visual speech recognition
    • J. Huang and D. Povey, "Discriminatively Trained Features Using fMPE for Multi-Stream Audio-Visual Speech Recognition," in Proceedings of Interspeech, 2005.
    • (2005) Proceedings of Interspeech
    • Huang, J.1    Povey, D.2
  • 14
    • 70450172282
    • Combined discriminative training for multi-stream hmm-based audio-visual speech recognition
    • J. Huang and K. Visweswariah, "Combined Discriminative Training for Multi-Stream HMM-based Audio-Visual Speech Recognition," in Proceedings of Interspeech, 2009.
    • (2009) Proceedings of Interspeech
    • Huang, J.1    Visweswariah, K.2
  • 16
    • 84867585919
    • Understanding how deep belief networks perform acoustic modelling
    • A. Mohamed, G. Hinton, G. Penn, "Understanding how Deep Belief Networks perform acoustic modelling," in Proceedings of ICASSP, 2012.
    • (2012) Proceedings of ICASSP
    • Mohamed, A.1    Hinton, G.2    Penn, G.3
  • 17
    • 85135321224
    • See me, hear me: Integrating automatic speech recognition and lipreading
    • P. Duchnowski, U. Meier, and A. Waibel, "See me, hear me: Integrating automatic speech recognition and lipreading," in Proceedings of ICSLP, 1994.
    • (1994) Proceedings of ICSLP
    • Duchnowski, P.1    Meier, U.2    Waibel, A.3
  • 18
  • 24
    • 33745805403
    • A fast learning algorithm for deep belief nets
    • G. E. Hinton, S. Osindero, and Y. Teh. "A Fast Learning Algorithm for Deep Belief Nets," in Neural Computation, vol. 18, pp. 1527-1554, 2006.
    • (2006) Neural Computation , vol.18 , pp. 1527-1554
    • Hinton, G.E.1    Osindero, S.2    Teh, Y.3


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.