SCOPUS Information Retrieval Platform

Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH

Volume 08-12-September-2016, 2016, Pages 2135-2139

Dynamic stream weighting for turbo-decoding-based audiovisual ASR

(5) Gergen, Sebastian (a); Zeiler, Steffen (a); Abdelaziz, Ahmed Hussen (b); Nickel, Robert (c); Kolossa, Dorothea (a)

a RUHR UNIVERSITY BOCHUM (Germany)

b INTERNATIONAL COMPUTER SCIENCE INSTITUTE (United States)

c BUCKNELL UNIVERSITY (United States)

Author keywords

Audiovisual speech recognition; Stream weighting; Turbo decoding

Indexed keywords

AUDIO ACOUSTICS; COST FUNCTIONS; DECODING; FACE RECOGNITION; HIDDEN MARKOV MODELS; MARKOV PROCESSES; SPEECH COMMUNICATION; SPEECH PROCESSING; VIDEO STREAMING;

AUDIO VISUAL SPEECH RECOGNITION; AUTOMATIC SPEECH RECOGNITION; COUPLED HIDDEN MARKOV MODELS; HUMAN MACHINE INTERACTION; RECOGNITION ACCURACY; SIGNAL DEGRADATION; STREAM WEIGHTING; TURBO DECODING;

SPEECH RECOGNITION;

EID: 84994339086 PISSN: 2308457X EISSN: 19909772 Source Type: Conference Proceeding
DOI: 10.21437/Interspeech.2016-166 Document Type: Conference Paper

Times cited : (34)

References (22)

1
- 0036874999
- Dynamic Bayesian networks for audio-visual speech recognition
- A. V. Nefian, L. Liang, X. Pi, X. Liu, and K. Murphy, "Dynamic Bayesian networks for audio-visual speech recognition," EURASIP Journal on Applied Signal Processing, vol. 11, pp. 1-15, 2002.
- (2002) EURASIP Journal on Applied Signal Processing , vol.11 , pp. 1-15
- Nefian, A.V.¹ Liang, L.² Pi, X.³ Liu, X.⁴ Murphy, K.⁵

2
- 4544290191
- Recent advances in the automatic recognition of audiovisual speech
- Sep
- G. Potamianos, C. Neti, G. Gravier, A. Garg, and A. W. Senior, "Recent advances in the automatic recognition of audiovisual speech," Proceedings of the IEEE, vol. 91, no. 9, pp. 1306-1326, Sep. 2003.
- (2003) Proceedings of the IEEE , vol.91 , Issue.9 , pp. 1306-1326
- Potamianos, G.¹ Neti, C.² Gravier, G.³ Garg, A.⁴ Senior, A.W.⁵

3
- 84867337739
- Use of missing and unreliable data for audiovisual speech recognition
- D. Kolossa and R. Haeb-Umbach, Eds., Springer
- A. Vorwerk, S. Zeiler, D. Kolossa, R. F. Astudillo, and D. Lerch, "Use of missing and unreliable data for audiovisual speech recognition," in Robust Speech Recognition of Uncertain or Missing Data, D. Kolossa and R. Haeb-Umbach, Eds. Springer, 2011, pp. 345-375. [Online]. Available: http://dx.doi.org/10.1007/978-3-642-21317-5
- (2011) Robust Speech Recognition of Uncertain or Missing Data , pp. 345-375
- Vorwerk, A.¹ Zeiler, S.² Kolossa, D.³ Astudillo, R.F.⁴ Lerch, D.⁵

4
- 84973384871
- A turbo-decoding weighted forward-backward algorithm for multimodal speech recognition
- S. Receveur, D. Scheler, and T. Fingscheidt, "A turbo-decoding weighted forward-backward algorithm for multimodal speech recognition," in 5th International Workshop on Spoken Dialog Systems, 2014, pp. 4-15.
- (2014) 5th International Workshop on Spoken Dialog Systems , pp. 4-15
- Receveur, S.¹ Scheler, D.² Fingscheidt, T.³

5
- 80053437179
- Multimodal deep learning
- J. Ngiam, A. Khosla, M. Kim, J. Nam, H. Lee, and A. Y. Ng, "Multimodal deep learning," in Proceedings of the 28th International Conference on Machine Learning, 2011, pp. 1-8.
- (2011) Proceedings of the 28th International Conference on Machine Learning , pp. 1-8
- Ngiam, J.¹ Khosla, A.² Kim, M.³ Nam, J.⁴ Lee, H.⁵ Ng, A.Y.⁶

6
- 84986214282
- Audio-visual speech recognition using deep bottleneck features and high-performance lipreading
- S. Tamura, H. Ninomiya, N. Kitaoka, S. Osuga, Y. Iribe, K. Takeda, and S. Hayamizu, "Audio-visual speech recognition using deep bottleneck features and high-performance lipreading," in Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA), 2015, pp. 575-582.
- (2015) Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA) , pp. 575-582
- Tamura, S.¹ Ninomiya, H.² Kitaoka, N.³ Osuga, S.⁴ Iribe, Y.⁵ Takeda, K.⁶ Hayamizu, S.⁷

7
- 84946030446
- Deep multimodal learning for audio-visual speech recognition
- IEEE
- Y. Mroueh, E. Marcheret, and V. Goel, "Deep multimodal learning for audio-visual speech recognition," in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2015, pp. 2130-2134.
- (2015) IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) , pp. 2130-2134
- Mroueh, Y.¹ Marcheret, E.² Goel, V.³

8
- 0034270644
- Audio-visual speech modeling for continuous speech recognition
- S. Dupont and J. Luettin, "Audio-visual speech modeling for continuous speech recognition," IEEE Transactions on Multimedia, vol. 2, no. 3, pp. 141-151, 2000. [Online]. Available: http://dx.doi.org/10.1109/6046.865479
- (2000) IEEE Transactions on Multimedia , vol.2 , Issue.3 , pp. 141-151
- Dupont, S.¹ Luettin, J.²

9
- 84905216437
- A new EM estimation of dynamic stream weights for coupled-HMM-based audio-visual ASR
- IEEE
- A. H. Abdelaziz, S. Zeiler, and D. Kolossa, "A new EM estimation of dynamic stream weights for coupled-HMM-based audio-visual ASR," in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2014, pp. 1527-1531.
- (2014) IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) , pp. 1527-1531
- Abdelaziz, A.H.¹ Zeiler, S.² Kolossa, D.³

10
- 0024610919
- A tutorial on hidden Markov models and selected applications in speech recognition
- L. R. Rabiner, "A tutorial on hidden Markov models and selected applications in speech recognition," Proceedings of the IEEE, vol. 77, no. 2, pp. 257-286, 1989.
- (1989) Proceedings of the IEEE , vol.77 , Issue.2 , pp. 257-286
- Rabiner, L.R.¹

11
- 84867337739
- D. Kolossa and R. Haeb-Umbach, Eds., Springer
- R. Haeb-Umbach and D. Kolossa, Robust Speech Recognition of Uncertain or Missing Data - Theory and Applications, D. Kolossa and R. Haeb-Umbach, Eds. Springer, 2011.
- (2011) Robust Speech Recognition of Uncertain or Missing Data - Theory and Applications
- Haeb-Umbach, R.¹ Kolossa, D.²

12
- 84994302041
- Turbo automatic speech recognition
- S. Receveur, R. Weiss, and T. Fingscheidt, "Turbo automatic speech recognition," IEEE/ACM Transactions on Audio, Speech, and Language Processing (TASLP), vol. 99, pp. 1-1, 2016.
- (2016) IEEE/ACM Transactions on Audio, Speech, and Language Processing (TASLP) , vol.99 , pp. 1
- Receveur, S.¹ Weiss, R.² Fingscheidt, T.³

13
- 0027297425
- Near Shannon limit error-correcting coding and decoding: Turbo codes
- C. Berrou, A. Glavieux, and P. Thitimajshima, "Near Shannon limit error-correcting coding and decoding: Turbo codes," in IEEE International Conference on Communications, 1993, pp. 1064-1070.
- (1993) IEEE International Conference on Communications , pp. 1064-1070
- Berrou, C.¹ Glavieux, A.² Thitimajshima, P.³

14
- 0030257652
- Near optimum error-correcting coding and decoding: Turbo Codes
- Oct
- C. Berrou and A. Glavieux, "Near optimum error-correcting coding and decoding: Turbo Codes," IEEE Transactions on Communications, vol. 44, no. 10, pp. 1261-1271, Oct. 1996.
- (1996) IEEE Transactions on Communications , vol.44 , Issue.10 , pp. 1261-1271
- Berrou, C.¹ Glavieux, A.²

15
- 51449122700
- Multimodal information fusion using the iterative decoding algorithm and its application to audio-visual speech recognition
- IEEE
- S. T. Shivappa, B. D. Rao, and M. M. Trivedi, "Multimodal information fusion using the iterative decoding algorithm and its application to audio-visual speech recognition," in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2008, pp. 2241-2244.
- (2008) IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) , pp. 2241-2244
- Shivappa, S.T.¹ Rao, B.D.² Trivedi, M.M.³

16
- 84973305073
- Robust audiovisual speech recognition using noise-adaptive linear discriminant analysis
- IEEE
- S. Zeiler, R. Nickel, N. Ma, G. J. Brown, and D. Kolossa, "Robust audiovisual speech recognition using noise-adaptive linear discriminant analysis," in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2016, pp. 1-2.
- (2016) IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) , pp. 1-2
- Zeiler, S.¹ Nickel, R.² Ma, N.³ Brown, G.J.⁴ Kolossa, D.⁵

17
- 84983190368
- Learning dynamic stream weights for coupled-HMM-based audio-visual speech recognition
- A. H. Abdelaziz, S. Zeiler, and D. Kolossa, "Learning dynamic stream weights for coupled-HMM-based audio-visual speech recognition," IEEE Transactions on Audio, Speech & Language Processing, vol. 23, no. 5, pp. 863-876, 2015.
- (2015) IEEE Transactions on Audio, Speech & Language Processing , vol.23 , Issue.5 , pp. 863-876
- Abdelaziz, A.H.¹ Zeiler, S.² Kolossa, D.³

18
- 33750368310
- An audiovisual corpus for speech perception and automatic speech recognition
- M. Cooke, J. Barker, S. Cunningham, and X. Shao, "An audiovisual corpus for speech perception and automatic speech recognition," The Journal of the Acoustical Society of America, vol. 120, no. 5, pp. 2421-2424, 2006.
- (2006) The Journal of the Acoustical Society of America , vol.120 , Issue.5 , pp. 2421-2424
- Cooke, M.¹ Barker, J.² Cunningham, S.³ Shao, X.⁴

19
- 0022890536
- Maximum mutual information estimation of hidden Markov model parameters for speech recognition
- IEEE
- L. Bahl, P. Brown, P. de Souza, and R. Mercer, "Maximum mutual information estimation of hidden Markov model parameters for speech recognition," in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 1986, pp. 999-999.
- (1986) IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) , pp. 999
- Bahl, L.¹ Brown, P.² De Souza, P.³ Mercer, R.⁴

20
- 51449120120
- Boosted MMI for model and feature-space discriminative training
- IEEE
- D. Povey, D. Kanevsky, B. Kingsbury, B. Ramabhadran, G. Saon, and K. Visweswariah, "Boosted MMI for model and feature-space discriminative training," in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2008, pp. 4057-4060.
- (2008) IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) , pp. 4057-4060
- Povey, D.¹ Kanevsky, D.² Kingsbury, B.³ Ramabhadran, B.⁴ Saon, G.⁵ Visweswariah, K.⁶

21
- 84876672166
- Machine learning paradigms for speech recognition: An overview
- L. Deng and X. Li, "Machine learning paradigms for speech recognition: An overview," IEEE Transactions on Audio, Speech, and Language Processing, vol. 21, no. 5, pp. 1060-1089, 2013.
- (2013) IEEE Transactions on Audio, Speech, and Language Processing , vol.21 , Issue.5 , pp. 1060-1089
- Deng, L.¹ Li, X.²

22
- 0004088857
- Technical Report IZF 1988-3, TNO Institute for Perception, Soesterberg, The Netherlands
- H. J. M. Steeneken and F. W. M. Geurtsen, "Description of the RSG-10 noise database," in Technical Report IZF 1988-3, TNO Institute for Perception, Soesterberg, The Netherlands, 1988.
- (1988) Description of the RSG-10 Noise Database
- Steeneken, H.J.M.¹ Geurtsen, F.W.M.²

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.