SCOPUS Information Search Platform

Information Fusion

Volume 5, Issue 2, 2004, Pages 77-80

Robust speech processing using multi-sensor multi-source information fusion - An overview of the state of the art

(2) Aarabi, Parham a Dasarathy, Belur V b

a UNIVERSITY OF TORONTO (Canada)

b NONE

Author keywords

Multisensor information fusion; Robust speech processing; Speech recognition

Indexed keywords

CAMERAS; COMPUTATIONAL COMPLEXITY; COMPUTATIONAL METHODS; DATA PROCESSING; MICROPHONES; SENSOR DATA FUSION; SIGNAL TO NOISE RATIO; SPEECH RECOGNITION;

MULTISENSOR INFORMATION FUSION; ROBUST SPEECH PROCESSING;

SPEECH PROCESSING;

EID: 1842843686 PISSN: 15662535 EISSN: None Source Type: Journal
DOI: 10.1016/j.inffus.2004.02.001 Document Type: Article

Times cited : (12)

References (39)

1
- 0036383219
- Robust speech separation using visually derived speech signals
- P. Aarabi, N.H. Khameneh, Robust speech separation using visually derived speech signals, in: Proceedings of Sensor Fusion: Architectures, Algorithms, and Applications, 2002, pp. 239-247.
- (2002) Proceedings of Sensor Fusion: Architectures, Algorithms, and Applications , pp. 239-247
- Aarabi, P.¹ Khameneh, N.H.²

2
- 0035458007
- Robust sound localization using multi-source audio-visual information fusion
- Aarabi P., Zaky S. Robust sound localization using multi-source audio-visual information fusion. Information Fusion. 2(3):2001;209-223.
- (2001) Information Fusion , vol.2 , Issue.3 , pp. 209-223
- Aarabi, P.¹ Zaky, S.²

3
- 84863153827
- Integrated vision and sound localization
- P. Aarabi, S. Zaky, Integrated vision and sound localization, in: Proceedings of the 3rd International Conference on Information Fusion, 2000, pp. ThB3:21-27.
- (2000) Proceedings of the 3rd International Conference on Information Fusion
- Aarabi, P.¹ Zaky, S.²

4
- 0034215560
- Elucidative fusion systems - An exposition
- Dasarathy B.V. Elucidative fusion systems - an exposition. Information Fusion. 1(1):2000;5-15.
- (2000) Information Fusion , vol.1 , Issue.1 , pp. 5-15
- Dasarathy, B.V.¹

5
- 0002363291
- Nonlinear representation for audio-visual fusion in a noisy-vowel recognition task
- A. Guerin-Dugue, P. Teissier, J.L. Schwartz, J. Herault, Nonlinear representation for audio-visual fusion in a noisy-vowel recognition task, in: Proceedings of the International Conference on Neural Networks and Their Applications, 1997, pp. 31-40.
- (1997) Proceedings of the International Conference on Neural Networks and Their Applications , pp. 31-40
- Guerin-Dugue, A.¹ Teissier, P.² Schwartz, J.L.³ Herault, J.⁴

6
- 0032663816
- Information fusion benefits delineation under off-nominal scenarios
- B.V. Dasarathy, Information fusion benefits delineation under off-nominal scenarios, in: Proceedings of the SPIE Conference on Sensor Fusion: Architectures, Algorithms, and Applications, 1999, pp. 2-13.
- (1999) Proceedings of the SPIE Conference on Sensor Fusion: Architectures, Algorithms, and Applications , pp. 2-13
- Dasarathy, B.V.¹

7
- 0001903528
- Models for audiovisual fusion in a noisy-vowel recognition task
- P. Teissier, J.L. Schwartz, A. Guerin-Dugue, Models for audiovisual fusion in a noisy-vowel recognition task, in: Proceedings of the IEEE Signal Processing Society Workshop on Multimedia Signal Processing, 1997, pp. 37-44.
- (1997) Proceedings of the IEEE Signal Processing Society Workshop on Multimedia Signal Processing , pp. 37-44
- Teissier, P.¹ Schwartz, J.L.² Guerin-Dugue, A.³

8
- 0010630633
- Continuous audio-visual speech recognition
- J. Luettin, S. Dupont, Continuous audio-visual speech recognition, in: Proceedings of the 5th European Conference on Computer Vision, 1998.
- (1998) Proceedings of the 5th European Conference on Computer Vision
- Luettin, J.¹ Dupont, S.²

9
- 0033708494
- Audio-visual intent-to-speak detection for human-computer interaction
- P. deCuetos, C. Neti, A. Senior, Audio-visual intent-to-speak detection for human-computer interaction, in: Proceedings of the International Conference on Acoustics, Speech, and Signal Processing, 2000, pp. 2373-2376.
- (2000) Proceedings of the International Conference on Acoustics, Speech, and Signal Processing , pp. 2373-2376
- DeCuetos, P.¹ Neti, C.² Senior, A.³

10
- 0032180188
- Adaptive fusion of acoustic and visual sources for automatic speech recognition
- Rogozan A., Deleglise P. Adaptive fusion of acoustic and visual sources for automatic speech recognition. Speech Communication. 26(1-2):1998;149-161.
- (1998) Speech Communication , vol.26 , Issue.1-2 , pp. 149-161
- Rogozan, A.¹ Deleglise, P.²

11
- 0029747053
- Integrating audio and visual information to provide highly robust speech recognition
- M.J. Tomlinson, M.J. Russell, N.M. Brooke, Integrating audio and visual information to provide highly robust speech recognition, in: Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing, 1996, pp. 821-824.
- (1996) Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing , pp. 821-824
- Tomlinson, M.J.¹ Russell, M.J.² Brooke, N.M.³

12
- 0029234004
- Nonlinear manifold learning for visual speech recognition
- C. Bregler, S.M. Omohundro, Nonlinear manifold learning for visual speech recognition, in: Proceedings of the IEEE International Conference on Computer Vision, 1995, pp. 494-499.
- (1995) Proceedings of the IEEE International Conference on Computer Vision , pp. 494-499
- Bregler, C.¹ Omohundro, S.M.²

13
- 0025503485
- Neural network models of sensory integration for improved vowel recognition
- Yuhas B.P., Goldstein M.H., Sejnowski T.J., Jenkins R.E. Neural network models of sensory integration for improved vowel recognition. Proceedings of the IEEE. 78(10):1990;1658-1668.
- (1990) Proceedings of the IEEE , vol.78 , Issue.10 , pp. 1658-1668
- Yuhas, B.P.¹ Goldstein, M.H.² Sejnowski, T.J.³ Jenkins, R.E.⁴

14
- 0022228262
- Automatic lip-reading to enhance speech recognition
- E.D. Petajan, Automatic lip-reading to enhance speech recognition, in: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 1985, pp. 40-47.
- (1985) Proceedings of IEEE Conference on Computer Vision and Pattern Recognition , pp. 40-47
- Petajan, E.D.¹

15
- 0030247984
- Computer lip-reading for improved accuracy in automatic speech recognition
- Silsbee P.L., Bovik A.C. Computer lip-reading for improved accuracy in automatic speech recognition. IEEE Transactions on Speech and Audio Processing. 4(5):1996;337-351.
- (1996) IEEE Transactions on Speech and Audio Processing , vol.4 , Issue.5 , pp. 337-351
- Silsbee, P.L.¹ Bovik, A.C.²

16
- 34547525365
- Learning dynamic noise models from noisy speech for robust speech recognition
- B.J. Frey, T. Kristjansson, L. Deng, A. Acero, Learning dynamic noise models from noisy speech for robust speech recognition, in: Proceedings of Neural Information Processing Systems (NIPS), 2001.
- (2001) Proceedings of Neural Information Processing Systems (NIPS)
- Frey, B.J.¹ Kristjansson, T.² Deng, L.³ Acero, A.⁴

17
- 0042826822
- Independent component analysis: Algorithms and applications
- Hyvärinen A., Oja E. Independent component analysis: algorithms and applications. Neural Networks. 13(4-5):2000;411-430.
- (2000) Neural Networks , vol.13 , Issue.4-5 , pp. 411-430
- Hyvärinen, A.¹ Oja, E.²

18
- 0141479057
- Robust digit recognition using phase-dependent time-frequency masking
- G. Shi, P. Aarabi, Robust digit recognition using phase-dependent time-frequency masking, in: Proceedings of the IEEE Conference on Acoustics, Speech, and Signal Processing, 2003, pp. I:684-687.
- (2003) Proceedings of the IEEE Conference on Acoustics, Speech, and Signal Processing
- Shi, G.¹ Aarabi, P.²

19
- 0030701369
- A robust method for speech signal time-delay estimation in reverberant rooms
- M.S. Brandstein, H. Silverman, A robust method for speech signal time-delay estimation in reverberant rooms, in: Proceedings of the IEEE Conference on Acoustics, Speech, and Signal Processing, 1997, pp. I:375-378.
- (1997) Proceedings of the IEEE Conference on Acoustics, Speech, and Signal Processing
- Brandstein, M.S.¹ Silverman, H.²

20
- 0029411030
- An information maximization approach to blind separation and blind deconvolution
- Bell A., Sejnowski T. An information maximization approach to blind separation and blind deconvolution. Neural Computation. 7:1995;1129-1159.
- (1995) Neural Computation , vol.7 , pp. 1129-1159
- Bell, A.¹ Sejnowski, T.²

21
- 84908591137
- Robust variational speech separation using fewer microphones than speakers
- S. Rennie, P. Aarabi, T. Kristjansson, B. Frey, K. Achan, Robust variational speech separation using fewer microphones than speakers, in: Proceedings of the IEEE Conference on Acoustics, Speech, and Signal Processing, 2003, pp. I:88-91.
- (2003) Proceedings of the IEEE Conference on Acoustics, Speech, and Signal Processing
- Rennie, S.¹ Aarabi, P.² Kristjansson, T.³ Frey, B.⁴ Achan, K.⁵

22
- 0037445320
- The fusion of distributed microphone arrays for sound localization
- Aarabi P. The fusion of distributed microphone arrays for sound localization. EURASIP Journal of Applied Signal Processing Special Issue on Sensor Networks. 2003(4):2003;338-347.
- (2003) EURASIP Journal of Applied Signal Processing Special Issue on Sensor Networks , vol.2003 , Issue.4 , pp. 338-347
- Aarabi, P.¹

23
- 0003770986
- Comparing models for audiovisual fusion in a noisy-vowel recognition task
- Teissier P., Robert-Ribes J., Schwartz J.L., Guerin-Dugue A. Comparing models for audiovisual fusion in a noisy-vowel recognition task. IEEE Transactions on Speech and Audio Processing. 7(6):1999;629-642.
- (1999) IEEE Transactions on Speech and Audio Processing , vol.7 , Issue.6 , pp. 629-642
- Teissier, P.¹ Robert-Ribes, J.² Schwartz, J.L.³ Guerin-Dugue, A.⁴

24
- 0037469886
- Algorithms for acoustic localization based on microphone array in service robotics
- Mumolo E., Nolich M., Vercelli G. Algorithms for acoustic localization based on microphone array in service robotics. Robotics and Autonomous Systems. 42(2):2003;69-88.
- (2003) Robotics and Autonomous Systems , vol.42 , Issue.2 , pp. 69-88
- Mumolo, E.¹ Nolich, M.² Vercelli, G.³

25
- 4544290191
- Recent advances in the automatic recognition of audiovisual speech
- Potamianos G., Neti C., Gravier G., Garg A., Senior A.W. Recent advances in the automatic recognition of audiovisual speech. Proceedings of the IEEE. 91(9):2003;1306-1326.
- (2003) Proceedings of the IEEE , vol.91 , Issue.9 , pp. 1306-1326
- Potamianos, G.¹ Neti, C.² Gravier, G.³ Garg, A.⁴ Senior, A.W.⁵

26
- 0036295989
- Audio-visual speech modeling using coupled hidden Markov models
- S.M. Chu, T.S. Huang, Audio-visual speech modeling using coupled hidden Markov models, in: Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, 2002, pp. 2009-2012.
- (2002) Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing , pp. 2009-2012
- Chu, S.M.¹ Huang, T.S.²

27
- 0034842451
- Weighting schemes for audio-visual fusion in speech recognition
- H. Glotin, D. Vergyri, C. Neti, G. Potamianos, J. Luettin, Weighting schemes for audio-visual fusion in speech recognition, in: Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, 2001, pp. 173-176.
- (2001) Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing , pp. 173-176
- Glotin, H.¹ Vergyri, D.² Neti, C.³ Potamianos, G.⁴ Luettin, J.⁵

28
- 0034842342
- Asynchronous stream modeling for large vocabulary audio-visual speech recognition
- J. Luettin, G. Potamianos, C. Neti, Asynchronous stream modeling for large vocabulary audio-visual speech recognition, in: Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, 2001, pp. 169-172.
- (2001) Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing , pp. 169-172
- Luettin, J.¹ Potamianos, G.² Neti, C.³

29
- 0034848499
- Optimal weighting of posteriors for audio-visual speech recognition
- M. Heckmann, F. Berthommier, K. Kroschel, Optimal weighting of posteriors for audio-visual speech recognition, in: Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, 2001, pp. 161-164.
- (2001) Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing , pp. 161-164
- Heckmann, M.¹ Berthommier, F.² Kroschel, K.³

30
- 34147179125
- Combining acoustic and visual classifiers for the recognition of spoken sentences
- K. Yu, X. Jiang, H. Bunke, Combining acoustic and visual classifiers for the recognition of spoken sentences, in: Proceedings of the 15th International Conference on Pattern Recognition, 2000, pp. 491-494.
- (2000) Proceedings of the 15th International Conference on Pattern Recognition , pp. 491-494
- Yu, K.¹ Jiang, X.² Bunke, H.³

31
- 0033677156
- Joint audio-video object localization using a recursive multi-state multi-sensor estimator
- N. Strobel, S. Spors, R. Rabenstein, Joint audio-video object localization using a recursive multi-state multi-sensor estimator, in: Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, 2000, pp. 2397-2400.
- (2000) Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing , pp. 2397-2400
- Strobel, N.¹ Spors, S.² Rabenstein, R.³

32
- 0034270644
- Audio-visual speech modeling for continuous speech recognition
- Dupont S., Luettin J. Audio-visual speech modeling for continuous speech recognition. IEEE Transactions on Multimedia. 2(3):2000;141-151.
- (2000) IEEE Transactions on Multimedia , vol.2 , Issue.3 , pp. 141-151
- Dupont, S.¹ Luettin, J.²

33
- 0036154801
- Speakers' direction finding using estimated time delays in the frequency domain
- Berdugo B., Rosenhouse J., Azhari H. Speakers' direction finding using estimated time delays in the frequency domain. Signal Processing. 82(1):2002;19-30.
- (2002) Signal Processing , vol.82 , Issue.1 , pp. 19-30
- Berdugo, B.¹ Rosenhouse, J.² Azhari, H.³

34
- 0038381727
- Audio-visual speech recognition based on optimized product HMMs and GMM based-MCE-GPD stream weight estimation
- Kumatani K., Nakamura S. Audio-visual speech recognition based on optimized product HMMs and GMM based-MCE-GPD stream weight estimation. IEICE Transactions on Information and Systems. E86-D(3):2003;454-463.
- (2003) IEICE Transactions on Information and Systems , vol.E86-D , Issue.3 , pp. 454-463
- Kumatani, K.¹ Nakamura, S.²

35
- 0036058193
- Real-time speaker localization and speech separation by audio-visual integration
- K. Nakadai, K. Hidai, H.G. Okuno, H. Kitano, Real-time speaker localization and speech separation by audio-visual integration, in: Proceedings of the 2002 IEEE International Conference on Robotics and Automation, 2002, pp. 1043-1049.
- (2002) Proceedings of the 2002 IEEE International Conference on Robotics and Automation , pp. 1043-1049
- Nakadai, K.¹ Hidai, K.² Okuno, H.G.³ Kitano, H.⁴

36
- 0035791289
- Using likelihood L-statistics to measure confidence in audio-visual speech recognition
- A. Ghosh, A. Verma, A. Sarkar, Using likelihood L-statistics to measure confidence in audio-visual speech recognition, in: Proceedings of the 2001 IEEE Fourth Workshop on Multimedia Signal Processing, 2001, pp. 27-32.
- (2001) Proceedings of the 2001 IEEE Fourth Workshop on Multimedia Signal Processing , pp. 27-32
- Ghosh, A.¹ Verma, A.² Sarkar, A.³

37
- 0034853041
- Hierarchical discriminant features for audio-visual LVCSR
- G. Potamianos, J. Luettin, C. Neti, Hierarchical discriminant features for audio-visual LVCSR, in: Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, 2001, pp. 165-168.
- (2001) Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing , pp. 165-168
- Potamianos, G.¹ Luettin, J.² Neti, C.³

38
- 1842830692
- Improved speech recognition using adaptive audio-visual fusion via a stochastic secondary classifier
- S. Lucey, S. Sridharan, V. Chandran, Improved speech recognition using adaptive audio-visual fusion via a stochastic secondary classifier, in: Proceedings of 2001 International Symposium on Intelligent Multimedia, Video and Speech Processing, 2001, pp. 551-554.
- (2001) Proceedings of 2001 International Symposium on Intelligent Multimedia, Video and Speech Processing , pp. 551-554
- Lucey, S.¹ Sridharan, S.² Chandran, V.³

39
- 85049351181
- Audiovisual speech enhancement: New advances using multi-layer perceptrons
- L. Girin, L. Varin, G. Feng, J.L. Schwartz, Audiovisual speech enhancement: new advances using multi-layer perceptrons, in: Proceedings of the IEEE Second Workshop on Multimedia Signal Processing, 1998, pp. 77-82.
- (1998) Proceedings of the IEEE Second Workshop on Multimedia Signal Processing , pp. 77-82
- Girin, L.¹ Varin, L.² Feng, G.³ Schwartz, J.L.⁴

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.