SCOPUS Information Retrieval Platform

Computer Speech and Language

Volume 16, Issue 2, 2002, Pages 205-223

Hidden Markov model training with contaminated speech material for distant-talking speech recognition

(4) Matassoni, Marco (a); Omologo, Maurizio (a); Giuliani, Diego (a); Svaizer, Piergiorgio (a)

(a) FONDAZIONE BRUNO KESSLER (Italy)

Author keywords

[No Author keywords available]

Indexed keywords

ACOUSTIC NOISE; ACOUSTIC NOISE MEASUREMENT; ADAPTIVE ALGORITHMS; MARKOV PROCESSES; SPEECH INTELLIGIBILITY; SPEECH SYNTHESIS;

CONTAMINATED SPEECH MATERIAL; DISTANT TALKING SPEECH RECOGNITION; HIDDEN MARKOV MODEL; INCREMENTAL ADAPTATION; SPEECH ACTIVITY DETECTION;

SPEECH RECOGNITION;

EID: 0036556170 PISSN: 08852308 EISSN: None Source Type: Journal
DOI: 10.1006/csla.2002.0191 Document Type: Article

Times cited: (41)

References (54)

1
- 0004319970
- Kluwer Academic Publishers, Boston
- Acero, A. (1992). Acoustical and Environmental Robustness in Automatic Speech Recognition. Kluwer Academic Publishers, Boston.
- (1992) Acoustical and Environmental Robustness in Automatic Speech Recognition
- Acero, A.¹

2
- 85009113852
- HMM adaptation using vector Taylor series for noisy speech recognition
- Acero, A., Deng, L., Kristjansson, T. & Zhang, J. (2000). HMM adaptation using vector Taylor series for noisy speech recognition. Proceedings of International Conference on Spoken Language Processing, volume 3, pp. 869-872.
- (2000) Proceedings of International Conference on Spoken Language Processing , vol.3 , pp. 869-872
- Acero, A.¹ Deng, L.² Kristjansson, T.³ Zhang, J.⁴

3
- 0001370735
- Speaker independent continuous speech recognition using an acoustic-phonetic Italian corpus
- Angelini, B., Brugnara, F., Falavigna, D., Giuliani, D., Gretter, R. & Omologo, M. (1994). Speaker independent continuous speech recognition using an acoustic-phonetic Italian corpus. Proceedings of International Conference on Spoken Language Processing, volume 3, pp. 1391-1394.
- (1994) Proceedings of International Conference on Spoken Language Processing , vol.3 , pp. 1391-1394
- Angelini, B.¹ Brugnara, F.² Falavigna, D.³ Giuliani, D.⁴ Gretter, R.⁵ Omologo, M.⁶

4
- 0019565863
- Computer-generated pulse signal applied for sound measurement
- Aoshima, M. (1981). Computer-generated pulse signal applied for sound measurement. Journal of the Acoustical Society of America, 69, 1484-1488.
- (1981) Journal of the Acoustical Society of America , vol.69 , pp. 1484-1488
- Aoshima, M.¹

5
- 0016067897
- Effectiveness of linear prediction characteristics of the speech wave for automatic speaker identification and verification
- Atal, B. S. (1974). Effectiveness of linear prediction characteristics of the speech wave for automatic speaker identification and verification. Journal of the Acoustical Society of America, 55, 1304-1312.
- (1974) Journal of the Acoustical Society of America , vol.55 , pp. 1304-1312
- Atal, B.S.¹

6
- 0003152968
- Speech enhancement in the 1980s: Noise suppression with pattern matching
- (S. Furui and M. M. Sondhi, eds), chapter 10
- Boll, S. (1992). Speech enhancement in the 1980s: noise suppression with pattern matching. In Advances in Speech Signal Processing, (S. Furui and M. M. Sondhi, eds), chapter 10.
- (1992) Advances in Speech Signal Processing
- Boll, S.¹

7
- 0003980102
- Brandstein, M. & Ward, D. (eds); Springer
- Brandstein, M. & Ward, D. (eds) (2001). Microphone Arrays-Signal Processing Techniques and Applications. Springer.
- (2001) Microphone Arrays-Signal Processing Techniques and Applications

8
- 0005809619
- Microphone arrays and neural networks for robust speech recognition
- Che, C., Lin, Q., Pearson, J., de Vries, B. & Flanagan, J. (1994). Microphone arrays and neural networks for robust speech recognition. Proceedings of the Spoken Language Technology Workshop, pp. 321-326.
- (1994) Proceedings of the Spoken Language Technology Workshop , pp. 321-326
- Che, C.¹ Lin, Q.² Pearson, J.³ De Vries, B.⁴ Flanagan, J.⁵

9
- 0027229711
- Influence of background noise and microphone on the performance of the IBM Tangora speech recognition system
- Das, S., Bakis, R., Nadas, A., Nahamoo, D. & Picheny, M. (1993). Influence of background noise and microphone on the performance of the IBM Tangora speech recognition system. Proceedings of the International Conference on Acoustics, Speech and Signal Processing, volume 2, pp. 71-74.
- (1993) Proceedings of the International Conference on Acoustics, Speech and Signal Processing , vol.2 , pp. 71-74
- Das, S.¹ Bakis, R.² Nadas, A.³ Nahamoo, D.⁴ Picheny, M.⁵

10
- 0020795461
- On the effects of varying filter bank parameters on isolated word recognition
- Dautrich, B. A., Rabiner, L. R. & Martin, T. B. (1983). On the effects of varying filter bank parameters on isolated word recognition. IEEE Transactions on Acoustics, Speech and Signal Processing, 31, 793-897.
- (1983) IEEE Transactions on Acoustics, Speech and Signal Processing , vol.31 , pp. 793-897
- Dautrich, B.A.¹ Rabiner, L.R.² Martin, T.B.³

11
- 0003969334
- Academic Press, London
- De Mori, R. (1998). Spoken Dialogues with Computers. Academic Press, London.
- (1998) Spoken Dialogues with Computers
- De Mori, R.¹

12
- 0026881830
- Gain-adapted hidden Markov models for recognition of clean and noisy speech
- Ephraim, Y. (1992). Gain-adapted hidden Markov models for recognition of clean and noisy speech. IEEE Transactions on Signal Processing, 40, 1303-1316.
- (1992) IEEE Transactions on Signal Processing , vol.40 , pp. 1303-1316
- Ephraim, Y.¹

13
- 0030395054
- Beamforming microphone arrays for speech acquisition in noisy environments
- Fischer, S. & Simmer, K. U. (1996). Beamforming microphone arrays for speech acquisition in noisy environments. Speech Communication, Special Issue on Acoustic Echo and Noise Control.
- (1996) Speech Communication, Special Issue on Acoustic Echo and Noise Control
- Fischer, S.¹ Simmer, K.U.²

14
- 0003027176
- Autodirective microphone systems
- Flanagan, J. L., Berkley, D. A., Elko, G. W., West, J. E. & Sondhi, M. M. (1991). Autodirective microphone systems. Acustica, 75, 58-71.
- (1991) Acustica , vol.75 , pp. 58-71
- Flanagan, J.L.¹ Berkley, D.A.² Elko, G.W.³ West, J.E.⁴ Sondhi, M.M.⁵

15
- 0012265819
- Robust speech recognition under adverse conditions
- Furui, S. (1992). Robust speech recognition under adverse conditions. Proceedings of the ESCA Workshop on Speech Processing in Adverse Conditions, pp. 31-42.
- (1992) Proceedings of the ESCA Workshop on Speech Processing in Adverse Conditions , pp. 31-42
- Furui, S.¹

16
- 0002960982
- Recent advances in robust speech recognition
- Furui, S. (1997). Recent advances in robust speech recognition. Proceedings of the ESCA-NATO Workshop on Robust Speech Recognition for Unknown Communication Channels, pp. 11-20.
- (1997) Proceedings of the ESCA-NATO Workshop on Robust Speech Recognition for Unknown Communication Channels , pp. 11-20
- Furui, S.¹

17
- 0003671941
- Model-based techniques for noise robust speech recognition
- PhD Thesis, Cambridge University
- Gales, M. J. F. (1995). Model-based techniques for noise robust speech recognition. PhD Thesis, Cambridge University.
- (1995)
- Gales, M.J.F.¹

18
- 0032050110
- Maximum likelihood linear transformations for HMM-based speech recognition
- Gales, M. J. F. (1998). Maximum likelihood linear transformations for HMM-based speech recognition. Computer Speech and Language, 12, 75-98.
- (1998) Computer Speech and Language , vol.12 , pp. 75-98
- Gales, M.J.F.¹

19
- 0030263447
- Mean and variance adaptation within the MLLR framework
- Gales, M. J. F. & Woodland, P. C. (1996). Mean and variance adaptation within the MLLR framework. Computer Speech and Language, 10, 249-264.
- (1996) Computer Speech and Language , vol.10 , pp. 249-264
- Gales, M.J.F.¹ Woodland, P.C.²

20
- 0030245128
- Robust speech recognition using parallel model combination
- Gales, M. J. F. & Young, S. J. (1996). Robust speech recognition using parallel model combination. IEEE Transactions on Speech and Audio Processing, 4, 352-359.
- (1996) IEEE Transactions on Speech and Audio Processing , vol.4 , pp. 352-359
- Gales, M.J.F.¹ Young, S.J.²

21
- 0003026375
- The LIMSI 1995 Hub3 system
- Gauvain, J. L., Lamel, L., Adda, G. & Matrouf, D. (1996). The LIMSI 1995 Hub3 system. In DARPA Speech Recognition Workshop, Arden House. pp. 105-111.
- (1996) DARPA Speech Recognition Workshop, Arden House , pp. 105-111
- Gauvain, J.L.¹ Lamel, L.² Adda, G.³ Matrouf, D.⁴

22
- 0028419019
- Maximum a posteriori estimation for multivariate Gaussian mixture observations of Markov chains
- Gauvain, J. L. & Lee, C. H. (1994). Maximum a posteriori estimation for multivariate Gaussian mixture observations of Markov chains. IEEE Transactions on Speech and Audio Processing, 2, 291-298.
- (1994) IEEE Transactions on Speech and Audio Processing , vol.2 , pp. 291-298
- Gauvain, J.L.¹ Lee, C.H.²

23
- 84908169893
- Use of different microphone array configurations for hands-free speech recognition in noisy and reverberant environment
- Giuliani, D., Matassoni, M., Omologo, M. & Svaizer, P. (1997). Use of different microphone array configurations for hands-free speech recognition in noisy and reverberant environment. Proceedings of the EUROSPEECH, volume 1, pp. 347-350.
- (1997) Proceedings of the EUROSPEECH , vol.1 , pp. 347-350
- Giuliani, D.¹ Matassoni, M.² Omologo, M.³ Svaizer, P.⁴

24
- 0032667502
- Training of HMM with filtered speech material for hands-free speech recognition
- Giuliani, D., Matassoni, M., Omologo, M. & Svaizer, P. (1999a). Training of HMM with filtered speech material for hands-free speech recognition. Proceedings of the International Conference on Acoustics, Speech and Signal Processing, volume 1, pp. 449-452.
- (1999) Proceedings of the International Conference on Acoustics, Speech and Signal Processing , vol.1 , pp. 449-452
- Giuliani, D.¹ Matassoni, M.² Omologo, M.³ Svaizer, P.⁴

25
- 0012330735
- Use of filtered clean speech for robust HMM training
- Giuliani, D., Matassoni, M., Omologo, M. & Svaizer, P. (1999b). Use of filtered clean speech for robust HMM training. Proceedings of the Workshop on Robust Methods for Speech Recognition in Adverse Conditions, pp. 99-102.
- (1999) Proceedings of the Workshop on Robust Methods for Speech Recognition in Adverse Conditions , pp. 99-102
- Giuliani, D.¹ Matassoni, M.² Omologo, M.³ Svaizer, P.⁴

26
- 0029288202
- Speech recognition in noisy environments: A survey
- Gong, Y. (1995). Speech recognition in noisy environments: a survey. Speech Communication, 16, 261-291.
- (1995) Speech Communication , vol.16 , pp. 261-291
- Gong, Y.¹

27
- 0003668769
- Haykin, S. (ed.); Prentice Hall, Englewood Cliffs, NJ
- Haykin, S. (ed.) (1995). Advances in Spectrum Analysis and Array Processing. Prentice Hall, Englewood Cliffs, NJ.
- (1995) Advances in Spectrum Analysis and Array Processing

28
- 0000413636
- Robust time-domain processing of broad-band microphone array data
- Hoffman, M. W. & Buckley, K. M. (1995). Robust time-domain processing of broad-band microphone array data. IEEE Transactions on Speech and Audio Processing, 3, 193-203.
- (1995) IEEE Transactions on Speech and Audio Processing , vol.3 , pp. 193-203
- Hoffman, M.W.¹ Buckley, K.M.²

29
- 0027465491
- The Lombard reflex and its role on human listeners and automatic speech recognizers
- Junqua, J. C. (1993). The Lombard reflex and its role on human listeners and automatic speech recognizers. Journal of the Acoustical Society of America, 1, 510-524.
- (1993) Journal of the Acoustical Society of America , vol.1 , pp. 510-524
- Junqua, J.C.¹

30
- 0003770709
- Kluwer Academic Publishers, Boston
- Junqua, J. C. & Haton, J. P. (1996). Robustness in Automatic Speech Recognition. Kluwer Academic Publishers, Boston.
- (1996) Robustness in Automatic Speech Recognition
- Junqua, J.C.¹ Haton, J.P.²

31
- 0022929808
- Adaptive microphone-array system for noise reduction
- Kaneda, Y. & Ohga, J. (1986). Adaptive microphone-array system for noise reduction. IEEE Transactions on Acoustics, Speech and Signal Processing, 34, 1391-1400.
- (1986) IEEE Transactions on Acoustics, Speech and Signal Processing , vol.34 , pp. 1391-1400
- Kaneda, Y.¹ Ohga, J.²

32
- 0032651723
- Integrated bias removal techniques for robust speech recognition
- Lawrence, C. & Rahim, M. (1999). Integrated bias removal techniques for robust speech recognition. Computer Speech and Language, 13, 283-298.
- (1999) Computer Speech and Language , vol.13 , pp. 283-298
- Lawrence, C.¹ Rahim, M.²

33
- 0032140546
- On stochastic feature and model compensation approaches to robust speech recognition
- Lee, C. H. (1998). On stochastic feature and model compensation approaches to robust speech recognition. Speech Communication, 25, 29-47.
- (1998) Speech Communication , vol.25 , pp. 29-47
- Lee, C.H.¹

34
- 0003770711
- Kluwer Academic Publishers, Boston
- Lee, C. H., Soong, F. K. & Paliwal, K. K. (1996). Automatic Speech and Speaker Recognition. Kluwer Academic Publishers, Boston.
- (1996) Automatic Speech and Speaker Recognition
- Lee, C.H.¹ Soong, F.K.² Paliwal, K.K.³

35
- 85135194048
- Flexible speaker adaptation for large vocabulary speech recognition
- Leggetter, C. J. & Woodland, P. C. (1995a). Flexible speaker adaptation for large vocabulary speech recognition. Proceedings of EUROSPEECH, volume 2, pp. 1155-1158.
- (1995) Proceedings of EUROSPEECH , vol.2 , pp. 1155-1158
- Leggetter, C.J.¹ Woodland, P.C.²

36
- 0029288633
- Maximum likelihood linear regression for speaker adaptation of continuous density hidden Markov models
- Leggetter, C. J. & Woodland, P. C. (1995b). Maximum likelihood linear regression for speaker adaptation of continuous density hidden Markov models. Computer Speech and Language, 9, 171-185.
- (1995) Computer Speech and Language , vol.9 , pp. 171-185
- Leggetter, C.J.¹ Woodland, P.C.²

37
- 0003491370
- Prentice-Hall
- Lim, J. S. (1983). Speech Enhancement. Prentice-Hall.
- (1983) Speech Enhancement
- Lim, J.S.¹

38
- 0024753593
- Speech recognition using noise-adaptive prototypes
- Nadas, A., Nahamoo, D. & Picheny, M. (1989). Speech recognition using noise-adaptive prototypes. IEEE Transactions on Acoustics, Speech and Signal Processing, 37, 1495-1503.
- (1989) IEEE Transactions on Acoustics, Speech and Signal Processing , vol.37 , pp. 1495-1503
- Nadas, A.¹ Nahamoo, D.² Picheny, M.³

39
- 0012266729
- Microphone-independent robust signal processing using probabilistic optimum filtering
- Neumeyer, L. & Weintraub, M. (1994). Microphone-independent robust signal processing using probabilistic optimum filtering. Proceedings of the Spoken Language Technology Workshop, pp. 315-320.
- (1994) Proceedings of the Spoken Language Technology Workshop , pp. 315-320
- Neumeyer, L.¹ Weintraub, M.²

40
- 0031144484
- Use of the cross-power-spectrum phase in acoustic event location
- Omologo, M. & Svaizer, P. (1997). Use of the cross-power-spectrum phase in acoustic event location. IEEE Transactions on Speech and Audio Processing, 5, 288-292.
- (1997) IEEE Transactions on Speech and Audio Processing , vol.5 , pp. 288-292
- Omologo, M.¹ Svaizer, P.²

41
- 0032142014
- Environmental conditions and acoustic transduction in hands-free speech recognition
- Omologo, M., Svaizer, P. & Matassoni, M. (1998). Environmental conditions and acoustic transduction in hands-free speech recognition. Speech Communication, 25, 75-95.
- (1998) Speech Communication , vol.25 , pp. 75-95
- Omologo, M.¹ Svaizer, P.² Matassoni, M.³

42
- 85079234583
- On the limitations of cepstral features in noise
- Openshaw, J. P. & Mason, J. S. (April 1994). On the limitations of cepstral features in noise. Proceedings of the International Conference on Acoustics, Speech and Signal Processing, volume 1, pp. 49-52.
- (1994) Proceedings of the International Conference on Acoustics, Speech and Signal Processing , vol.1 , pp. 49-52
- Openshaw, J.P.¹ Mason, J.S.²

43
- 0004034057
- IEEE Press, Piscataway, NJ
- O'Shaughnessy, D. (2000). Speech Communications. IEEE Press, Piscataway, NJ.
- (2000) Speech Communications
- O'Shaughnessy, D.¹

44
- 0024610919
- A tutorial on hidden Markov models and selected applications in speech recognition
- Rabiner, L. R. (1989). A tutorial on hidden Markov models and selected applications in speech recognition. Proceedings of IEEE, 77, 257-286.
- (1989) Proceedings of IEEE , vol.77 , pp. 257-286
- Rabiner, L.R.¹

45
- 0004244302
- Prentice-Hall
- Rabiner, L. R. & Juang, B. H. (1993). Fundamentals of Speech Recognition. Prentice-Hall.
- (1993) Fundamentals of Speech Recognition
- Rabiner, L.R.¹ Juang, B.H.²

46
- 0029769867
- Signal bias removal by maximum likelihood estimation for robust telephone speech recognition
- Rahim, M. & Juang, B. H. (1996). Signal bias removal by maximum likelihood estimation for robust telephone speech recognition. IEEE Transactions on Speech and Audio Processing, 4, 19-30.
- (1996) IEEE Transactions on Speech and Audio Processing , vol.4 , pp. 19-30
- Rahim, M.¹ Juang, B.H.²

47
- 0028420014
- Integrated models of signals and background noise with application to speaker identification in noise
- Rose, R. C., Hofstetter, E. M. & Reynolds, D. A. (1994). Integrated models of signals and background noise with application to speaker identification in noise. IEEE Transactions on Speech and Audio Processing, 2, 245-257.
- (1994) IEEE Transactions on Speech and Audio Processing , vol.2 , pp. 245-257
- Rose, R.C.¹ Hofstetter, E.M.² Reynolds, D.A.³

48
- 0034274946
- Noise-compensated hidden Markov models
- Sanches, I. (2000). Noise-compensated hidden Markov models. IEEE Transactions on Speech and Audio Processing, 8, 533-540.
- (2000) IEEE Transactions on Speech and Audio Processing , vol.8 , pp. 533-540
- Sanches, I.¹

49
- 0030149866
- A maximum-likelihood approach to stochastic matching for robust speech recognition
- Sankar, A. & Lee, C. H. (1996). A maximum-likelihood approach to stochastic matching for robust speech recognition. IEEE Transactions on Speech and Audio Processing, 4, 190-202.
- (1996) IEEE Transactions on Speech and Audio Processing , vol.4 , pp. 190-202
- Sankar, A.¹ Lee, C.H.²

50
- 0028812427
- An optimum computer-generated pulse signal suitable for the measurement of very long impulse responses
- Suzuki, Y., Asano, F., Kim, H. Y. & Sone, T. (1995). An optimum computer-generated pulse signal suitable for the measurement of very long impulse responses. Journal of the Acoustical Society of America, 97, 1119-1123.
- (1995) Journal of the Acoustical Society of America , vol.97 , pp. 1119-1123
- Suzuki, Y.¹ Asano, F.² Kim, H.Y.³ Sone, T.⁴

51
- 0003747957
- Wiley and Teubner
- Vaseghi, S. V. (1996). Advanced Signal Processing and Digital Noise Reduction. Wiley and Teubner.
- (1996) Advanced Signal Processing and Digital Noise Reduction
- Vaseghi, S.V.¹

52
- 0002452931
- The HTK large vocabulary recognition system for the 1995 ARPA H3 task
- Woodland, P. C., Gales, M. J. F., Pye, D. & Valtchev, V. (February 1996). The HTK large vocabulary recognition system for the 1995 ARPA H3 task. In DARPA Speech Recognition Workshop, Arden House. pp. 99-104.
- (1996) DARPA Speech Recognition Workshop, Arden House , pp. 99-104
- Woodland, P.C.¹ Gales, M.J.F.² Pye, D.³ Valtchev, V.⁴

53
- 0023773764
- A microphone array with adaptive post-filtering for noise reduction in reverberant rooms
- Zelinski, R. (1988). A microphone array with adaptive post-filtering for noise reduction in reverberant rooms. Proceedings of the International Conference on Acoustics, Speech and Signal Processing, volume 5, pp. 2578-2581.
- (1988) Proceedings of the International Conference on Acoustics, Speech and Signal Processing , vol.5 , pp. 2578-2581
- Zelinski, R.¹

54
- 0001459635
- Frequency-domain maximum likelihood estimation for automatic speech recognition in additive and convolutive noises
- Zhao, Y. (2000). Frequency-domain maximum likelihood estimation for automatic speech recognition in additive and convolutive noises. IEEE Transactions on Speech and Audio Processing, 8, 255-266.
- (2000) IEEE Transactions on Speech and Audio Processing , vol.8 , pp. 255-266
- Zhao, Y.¹

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.