SCOPUS Information Search Platform

Speech Communication

Volume 43, Issue 1-2, 2004, Pages 123-142

Techniques for handling convolutional distortion with 'missing data' automatic speech recognition

(3) Palomäki, Kalle J. a,b,c; Brown, Guy J. b; Barker, Jon P. b

a AALTO UNIVERSITY (Finland)

b UNIVERSITY OF SHEFFIELD (United Kingdom)

c UNIVERSITY OF HELSINKI (Finland)

Author keywords

Missing data; Reverberation; Spectral distortion; Spectral normalisation; Speech recognition

Indexed keywords

AUTOMATION; CONVOLUTION; DATA REDUCTION; MARKOV PROCESSES; MATHEMATICAL MODELS; REVERBERATION; SPEECH ANALYSIS; SPEECH COMMUNICATION; MISSING DATA; SPECTRAL DISTORTIONS; SPECTRAL NORMALIZATION; SPEECH SPECTRUM; SPEECH RECOGNITION

EID: 2942539074 PISSN: 01676393 EISSN: None Source Type: Journal
DOI: 10.1016/j.specom.2004.02.005 Document Type: Article

Times cited : (59)

References (59)

1
- 0141702085
- Environmental sniffing: Noise knowledge estimation for robust speech systems
- Akbacak M., Hansen J.H.L. Environmental sniffing: noise knowledge estimation for robust speech systems. Proc. ICASSP-2003. II:2003;113-116.
- (2003) Proc. ICASSP-2003 II , pp. 113-116
- Akbacak, M.¹ Hansen, J.H.L.²

2
- 0003773484
- PhD thesis, Oregon graduate institute
- Avendano, C., 1997. Temporal processing of speech in a time-feature space, PhD thesis, Oregon graduate institute.
- (1997) Temporal Processing of Speech in a Time-feature Space
- Avendano, C.¹

3
- 2142812604
- The perception of speech under adverse acoustic conditions
- S. Greenberg, & W. Ainsworth. Springer-Verlag (Springer Handbook of Auditory Research)
- Assmann P., Summerfield Q. The perception of speech under adverse acoustic conditions. Greenberg S., Ainsworth W. Speech Processing in the Auditory System (Springer Handbook of Auditory Research, Vol. 18). 2003;Springer-Verlag.
- (2003) Speech Processing in the Auditory System , vol.18
- Assmann, P.¹ Summerfield, Q.²

4
- 0016067897
- Effectiveness of linear prediction characteristics of the speech wave for automatic speaker identification and verification
- Atal B.S. Effectiveness of linear prediction characteristics of the speech wave for automatic speaker identification and verification. J. Acoust. Soc. Am. 55:1974;1304-1312.
- (1974) J. Acoust. Soc. Am. , vol.55 , pp. 1304-1312
- Atal, B.S.¹

5
- 85009096997
- Decoding speech in the presence of other sound sources
- Barker J., Cooke M.P., Ellis D.P.W. Decoding speech in the presence of other sound sources. Proc. ICSLP-2000. IV:2000;270-273.
- (2000) Proc. ICSLP-2000 IV , pp. 270-273
- Barker, J.¹ Cooke, M.P.² Ellis, D.P.W.³

6
- 85009063707
- Soft decisions in missing data techniques for robust automatic speech recognition
- Barker J., Josifovski L., Cooke M.P., Green P.D. Soft decisions in missing data techniques for robust automatic speech recognition. Proc. ICSLP-2000. I:2000;373-376.
- (2000) Proc. ICSLP-2000 I , pp. 373-376
- Barker, J.¹ Josifovski, L.² Cooke, M.P.³ Green, P.D.⁴

7
- 85009106519
- Robust ASR based on clean speech models: An evaluation of missing data techniques for connected digit recognition in noise
- Barker J., Cooke M.P., Green P.D. Robust ASR based on clean speech models: An evaluation of missing data techniques for connected digit recognition in noise. Proc. Eurospeech-2001. 2001;213-217.
- (2001) Proc. Eurospeech-2001 , pp. 213-217
- Barker, J.¹ Cooke, M.P.² Green, P.D.³

8
- 2942624890
- Linking auditory scene analysis and robust ASR by missing data techniques
- Stratford-upon-Avon, UK, 2nd-3rd April
- Barker, J., Green, P.D., Cooke, M.P., 2001b. Linking auditory scene analysis and robust ASR by missing data techniques, In: Proceedings of the Workshop on Innovations in Speech Processing (WISP-2001), Stratford-upon-Avon, UK, 2nd-3rd April.
- (2001) Proceedings of the Workshop on Innovations in Speech Processing (WISP-2001)
- Barker, J.¹ Green, P.D.² Cooke, M.P.³

9
- 0022479342
- Predictors of speech intelligibility in rooms
- Bradley J.S. Predictors of speech intelligibility in rooms. J. Acoust. Soc. Am. 80:1986;837-845.
- (1986) J. Acoust. Soc. Am. , vol.80 , pp. 837-845
- Bradley, J.S.¹

10
- 0003684441
- Cambridge, MA: MIT Press
- Bregman A.S. Auditory Scene Analysis. 1990;MIT Press, Cambridge, MA.
- (1990) Auditory Scene Analysis
- Bregman, A.S.¹

11
- 0028531926
- Computational auditory scene analysis
- Brown G.J., Cooke M.P. Computational auditory scene analysis. Comp. Speech Lang. 8:1994;297-336.
- (1994) Comp. Speech Lang. , vol.8 , pp. 297-336
- Brown, G.J.¹ Cooke, M.P.²

12
- 0034850070
- A neural oscillator sound separator for missing data speech recognition
- Brown G.J., Barker J., Wang D.L. A neural oscillator sound separator for missing data speech recognition. Proc. IJCNN-2001. 2001;2907-2912.
- (2001) Proc. IJCNN-2001 , pp. 2907-2912
- Brown, G.J.¹ Barker, J.² Wang, D.L.³

13
- 85135196323
- New telephone speech corpora at CSLU
- Cole R.A., Noel M., Lander T., Durham T. New telephone speech corpora at CSLU. Proc. Eurospeech-1995. I:1995;821-824.
- (1995) Proc. Eurospeech-1995 I , pp. 821-824
- Cole, R.A.¹ Noel, M.² Lander, T.³ Durham, T.⁴

14
- 0003479143
- Cambridge, UK: Cambridge University Press
- Cooke M.P. Modelling Auditory Processing and Organization. 1993;Cambridge University Press, Cambridge, UK.
- (1993) Modelling Auditory Processing and Organization
- Cooke, M.P.¹

15
- 0035342414
- Robust automatic speech recognition with missing and unreliable acoustic data
- Cooke M.P., Green P.D., Josifovski L., Vizinho A. Robust automatic speech recognition with missing and unreliable acoustic data. Speech Comm. 34:2001;267-285.
- (2001) Speech Comm. , vol.34 , pp. 267-285
- Cooke, M.P.¹ Green, P.D.² Josifovski, L.³ Vizinho, A.⁴

16
- 0019053271
- Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences
- Davis S.P., Mermelstein P. Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences. IEEE Trans. Acoust. Speech Signal Process. ASSP-28:1980;357-366.
- (1980) IEEE Trans. Acoust. Speech Signal Process. , vol.ASSP-28 , pp. 357-366
- Davis, S.P.¹ Mermelstein, P.²

17
- 0036291376
- Uncertainty decoding with SPLICE for noise robust speech recognition
- Droppo J., Acero A., Deng L. Uncertainty decoding with SPLICE for noise robust speech recognition. Proc. ICASSP-2002. I:2002;57-60.
- (2002) Proc. ICASSP-2002 I , pp. 57-60
- Droppo, J.¹ Acero, A.² Deng, L.³

18
- 0027957839
- Effects of temporal envelope smearing on speech reception
- Drullman R., Festen J.M., Plomp R. Effects of temporal envelope smearing on speech reception. J. Acoust. Soc. Amer. 95:1994;1053-1064.
- (1994) J. Acoust. Soc. Amer. , vol.95 , pp. 1053-1064
- Drullman, R.¹ Festen, J.M.² Plomp, R.³

19
- 0002768123
- Assessing local noise level estimation methods
- Dupont, S., Ris C., 1999. Assessing local noise level estimation methods. In: Proc. of Workshop on Robust Methods for Speech Recognition in Adverse Environments, pp. 115-118.
- (1999) Proc. of Workshop on Robust Methods for Speech Recognition in Adverse Environments , pp. 115-118
- Dupont, S.¹ Ris, C.²

20
- 0141520490
- Audio context awareness - Acoustic modeling and perceptual evaluation
- Eronen A., Tuomi J., Klapuri A., Fagerlund S., Sorsa T., Lorho G., Huopaniemi J. Audio context awareness - acoustic modeling and perceptual evaluation. Proc. ICASSP-2003. V:2003;529-532.
- (2003) Proc. ICASSP-2003 V , pp. 529-532
- Eronen, A.¹ Tuomi, J.² Klapuri, A.³ Fagerlund, S.⁴ Sorsa, T.⁵ Lorho, G.⁶ Huopaniemi, J.⁷

21
- 0003901864
- New York: John Wiley and Sons, Inc.
- Gold B., Morgan N. Speech and Audio Signal Processing. 2000;John Wiley and Sons, Inc. New York.
- (2000) Speech and Audio Signal Processing
- Gold, B.¹ Morgan, N.²

22
- 24444458549
- Importance of early and late reflections for automatic speech recognition in reverberant environments
- Gölzer H., Kleinschmidt M. Importance of early and late reflections for automatic speech recognition in reverberant environments. Proc. Elektronische Sprachsignalverarbeitung (ESSV). 2003.
- (2003) Proc. Elektronische Sprachsignalverarbeitung (ESSV)
- Gölzer, H.¹ Kleinschmidt, M.²

23
- 0025041264
- Perceptual linear predictive (PLP) analysis of speech
- Hermansky H. Perceptual linear predictive (PLP) analysis of speech. J. Acoust. Soc. Am. 87:1990;1738-1752.
- (1990) J. Acoust. Soc. Am. , vol.87 , pp. 1738-1752
- Hermansky, H.¹

24
- 0032139768
- Should recognisers have ears?
- Hermansky H. Should recognisers have ears? Speech Comm. 25:1998;3-27.
- (1998) Speech Comm. , vol.25 , pp. 3-27
- Hermansky, H.¹

25
- 0028517164
- RASTA processing of speech
- Hermansky H., Morgan N. RASTA processing of speech. IEEE Trans. Speech Audio Proc. 2:1994;578-589.
- (1994) IEEE Trans. Speech Audio Proc. , vol.2 , pp. 578-589
- Hermansky, H.¹ Morgan, N.²

26
- 0033709098
- Tandem connectionist feature extraction for conventional HMM systems
- Hermansky H., Ellis D.P.W., Sharma S. Tandem connectionist feature extraction for conventional HMM systems. Proc. ICASSP-2000. III:2000;1635-1638.
- (2000) Proc. ICASSP-2000 III , pp. 1635-1638
- Hermansky, H.¹ Ellis, D.P.W.² Sharma, S.³

27
- 0028996871
- Noise estimation techniques for robust speech recognition
- Hirsch H.G., Erlicher C. Noise estimation techniques for robust speech recognition. Proc. ICASSP-1995. I:1995;153-156.
- (1995) Proc. ICASSP-1995 I , pp. 153-156
- Hirsch, H.G.¹ Erlicher, C.²

28
- 84873312246
- A review of the MTF concept in room acoustics and its use for estimating speech intelligibility in auditoria
- Houtgast T., Steeneken H.J.M. A review of the MTF concept in room acoustics and its use for estimating speech intelligibility in auditoria. J. Acoust. Soc. Am. 77:1985;1069-1077.
- (1985) J. Acoust. Soc. Am. , vol.77 , pp. 1069-1077
- Houtgast, T.¹ Steeneken, H.J.M.²

29
- 0003905759
- New York: John Wiley and Sons
- Hyvärinen A., Karhunen J., Oja E. Independent Component Analysis. 2001;John Wiley and Sons, New York.
- (2001) Independent Component Analysis
- Hyvärinen, A.¹ Karhunen, J.² Oja, E.³

30
- 0004270063
- ISO 3382, ISO, Geneva, Switzerland
- ISO 3382, 1997. Acoustics - Measurement of the Reverberation Time of Rooms with Reference to Other Acoustical Parameters, second ed. ISO, Geneva, Switzerland.
- (1997) Acoustics - Measurement of the Reverberation Time of Rooms with Reference to Other Acoustical Parameters, Second ED.

31
- 0038404463
- ITU-T recommendation G.712, International Telecommunications Union, Geneva
- ITU-T recommendation G.712, 1996a. Transmission performance characteristics of pulse code modulated channels. International Telecommunications Union, Geneva.
- (1996) Transmission Performance Characteristics of Pulse Code Modulated Channels

32
- 0003657183
- ITU-T recommendation P.830, International Telecommunications Union, Geneva
- ITU-T recommendation P.830, 1996b. Subjective performance assessment of telephone band and wide band digital codecs. International Telecommunications Union, Geneva.
- (1996) Subjective Performance Assessment of Telephone Band and Wide Band Digital Codecs

33
- 0032676337
- On the relative importance of various components of the modulation spectrum for automatic speech recognition
- Kanedera N., Arai T., Hermansky H., Pavel M. On the relative importance of various components of the modulation spectrum for automatic speech recognition. Speech Comm. 28:1999;43-55.
- (1999) Speech Comm. , vol.28 , pp. 43-55
- Kanedera, N.¹ Arai, T.² Hermansky, H.³ Pavel, M.⁴

34
- 0003434858
- PhD thesis, University of California, Berkeley
- Kingsbury, B.E.D., 1998. Perceptually inspired signal-processing strategies for robust speech recognition in reverberant environments. PhD thesis, University of California, Berkeley.
- (1998) Perceptually Inspired Signal-processing Strategies for Robust Speech Recognition in Reverberant Environments
- Kingsbury, B.E.D.¹

35
- 0032136330
- Robust speech recognition using the modulation spectrogram
- Kingsbury B.E.D., Morgan N., Greenberg S. Robust speech recognition using the modulation spectrogram. Speech Comm. 25:1998;117-132.
- (1998) Speech Comm. , vol.25 , pp. 117-132
- Kingsbury, B.E.D.¹ Morgan, N.² Greenberg, S.³

36
- 0037211087
- Sub-band SNR estimation using auditory feature processing
- Kleinschmidt M., Hohmann V. Sub-band SNR estimation using auditory feature processing. Speech Comm. 39(1-2):2003;47-64.
- (2003) Speech Comm. , vol.39 , Issue.1-2 , pp. 47-64
- Kleinschmidt, M.¹ Hohmann, V.²

37
- 0035308233
- Classification of general audio data for content-based retrieval
- Li D., Sethi I.K., Dimitrova N., McGee T. Classification of general audio data for content-based retrieval. Pattern Recognition Lett. 22:2001;533-544.
- (2001) Pattern Recognition Lett. , vol.22 , pp. 533-544
- Li, D.¹ Sethi, I.K.² Dimitrova, N.³ Mcgee, T.⁴

38
- 0031187171
- Speech recognition by machines and humans
- Lippmann R.P. Speech recognition by machines and humans. Speech Comm. 22:1997;1-15.
- (1997) Speech Comm. , vol.22 , pp. 1-15
- Lippmann, R.P.¹

39
- 0038422099
- Single gauss model set-based data imputation method for complex ASR task
- Luo Y., Du L. Single gauss model set-based data imputation method for complex ASR task. Proc. ISCAS 2003. II:2003;564-567.
- (2003) Proc. ISCAS 2003 II , pp. 564-567
- Luo, Y.¹ Du, L.²

40
- 0003407268
- PhD thesis, MIT, Massachusetts
- Martin, K.D., 1999. Sound source recognition. PhD thesis, MIT, Massachusetts.
- (1999) Sound Source Recognition
- Martin, K.D.¹

41
- 0001797537
- An efficient algorithm to estimate the instantaneous SNR of speech signals
- Martin R. An efficient algorithm to estimate the instantaneous SNR of speech signals. Proc. Eurospeech-1993. 1993;37-40.
- (1993) Proc. Eurospeech-1993 , pp. 37-40
- Martin, R.¹

42
- 2942626614
- MATLAB release 13 reference manual. Natick, MA
- Mathworks, Inc., 2003. MATLAB release 13 reference manual. Natick, MA.
- (2003) Mathworks, Inc.

43
- 0003789815
- Cambridge, UK: Academic Press
- Moore B.C.J. An Introduction to the Psychology of Hearing. fifth ed. 2003;Academic Press, Cambridge, UK.
- (2003) An Introduction to the Psychology of Hearing, Fifth Ed.
- Moore, B.C.J.¹

44
- 2942557838
- Analysis of noise PDF transformation in secondary feature processing
- IDIAP, Martigny, Switzerland
- Morris, A.C., 2002. Analysis of noise PDF transformation in secondary feature processing. IDIAP Research Report 02-29, IDIAP, Martigny, Switzerland.
- (2002) IDIAP Research Report , vol.2 , Issue.29
- Morris, A.C.¹

45
- 84892151303
- Some solutions to the missing feature problem in the classification, with application to noise-robust ASR
- Morris A.C., Cooke M.P., Green P.D. Some solutions to the missing feature problem in the classification, with application to noise-robust ASR. Proc. ICASSP-1998. II:1998;737-740.
- (1998) Proc. ICASSP-1998 II , pp. 737-740
- Morris, A.C.¹ Cooke, M.P.² Green, P.D.³

46
- 0020325263
- Monaural and binaural speech perception in reverberation for listeners of various ages
- Nabelek A.K., Robinson P.K. Monaural and binaural speech perception in reverberation for listeners of various ages. J. Acoust. Soc. Amer. 71:1982;1242-1248.
- (1982) J. Acoust. Soc. Amer. , vol.71 , pp. 1242-1248
- Nabelek, A.K.¹ Robinson, P.K.²

47
- 0032142014
- Environmental conditions and acoustic transduction in hands-free speech recognition
- Omologo M., Svaizer P., Matassoni M. Environmental conditions and acoustic transduction in hands-free speech recognition. Speech Comm. 25:1998;75-95.
- (1998) Speech Comm. , vol.25 , pp. 75-95
- Omologo, M.¹ Svaizer, P.² Matassoni, M.³

48
- 0003513556
- Prentice Hall
- Oppenheim A.V., Schafer R.W. Discrete Time Signal Processing. 1989;Prentice Hall.
- (1989) Discrete Time Signal Processing
- Oppenheim, A.V.¹ Schafer, R.W.²

49
- 24444447717
- A binaural auditory model for missing data speech recognition in noisy and reverberant conditions
- Aalborg, 2nd September
- Palomäki, K.J., Brown, G.J., Wang, D.L., 2001. A binaural auditory model for missing data speech recognition in noisy and reverberant conditions. In: Proc. CRAC Eurospeech-2001 satellite workshop, Aalborg, 2nd September.
- (2001) Proc. CRAC Eurospeech-2001 Satellite Workshop
- Palomäki, K.J.¹ Brown, G.J.² Wang, D.L.³

50
- 0036298106
- Missing data speech recognition in reverberant conditions
- Palomäki K.J., Brown G.J., Barker J. Missing data speech recognition in reverberant conditions. Proc. ICASSP-2002. I:2002;65-68.
- (2002) Proc. ICASSP-2002 I , pp. 65-68
- Palomäki, K.J.¹ Brown, G.J.² Barker, J.³

51
- 84902042740
- A binaural processor for missing data speech recognition in the presence of noise and small-room reverberation
- in press
- Palomäki, K.J., Brown, G.J., Wang, D.L., in press. A binaural processor for missing data speech recognition in the presence of noise and small-room reverberation. Speech Comm.
- Speech Comm
- Palomäki, K.J.¹ Brown, G.J.² Wang, D.L.³

52
- 0004412846
- SVOS final report: The auditory filterbank
- Patterson, R.D., Holdsworth, J.W., Nimmo-Smith, I., Rice, P., 1988. SVOS Final Report: The Auditory Filterbank. APL Report 2341.
- (1988) APL Report , vol.2341
- Patterson, R.D.¹ Holdsworth, J.W.² Nimmo-Smith, I.³ Rice, P.⁴

53
- 84987702417
- The aurora experimental framework for the performance evaluation of speech recognition systems under noisy conditions
- Pearce D., Hirsch H.-G. The aurora experimental framework for the performance evaluation of speech recognition systems under noisy conditions. Proc. ICSLP-2000. 4:2000;29-32.
- (2000) Proc. ICSLP-2000 , vol.4 , pp. 29-32
- Pearce, D.¹ Hirsch, H.-G.²

54
- 2942626613
- Computational auditory scene recognition
- Peltonen V., Tuomi J., Klapuri A., Huopaniemi J., Sorsa T. Computational auditory scene recognition. ICASSP-2002. II:2002;1941-1944.
- (2002) ICASSP-2002 II , pp. 1941-1944
- Peltonen, V.¹ Tuomi, J.² Klapuri, A.³ Huopaniemi, J.⁴ Sorsa, T.⁵

55
- 0038331253
- PhD thesis, Carnegie Mellon University, Pittsburgh, Pennsylvania
- Raj, B., 2000. Reconstruction of incomplete spectrograms for robust speech recognition. PhD thesis, Carnegie Mellon University, Pittsburgh, Pennsylvania.
- (2000) Reconstruction of Incomplete Spectrograms for Robust Speech Recognition
- Raj, B.¹

56
- 85057633672
- Reconstruction of missing features for robust speech recognition
- in press
- Raj, B., Seltzer, M.L., Stern, R.M., in press. Reconstruction of Missing Features for Robust Speech Recognition, Speech Comm.
- Speech Comm
- Raj, B.¹ Seltzer, M.L.² Stern, R.M.³

57
- 84881675408
- Cepstral channel normalisation techniques for HMM-based speaker verification
- Rosenberg A.E., Lee C.-H., Soong F.K. Cepstral channel normalisation techniques for HMM-based speaker verification. Proc. ICSLP 94. 4:1994;1835-1838.
- (1994) Proc. ICSLP 94 , vol.4 , pp. 1835-1838
- Rosenberg, A.E.¹ Lee, C.-H.² Soong, F.K.³

58
- 2942623044
- Step by step guide to using the speech training and recognition unified tool STRUT
- STRUT Version 2.4, 1997. Step by step guide to using the speech training and recognition unified tool STRUT. Available from 〈www.tcts.fpms.ac.be/asr/project/strut/〉.
- (1997) STRUT Version 2.4

59
- 2942526349
- Cambridge University Engineering Department, Available from 〈htk.eng.cam.ac.uk/〉
- Young, S., Evermann, G., Kershaw, D., Moore, G., Odell, J., Ollason, D., Valtchev, V., Woodland, P., 2001. The HTK Book, Revised for Version 3.1, Cambridge University Engineering Department, Available from 〈htk.eng.cam.ac.uk/〉.
- (2001) The HTK Book, Revised for Version 3.1
- Young, S.¹ Evermann, G.² Kershaw, D.³ Moore, G.⁴ Odell, J.⁵ Ollason, D.⁶ Valtchev, V.⁷ Woodland, P.⁸

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.