SCOPUS Information Retrieval Platform

Speech Communication

Volume 52, Issue 1, 2010, Pages 72-81

Robust speech recognition by integrating speech separation and hypothesis testing

(2) Srinivasan, Soundararajan a; Wang, DeLiang b

a The Ohio State University (United States)

b The Ohio State University (United States)

Author keywords

Ideal binary mask; Missing data recognizer; Robust speech recognition; Speech segregation; Top down processing

Indexed keywords

IDEAL BINARY MASK; MISSING-DATA RECOGNIZER; ROBUST SPEECH RECOGNITION; SPEECH SEGREGATION; TOP-DOWN PROCESSING;

F REGION; SEGREGATION (METALLOGRAPHY); SEPARATION; SPEECH ANALYSIS; TIME DOMAIN ANALYSIS;

SPEECH RECOGNITION;

EID: 70350038037
PISSN: 01676393
EISSN: None
Source Type: Journal
DOI: 10.1016/j.specom.2009.08.008
Document Type: Article

Times cited : (21)

References (39)

1
- 11144316019
- Decoding speech in the presence of other sources
- Barker J.P., Cooke M.P., and Ellis D.P.W. Decoding speech in the presence of other sources. Speech Communication 45 (2005) 5-25
- (2005) Speech Communication , vol.45 , pp. 5-25
- Barker, J.P.¹ Cooke, M.P.² Ellis, D.P.W.³

2
- 0028480283
- Novelty detection and neural network validation
- Bishop C.M. Novelty detection and neural network validation. IEEE Proceedings of the Vision, Image and Signal processing 141 4 (1994) 217-222
- (1994) IEEE Proceedings of the Vision, Image and Signal processing , vol.141 , Issue.4 , pp. 217-222
- Bishop, C.M.¹

3
- 70350032356
- Boersma, P., Weenink, D., 2002. Praat: doing phonetics by computer. Version 4.0.26. Last viewed on 24 October 2007. URL .

4
- 0018455310
- Suppression of acoustic noise in speech using spectral subtraction
- Boll S.F. Suppression of acoustic noise in speech using spectral subtraction. IEEE Transactions on Acoustics, Speech, and Signal Processing ASSP 27 2 (1979) 113-120
- (1979) IEEE Transactions on Acoustics, Speech, and Signal Processing ASSP , vol.27 , Issue.2 , pp. 113-120
- Boll, S.F.¹

5
- 0003684441
- The MIT Press, Cambridge, MA
- Bregman A.S. Auditory Scene Analysis (1990), The MIT Press, Cambridge, MA
- (1990) Auditory Scene Analysis
- Bregman, A.S.¹

6
- 0034850070
- A neural oscillator sound separator for missing data speech recognition
- Brown, G.J., Barker, J., Wang, D.L., 2001. A neural oscillator sound separator for missing data speech recognition. In: Proceedings of the International Joint Conference on Neural Networks '01, pp. 2907-2912.
- (2001) Proceedings of the International Joint Conference on Neural Networks '01 , pp. 2907-2912
- Brown, G.J.¹ Barker, J.² Wang, D.L.³

7
- 0035342414
- Robust automatic speech recognition with missing and unreliable acoustic data
- Cooke M.P., Green P., Josifovski L., and Vizinho A. Robust automatic speech recognition with missing and unreliable acoustic data. Speech Communication 34 (2001) 267-285
- (2001) Speech Communication , vol.34 , pp. 267-285
- Cooke, M.P.¹ Green, P.² Josifovski, L.³ Vizinho, A.⁴

8
- 38549171780
- Listening to speech in the presence of other sounds
- Darwin C.J. Listening to speech in the presence of other sounds. Philosophical Transactions of the Royal Society B: Biological Sciences 363 (2008) 1011-1021
- (2008) Philosophical Transactions of the Royal Society B: Biological Sciences , vol.363 , pp. 1011-1021
- Darwin, C.J.¹

9
- 0019053271
- Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences
- Davis S.B., and Mermelstein P. Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences. IEEE Transactions on Acoustics, Speech, and Signal Processing ASSP 28 4 (1980) 357-366
- (1980) IEEE Transactions on Acoustics, Speech, and Signal Processing ASSP , vol.28 , Issue.4 , pp. 357-366
- Davis, S.B.¹ Mermelstein, P.²

10
- 85009211607
- A nonlinear observation model for removing noise from corrupted speech log mel-spectral energies
- Droppo, J., Acero, A., Deng, L., 2002. A nonlinear observation model for removing noise from corrupted speech log mel-spectral energies. In: Proceedings of the International Conference on Spoken Language Processing '02, pp. 1569-1572.
- (2002) Proceedings of the International Conference on Spoken Language Processing '02 , pp. 1569-1572
- Droppo, J.¹ Acero, A.² Deng, L.³

11
- 0031619912
- Speaker verification in noisy environment with combined spectral subtraction and missing data theory
- Drygajlo, A., El-Maliki, M., 1998. Speaker verification in noisy environment with combined spectral subtraction and missing data theory. In: Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing '98, vol. 1, pp. 121-124.
- (1998) Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing , pp. 121-124
- Drygajlo, A.¹ El-Maliki, M.²

12
- 0026843273
- A Bayesian estimation approach for speech enhancement using hidden Markov models
- Ephraim Y. A Bayesian estimation approach for speech enhancement using hidden Markov models. IEEE Transactions on Signal Processing 40 4 (1992) 725-735
- (1992) IEEE Transactions on Signal Processing , vol.40 , Issue.4 , pp. 725-735
- Ephraim, Y.¹

13
- 70349227947
- The application of Hidden Markov models in speech recognition
- Gales M., and Young S. The application of Hidden Markov models in speech recognition. Foundations and Trends in Signal Processing 1 3 (2007) 195-304
- (2007) Foundations and Trends in Signal Processing , vol.1 , Issue.3 , pp. 195-304
- Gales, M.¹ Young, S.²

14
- 0029288202
- Speech recognition in noisy environments: a survey
- Gong Y. Speech recognition in noisy environments: a survey. Speech Communication 16 (1995) 261-291
- (1995) Speech Communication , vol.16 , pp. 261-291
- Gong, Y.¹

15
- 4644265990
- Monaural speech segregation based on pitch tracking and amplitude modulation
- Hu G., and Wang D.L. Monaural speech segregation based on pitch tracking and amplitude modulation. IEEE Transactions on Neural Networks 15 (2004) 1135-1150
- (2004) IEEE Transactions on Neural Networks , vol.15 , pp. 1135-1150
- Hu, G.¹ Wang, D.L.²

16
- 0004056285
- Prentice-Hall, PTR, Upper Saddle River, NJ
- Huang X., Acero A., and Hon H. Spoken Language Processing (2001), Prentice-Hall, PTR, Upper Saddle River, NJ
- (2001) Spoken Language Processing
- Huang, X.¹ Acero, A.² Hon, H.³

17
- 0021226391
- A database for speaker-independent digit recognition
- Leonard, R.G., 1984. A database for speaker-independent digit recognition. In: Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing '84, pp. 111-114.
- (1984) Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing , pp. 111-114
- Leonard, R.G.¹

18
- 34447100796
- Taylor and Francis, Boca Raton, FL
- Loizou P. Speech Enhancement: Theory and Practice (2007), Taylor and Francis, Boca Raton, FL
- (2007) Speech Enhancement: Theory and Practice
- Loizou, P.¹

19
- 0142063407
- Novelty detection: a review-part 1: statistical approaches
- Markou M., and Singh S. Novelty detection: a review-part 1: statistical approaches. Signal Processing 83 12 (2003) 2481-2497
- (2003) Signal Processing , vol.83 , Issue.12 , pp. 2481-2497
- Markou, M.¹ Singh, S.²

20
- 0346707504
- Microphone array post-filter based on noise field coherence
- McCowan I., and Bourlard H. Microphone array post-filter based on noise field coherence. IEEE Transactions on Speech and Audio Processing 11 6 (2003) 709-716
- (2003) IEEE Transactions on Speech and Audio Processing , vol.11 , Issue.6 , pp. 709-716
- McCowan, I.¹ Bourlard, H.²

21
- 0003891734
- Marcel Dekker, NY, NY
- McLachlan G.J., and Basford K.E. Mixture Models: Inference and Applications to Clustering (1988), Marcel Dekker, NY, NY
- (1988) Mixture Models: Inference and Applications to Clustering
- McLachlan, G.J.¹ Basford, K.E.²

22
- 0142056390
- An efficient auditory filterbank based on the gammatone function
- Patterson, R.D., Nimmo-Smith, I., Holdsworth, J., Rice, P., 1988. An efficient auditory filterbank based on the gammatone function. Applied Psychology Unit (APU) Report 2341.
- (1988) Applied Psychology Unit (APU) Report
- Patterson, R.D.¹ Nimmo-Smith, I.² Holdsworth, J.³ Rice, P.⁴

23
- 84987702417
- The AURORA experimental framework for the performance evaluation of speech recognition systems under noisy conditions
- Pearce, D., Hirsch, H., 2000. The AURORA experimental framework for the performance evaluation of speech recognition systems under noisy conditions. In: Proceedings of the Sixth International Conference on Spoken Language Processing '00, pp. 29-32.
- (2000) Proceedings of the Sixth International Conference on Spoken Language Processing '00 , pp. 29-32
- Pearce, D.¹ Hirsch, H.²

24
- 11144343436
- Detection of reliable features for speech recognition in noisy conditions using a statistical criterion
- Renevey, P., Drygajlo, A., 2001. Detection of reliable features for speech recognition in noisy conditions using a statistical criterion. In: Proceedings of the Consistent and Reliable Acoustic Cues for Sound Analysis Workshop '01, pp. 71-74.
- (2001) Proceedings of the Consistent and Reliable Acoustic Cues for Sound Analysis Workshop '01 , pp. 71-74
- Renevey, P.¹ Drygajlo, A.²

25
- 0142026377
- Speech segregation based on sound localization
- Roman N., Wang D.L., and Brown G.J. Speech segregation based on sound localization. The Journal of the Acoustical Society of America 114 (2003) 2236-2252
- (2003) The Journal of the Acoustical Society of America , vol.114 , pp. 2236-2252
- Roman, N.¹ Wang, D.L.² Brown, G.J.³

26
- 85009089485
- Classifier-based mask estimation for missing feature methods of robust speech recognition
- Seltzer, M.L., Raj, B., Stern, R.M., 2000. Classifier-based mask estimation for missing feature methods of robust speech recognition. In: Proceedings of the International Conference on Spoken Language Processing '00, pp. 538-541.
- (2000) Proceedings of the International Conference on Spoken Language Processing '00 , pp. 538-541
- Seltzer, M.L.¹ Raj, B.² Stern, R.M.³

27
- 56749102248
- Ph.D. Thesis, Biomedical Engineering Department, The Ohio State University
- Srinivasan, S., 2006. Integrating computational auditory scene analysis and automatic speech recognition. Ph.D. Thesis, Biomedical Engineering Department, The Ohio State University.
- (2006) Integrating computational auditory scene analysis and automatic speech recognition
- Srinivasan, S.¹

28
- 33750311718
- Binary and ratio time-frequency masks for robust speech recognition
- Srinivasan S., Roman N., and Wang D.L. Binary and ratio time-frequency masks for robust speech recognition. Speech Communication 48 (2006) 1486-1501
- (2006) Speech Communication , vol.48 , pp. 1486-1501
- Srinivasan, S.¹ Roman, N.² Wang, D.L.³

29
- 33646811168
- Robust speech recognition by integrating speech separation and hypothesis testing
- Srinivasan, S., Wang, D.L., 2005a. Robust speech recognition by integrating speech separation and hypothesis testing. In: Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing '05, vol. 1, pp. 89-92.
- (2005) Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing '05 , vol.1 , pp. 89-92
- Srinivasan, S.¹ Wang, D.L.²

30
- 11144339352
- A schema-based model for phonemic restoration
- Srinivasan S., and Wang D.L. A schema-based model for phonemic restoration. Speech Communication 45 (2005) 63-87
- (2005) Speech Communication , vol.45 , pp. 63-87
- Srinivasan, S.¹ Wang, D.L.²

31
- 0004111233
- Prentice-Hall, Upper Saddle River, NJ
- Stark H., and Woods J.W. Probability and Random Processes with Applications to Signal Processing. third ed. (2002), Prentice-Hall, Upper Saddle River, NJ
- (2002) Probability and Random Processes with Applications to Signal Processing. third ed.
- Stark, H.¹ Woods, J.W.²

32
- 84947734535
- Outlier detection using classifier instability
- Tax, D.M.J., Duin, R.P.W., 1998. Outlier detection using classifier instability. In: Amin, A., Dori, D., Pudil, P., Freeman, H. (Eds.), Proceedings of the Joint IAPR International Workshops on Advances in Pattern Recognition, Lecture Notes in Computer Science, vol. 1451, Springer, Berlin, pp. 593-601.
- (1998) Lecture Notes in Computer Science , vol.1451 , pp. 593-601
- Tax, D.M.J.¹ Duin, R.P.W.²

33
- 4544315110
- Robust speech recognition using cepstral domain missing data techniques and noisy masks
- van Hamme, H., 2004. Robust speech recognition using cepstral domain missing data techniques and noisy masks. In: Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing '04, vol. 1, pp. 213-216.
- (2004) Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing '04 , vol.1 , pp. 213-216
- van Hamme, H.¹

34
- 0025681008
- Hidden Markov model decomposition of speech and noise
- Varga, A.P., Moore, R.K., 1990. Hidden Markov model decomposition of speech and noise. In: Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing '90, pp. 845-848.
- (1990) Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing , pp. 845-848
- Varga, A.P.¹ Moore, R.K.²

35
- 0004319968
- The NOISEX-92 study on the effect of additive noise on automatic speech recognition
- Technical Report, Speech Research Unit, Defense Research Agency, Malvern, UK
- Varga, A.P., Steeneken, H.J.M., Tomlinson, M., Jones, D., 1992. The NOISEX-92 study on the effect of additive noise on automatic speech recognition. Technical Report, Speech Research Unit, Defense Research Agency, Malvern, UK.
- (1992)
- Varga, A.P.¹ Steeneken, H.J.M.² Tomlinson, M.³ Jones, D.⁴

36
- 0032682770
- Separation of speech from interfering sounds based on oscillatory correlation
- Wang D.L., and Brown G.J. Separation of speech from interfering sounds based on oscillatory correlation. IEEE Transactions on Neural Networks 10 3 (1999) 684-697
- (1999) IEEE Transactions on Neural Networks , vol.10 , Issue.3 , pp. 684-697
- Wang, D.L.¹ Brown, G.J.²

37
- 82255178542
- Wang D.L., and Brown G.J. (Eds), Wiley-IEEE Press, Hoboken, NJ
- In: Wang D.L., and Brown G.J. (Eds). Computational Auditory Scene Analysis: Principles, Algorithms and Applications (2006), Wiley-IEEE Press, Hoboken, NJ
- (2006) Computational Auditory Scene Analysis: Principles, Algorithms and Applications

38
- 0033344872
- Confidence measures from local posterior probability estimates
- Williams G., and Renals S. Confidence measures from local posterior probability estimates. Computer Speech and Language 13 (1999) 395-413
- (1999) Computer Speech and Language , vol.13 , pp. 395-413
- Williams, G.¹ Renals, S.²

39
- 70350021067
- Young, S., Kershaw, D., Odell, J., Valtchev, V., Woodland, P., 2000. The HTK Book (for HTK Version 3.0). Microsoft Corporation.

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.