



Volume 2009, Issue , 2009, Pages

Recognition of noisy speech: A comparative survey of robust model architecture and feature enhancement

EID: 67650135931     PISSN: 16874714     EISSN: 16874722     Source Type: Journal    
DOI: 10.1155/2009/942617     Document Type: Article
Times cited : (38)

References (73)
  • 2
    • 0032785783
    • Auditory processing of speech signals for robust speech recognition in real-world noisy environments
    • Kim D.-S., Lee S.-Y., Kil R. M., Auditory processing of speech signals for robust speech recognition in real-world noisy environments IEEE Transactions on Speech and Audio Processing 1999 7 1 55 69
    • (1999) IEEE Transactions on Speech and Audio Processing , vol.7 , Issue.1 , pp. 55-69
    • Kim, D.-S.1    Lee, S.-Y.2    Kil, R.M.3
  • 3
    • 67650206594
    • Proceedings of ISCA Workshop on Robustness in Conversational Interaction (Robust 04) August 2004 Norwich, UK
    • Rose R. C., Environmental robustness in automatic speech recognition Proceedings of ISCA Workshop on Robustness in Conversational Interaction (Robust 04) August 2004 Norwich, UK
    • Environmental robustness in automatic speech recognition
    • Rose, R.C.1
  • 6
    • 0025041264
    • Perceptual linear predictive (PLP) analysis of speech
    • DOI 10.1121/1.399423
    • Hermansky H., Perceptual linear predictive (PLP) analysis of speech The Journal of the Acoustical Society of America 1990 87 4 1738 1752 (Pubitemid 20256470)
    • (1990) Journal of the Acoustical Society of America , vol.87 , Issue.4 , pp. 1738-1752
    • Hermansky, H.1
  • 7
    • 0030127017
    • Signal conditioning techniques for robust speech recognition
    • PII S1070990896032531
    • Rahim M. G., Juang B.-H., Chou W., Buhrke E., Signal conditioning techniques for robust speech recognition IEEE Signal Processing Letters 1996 3 4 107 109 (Pubitemid 126518955)
    • (1996) IEEE Signal Processing Letters , vol.3 , Issue.4 , pp. 107-109
    • Rahim, M.G.1    Juang, B.-H.2    Chou, W.3    Buhrke, E.4
  • 8
    • 0032141206
    • Cepstral domain segmental feature vector normalization for noise robust speech recognition
    • PII S0167639398000338
    • Viikki O., Laurila K., Cepstral domain segmental feature vector normalization for noise robust speech recognition Speech Communication 1998 25 1-3 133 147 (Pubitemid 128413638)
    • (1998) Speech Communication , vol.25 , Issue.1-3 , pp. 133-147
    • Viikki, O.1    Laurila, K.2
  • 10
    • 4544236840
    • Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 04) May 2004 Montreal, Canada
    • Droppo J., Acero A., Noise robust speech recognition with a switching linear dynamic model 1 Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 04) May 2004 Montreal, Canada 953 956
    • Noise robust speech recognition with a switching linear dynamic model , vol.1 , pp. 953-956
    • Droppo, J.1    Acero, A.2
  • 11
    • 0024610919
    • A tutorial on hidden Markov models and selected applications in speech recognition
    • Rabiner L. R., A tutorial on hidden Markov models and selected applications in speech recognition Proceedings of the IEEE 1989 77 2 257 286
    • (1989) Proceedings of the IEEE , vol.77 , Issue.2 , pp. 257-286
    • Rabiner, L.R.1
  • 12
    • 33745185781
    • Hidden conditional random fields for phone classification
    • 9th European Conference on Speech Communication and Technology, Eurospeech Interspeech
    • Gunawardana A., Mahajan M., Acero A., Platt J. C., Hidden conditional random fields for phone classification Proceedings of the 9th European Conference on Speech Communication and Technology (Interspeech 05) September 2005 Lisbon, Portugal 1117 1120 (Pubitemid 43908262)
    • (2005) 9th European Conference on Speech Communication and Technology , pp. 1117-1120
    • Gunawardana, A.1    Mahajan, M.2    Acero, A.3    Platt, J.C.4
  • 13
    • 13244265597
    • Revisiting autoregressive hidden Markov modeling of speech signals
    • DOI 10.1109/LSP.2004.840914
    • Ephraim Y., Roberts W. J. J., Revisiting autoregressive hidden Markov modeling of speech signals IEEE Signal Processing Letters 2005 12 2 166 169 (Pubitemid 40181881)
    • (2005) IEEE Signal Processing Letters , vol.12 , Issue.2 , pp. 166-169
    • Ephraim, Y.1    Roberts, W.J.J.2
  • 16
    • 0032027527
    • Nonstationary environment compensation based on sequential estimation
    • Kim N. S., Nonstationary environment compensation based on sequential estimation IEEE Signal Processing Letters 1998 5 3 57 59 (Pubitemid 128556794)
    • (1998) IEEE Signal Processing Letters , vol.5 , Issue.3 , pp. 57-59
    • Kim, N.S.1
  • 18
    • 33745225168
    • Comb filter decomposition for robust ASR
    • 9th European Conference on Speech Communication and Technology, Eurospeech Interspeech
    • Szymanski L., Bouchard M., Comb filter decomposition for robust ASR Proceedings of the 9th European Conference on Speech Communication and Technology (Interspeech 05) September 2005 Lisbon, Portugal 2645 2648 (Pubitemid 43908639)
    • (2005) 9th European Conference on Speech Communication and Technology , pp. 2645-2648
    • Szymanski, L.1    Bouchard, M.2
  • 23
    • 67650182031
    • Proceedings of IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU 05) November 2005 San Juan, Puerto Rico, USA
    • Lathoud G., Doss M. M., Bourlard H., Channel normalization for unsupervised spectral subtraction Proceedings of IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU 05) November 2005 San Juan, Puerto Rico, USA
    • Channel normalization for unsupervised spectral subtraction
    • Lathoud, G.1    Doss, M.M.2    Bourlard, H.3
  • 24
    • 0030779363
    • Noise compensation methods for hidden Markov model speech recognition in adverse environments
    • PII S1063667697007670
    • Vaseghi S. V., Milner B. P., Noise compensation methods for hidden Markov model speech recognition in adverse environments IEEE Transactions on Speech and Audio Processing 1997 5 1 11 21 (Pubitemid 127746030)
    • (1997) IEEE Transactions on Speech and Audio Processing , vol.5 , Issue.1 , pp. 11-21
    • Vaseghi, S.V.1    Milner, B.P.2
  • 26
    • 0347968277
    • Recursive estimation of nonstationary noise using iterative stochastic approximation for robust speech recognition
    • Deng L., Droppo J., Acero A., Recursive estimation of nonstationary noise using iterative stochastic approximation for robust speech recognition IEEE Transactions on Speech and Audio Processing 2003 11 6 568 580
    • (2003) IEEE Transactions on Speech and Audio Processing , vol.11 , Issue.6 , pp. 568-580
    • Deng, L.1    Droppo, J.2    Acero, A.3
  • 27
    • 0033690878
    • Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 00) June 2000 Istanbul, Turkey
    • Zhu Q., Alwan A., On the use of variable frame rate analysis in speech recognition 3 Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 00) June 2000 Istanbul, Turkey 1783 1786
    • On the use of variable frame rate analysis in speech recognition , vol.3 , pp. 1783-1786
    • Zhu, Q.1    Alwan, A.2
  • 29
    • 84946730259
    • Proceedings of IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU 03) November-December 2003 St. Thomas, Virgin Islands, USA
    • Hermansky H., TRAP-TANDEM: data-driven extraction of temporal features from speech Proceedings of IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU 03) November-December 2003 St. Thomas, Virgin Islands, USA 255 260
    • TRAP-TANDEM: Data-driven extraction of temporal features from speech , pp. 255-260
    • Hermansky, H.1
  • 31
    • 0002915083
    • Relevance of time-frequency features for phonetic and speaker-channel classification
    • Yang H. H., van Vuuren S., Sharma S., Hermansky H., Relevance of time-frequency features for phonetic and speaker-channel classification Speech Communication 2000 31 1 35 50
    • (2000) Speech Communication , vol.31 , Issue.1 , pp. 35-50
    • Yang, H.H.1    Van Vuuren, S.2    Sharma, S.3    Hermansky, H.4
  • 32
    • 0032676337
    • On the relative importance of various components of the modulation spectrum for automatic speech recognition
    • Kanedera N., Arai T., Hermansky H., Pavel M., On the relative importance of various components of the modulation spectrum for automatic speech recognition Speech Communication 1999 28 1 43 55
    • (1999) Speech Communication , vol.28 , Issue.1 , pp. 43-55
    • Kanedera, N.1    Arai, T.2    Hermansky, H.3    Pavel, M.4
  • 33
    • 85016663198
    • Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 92) March 1992 San Francisco, Calif, USA
    • Hermansky H., Morgan N., Bayya A., Kohn P., RASTA-PLP speech analysis technique 1 Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 92) March 1992 San Francisco, Calif, USA 121 124
    • RASTA-PLP speech analysis technique , vol.1 , pp. 121-124
    • Hermansky, H.1    Morgan, N.2    Bayya, A.3    Kohn, P.4
  • 34
    • 0032136330
    • Robust speech recognition using the modulation spectrogram
    • PII S0167639398000326
    • Kingsbury B. E. D., Morgan N., Greenberg S., Robust speech recognition using the modulation spectrogram Speech Communication 1998 25 1-3 117 132 (Pubitemid 128413637)
    • (1998) Speech Communication , vol.25 , Issue.1-3 , pp. 117-132
    • Kingsbury, B.E.D.1    Morgan, N.2    Greenberg, S.3
  • 36
    • 0035309967
    • An advanced contrast enhancement using partially overlapped sub-block histogram equalization
    • DOI 10.1109/76.915354, PII S1051821501030117
    • Kim J.-Y., Kim L.-S., Hwang S.-H., An advanced contrast enhancement using partially overlapped sub-block histogram equalization IEEE Transactions on Circuits and Systems for Video Technology 2001 11 4 475 484 (Pubitemid 32407181)
    • (2001) IEEE Transactions on Circuits and Systems for Video Technology , vol.11 , Issue.4 , pp. 475-484
    • Kim, J.-Y.1    Kim, L.-S.2    Hwang, S.-H.3
  • 37
    • 0042362207
    • Cepstrum-domain acoustic feature compensation based on decomposition of speech and noise for ASR in noisy environments
    • Kim H. K., Rose R. C., Cepstrum-domain acoustic feature compensation based on decomposition of speech and noise for ASR in noisy environments IEEE Transactions on Speech and Audio Processing 2003 11 5 435 446
    • (2003) IEEE Transactions on Speech and Audio Processing , vol.11 , Issue.5 , pp. 435-446
    • Kim, H.K.1    Rose, R.C.2
  • 38
    • 54349096464
    • Noisy speech feature estimation on the Aurora2 database using a switching linear dynamic model
    • Deng J., Bouchard M., Yeap T. H., Noisy speech feature estimation on the Aurora2 database using a switching linear dynamic model Journal of Multimedia 2007 2 2 47 52
    • (2007) Journal of Multimedia , vol.2 , Issue.2 , pp. 47-52
    • Deng, J.1    Bouchard, M.2    Yeap, T.H.3
  • 43
  • 44
    • 38149014113
    • Proceedings of the 17th International Conference on Artificial Neural Networks (ICANN 07) September 2007 Porto, Portugal Lecture Notes in Computer Science
    • Fernández S., Graves A., Schmidhuber J., An application of recurrent neural networks to discriminative keyword spotting 4669 Proceedings of the 17th International Conference on Artificial Neural Networks (ICANN 07) September 2007 Porto, Portugal 220 229 Lecture Notes in Computer Science
    • An application of recurrent neural networks to discriminative keyword spotting , vol.4669 , pp. 220-229
    • Fernández, S.1    Graves, A.2    Schmidhuber, J.3
  • 46
    • 85009242725
    • Proceedings of the 7th International Conference on Spoken Language Processing (ICSLP 02) September 2002 Denver, Colo, USA
    • Macho D., Mauuary L., Noe B., Evaluation of a noise-robust DSR front-end on Aurora database Proceedings of the 7th International Conference on Spoken Language Processing (ICSLP 02) September 2002 Denver, Colo, USA 17 20
    • Evaluation of a noise-robust DSR front-end on Aurora database , pp. 17-20
    • Macho, D.1    Mauuary, L.2    Noe, B.3
  • 47
    • 0028419019
    • Maximum a posteriori estimation for multivariate Gaussian mixture observations of Markov chains
    • Gauvain J.-L., Lee C.-H., Maximum a posteriori estimation for multivariate Gaussian mixture observations of Markov chains IEEE Transactions on Speech and Audio Processing 1994 2 2 291 298
    • (1994) IEEE Transactions on Speech and Audio Processing , vol.2 , Issue.2 , pp. 291-298
    • Gauvain, J.-L.1    Lee, C.-H.2
  • 51
    • 0021645331
    • Speech enhancement using a minimum mean-square error short-time spectral amplitude estimator
    • Ephraim Y., Malah D., Speech enhancement using a minimum mean-square error short-time spectral amplitude estimator IEEE Transactions on Acoustics, Speech, and Signal Processing 1984 32 6 1109 1121
    • (1984) IEEE Transactions on Acoustics, Speech, and Signal Processing , vol.32 , Issue.6 , pp. 1109-1121
    • Ephraim, Y.1    Malah, D.2
  • 55
    • 34047249084
    • Quantile based histogram equalization for noise robust large vocabulary speech recognition
    • DOI 10.1109/TSA.2005.857792
    • Hilger F., Ney H., Quantile based histogram equalization for noise robust large vocabulary speech recognition IEEE Transactions on Audio, Speech and Language Processing 2006 14 3 845 854 (Pubitemid 46547647)
    • (2006) IEEE Transactions on Audio, Speech and Language Processing , vol.14 , Issue.3 , pp. 845-854
    • Hilger, F.1    Ney, H.2
  • 56
    • 54349123450
    • Proceedings of the 8th European Conference on Speech Communication and Technology (Eurospeech 03) September 2003 Geneva, Switzerland
    • Droppo J., Deng L., Acero A., A comparison of three non-linear observation models for noisy speech features 2 Proceedings of the 8th European Conference on Speech Communication and Technology (Eurospeech 03) September 2003 Geneva, Switzerland 681 684
    • A comparison of three non-linear observation models for noisy speech features , vol.2 , pp. 681-684
    • Droppo, J.1    Deng, L.2    Acero, A.3
  • 59
    • 33646436650
    • Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology May-June 2003 Edmonton, Canada
    • Sha F., Pereira F., Shallow parsing with conditional random fields 1 Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology May-June 2003 Edmonton, Canada 134 141
    • Shallow parsing with conditional random fields , vol.1 , pp. 134-141
    • Sha, F.1    Pereira, F.2
  • 60
    • 1542287488
    • Proceedings of the 26th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 03) July-August 2003 Toronto, Canada
    • Pinto D., McCallum A., Wei X., Croft W. B., Table extraction using conditional random fields Proceedings of the 26th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 03) July-August 2003 Toronto, Canada 235 242
    • Table extraction using conditional random fields , pp. 235-242
    • Pinto, D.1    McCallum, A.2    Wei, X.3    Croft, W.B.4
  • 64
    • 48249106592
    • Proceedings of the 4th IEEE Tutorial and Research Workshop on Perception and Interactive Technologies for Speech-Based Systems: Perception in Multimodal Dialogue Systems (PIT 08) June 2008 Kloster Irsee, Germany
    • Schuller B., Eyben F., Rigoll G., Static and dynamic modelling for the recognition of non-verbal vocalisations in conversational speech Proceedings of the 4th IEEE Tutorial and Research Workshop on Perception and Interactive Technologies for Speech-Based Systems: Perception in Multimodal Dialogue Systems (PIT 08) June 2008 Kloster Irsee, Germany 99 110
    • Static and dynamic modelling for the recognition of non-verbal vocalisations in conversational speech , pp. 99-110
    • Schuller, B.1    Eyben, F.2    Rigoll, G.3
  • 65
    • 21644483999
    • Maximum likelihood estimates of linear dynamic systems
    • Rauch H. E., Tung G., Striebel C. T., Maximum likelihood estimates of linear dynamic systems AIAA Journal 1965 3 8 1445 1450
    • (1965) AIAA Journal , vol.3 , Issue.8 , pp. 1445-1450
    • Rauch, H.E.1    Tung, G.2    Striebel, C.T.3
  • 66
    • 33845270980
    • Expectation correction for smoothed inference in switching linear dynamical systems
    • Barber D., Expectation correction for smoothed inference in switching linear dynamical systems Journal of Machine Learning Research 2006 7 2515 2540 (Pubitemid 44866739)
    • (2006) Journal of Machine Learning Research , vol.7 , pp. 2515-2540
    • Barber, D.1
  • 67
    • 0019612337
    • Speech recognition: Turning theory to practice
    • Doddington G. R., Schalk T. B., Speech recognition: turning theory to practice IEEE Spectrum 1981 18 9 26 32
    • (1981) IEEE Spectrum , vol.18 , Issue.9 , pp. 26-32
    • Doddington, G.R.1    Schalk, T.B.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.