



Volume 15, Issue 1, 2007, Pages 257-270

MVA processing of speech features

Author keywords

ARMA filter; Aurora 2.0; Aurora 3.0; Feature extraction; Front end processing; Mean subtraction; MFCC; Noise robustness; RASTA; Speech recognition; Temporal smoothing; Variance normalization

Indexed keywords

ARMA FILTER; AURORA 2.0; AURORA 3.0; FRONT END PROCESSING; MEAN SUBTRACTION; MFCC; NOISE ROBUSTNESS; RASTA; TEMPORAL SMOOTHING; VARIANCE NORMALIZATION;

EID: 42549139762     PISSN: 15587916     EISSN: None     Source Type: Journal    
DOI: 10.1109/TASL.2006.876717     Document Type: Article
Times cited : (182)

References (50)
  • 2
    • 64149096060
    • What HMMs can do
    • J. Bilmes, What HMMs can do, Dept. of Elect. Eng., Univ. Washington, Seattle, WA, Tech. Rep. UWEETR-2002-003, 2002 [Online]. Available: www.ee.washington.edu/techsite/papers26
    • (2002)
    • Bilmes, J.1
  • 3
    • 0030245363
    • From HMMs to segment models: A unified view of stochastic modeling for speech recognition
    • M. Ostendorf, V. Digalakis, and O. Kimball, "From HMMs to segment models: a unified view of stochastic modeling for speech recognition," IEEE Trans. Speech Audio Process., vol. 4, no. 5, pp. 360-378, Sep. 1996.
    • (1996) IEEE Trans. Speech Audio Process , vol.4 , Issue.5 , pp. 360-378
    • Ostendorf, M.1    Digalakis, V.2    Kimball, O.3
  • 4
    • 0003805597
    • The use of context in large vocabulary speech recognition
    • J. J. Odell, "The use of context in large vocabulary speech recognition," Ph.D. dissertation, Univ. Cambridge, Cambridge, U.K., 1995.
    • (1995)
    • Odell, J.J.1
  • 5
    • 64149101768
    • Cepstral mean compensation for HMM recognition in noise
    • S. Young, "Cepstral mean compensation for HMM recognition in noise," in Proc. ESCA Workshop on Speech Processing in Adverse Conditions, Cannes-Mandelieu, France, 1992, pp. 123-126.
    • (1992) Proc. ESCA Workshop on Speech Processing in Adverse Conditions , pp. 123-126
    • Young, S.1
  • 7
    • 0009578471
    • Multi-Microphone Correlation-Based Processing for Robust Automatic Speech Recognition
    • T. M. Sullivan, "Multi-Microphone Correlation-Based Processing for Robust Automatic Speech Recognition," Ph.D. dissertation, Carnegie Mellon Univ., Pittsburgh, PA, 1996.
    • (1996)
    • Sullivan, T.M.1
  • 10
    • 0030635418
    • Joint distributional modeling with cross-correlation based features
    • J. A. Bilmes, "Joint distributional modeling with cross-correlation based features," in Proc. IEEE ASRU Workshop, 1997, pp. 148-155.
    • (1997) Proc. IEEE ASRU Workshop , pp. 148-155
    • Bilmes, J.A.1
  • 11
    • 0018455310
    • Suppression of acoustic noise in speech using spectral subtraction
    • S. Boll, "Suppression of acoustic noise in speech using spectral subtraction," IEEE Trans. Acoust., Speech, Signal Process., vol. ASSP-27, no. 2, pp. 113-120, 1979.
    • (1979) IEEE Trans. Acoust., Speech, Signal Process , vol.ASSP-27 , Issue.2 , pp. 113-120
    • Boll, S.1
  • 12
    • 0016067897
    • Effectiveness of linear prediction characteristics of the speech wave for automatic speaker identification and verification
    • B. S. Atal, "Effectiveness of linear prediction characteristics of the speech wave for automatic speaker identification and verification," J. Acoust. Soc. Amer., vol. 55, pp. 1304-1312, 1974.
    • (1974) J. Acoust. Soc. Amer , vol.55 , pp. 1304-1312
    • Atal, B.S.1
  • 13
    • 0019555090
    • Cepstral analysis technique for automatic speaker verification
    • S. Furui, "Cepstral analysis technique for automatic speaker verification," IEEE Trans. Acoust., Speech, Signal Process., vol. 29, no. 2, pp. 254-272, Apr. 1981.
    • (1981) IEEE Trans. Acoust., Speech, Signal Process , vol.29 , Issue.2 , pp. 254-272
    • Furui, S.1
  • 15
    • 0026925484
    • Hidden Markov models with first-order equalization for noisy speech recognition
    • B.-H. Juang and K. K. Paliwal, "Hidden Markov models with first-order equalization for noisy speech recognition," IEEE Trans. Signal Process., vol. 40, no. 9, pp. 2136-2143, Sep. 1992.
    • (1992) IEEE Trans. Signal Process , vol.40 , Issue.9 , pp. 2136-2143
    • Juang, B.-H.1    Paliwal, K.K.2
  • 16
    • 0030149866
    • A maximum-likelihood approach to stochastic matching for robust speech recognition
    • A. Sankar and C.-H. Lee, "A maximum-likelihood approach to stochastic matching for robust speech recognition," IEEE Trans. Speech Audio Process., vol. 4, no. 3, pp. 190-202, May 1996.
    • (1996) IEEE Trans. Speech Audio Process , vol.4 , Issue.3 , pp. 190-202
    • Sankar, A.1    Lee, C.-H.2
  • 19
    • 0003940203
    • The Generation and Use of Regression Class Trees for MLLR Adaptation
    • M. J. F. Gales, The Generation and Use of Regression Class Trees for MLLR Adaptation, Dept. Eng., Univ. Cambridge, Tech. Rep. CUED/F-INFENG/TR263, 1996.
    • (1996)
    • Gales, M.J.F.1
  • 20
    • 0019053271
    • Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences
    • S. B. Davis and P. Mermelstein, "Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences," IEEE Trans. Acoust., Speech, Signal Process., vol. 28, no. 4, pp. 357-366, Aug. 1980.
    • (1980) IEEE Trans. Acoust., Speech, Signal Process , vol.28 , Issue.4 , pp. 357-366
    • Davis, S.B.1    Mermelstein, P.2
  • 24
    • 0030638031
    • A post-processing system to yield reduced word error rates: Recognizer Output Voting Error Reduction (ROVER)
    • J. G. Fiscus, "A post-processing system to yield reduced word error rates: Recognizer Output Voting Error Reduction (ROVER)," in Proceedings of IEEE Workshop on Automatic Speech Recognition and Understanding, Santa Barbara, CA, 1997.
    • (1997) Proceedings of IEEE Workshop on Automatic Speech Recognition and Understanding
    • Fiscus, J.G.1
  • 30
    • 85009106519
    • Robust ASR based on clean speech models: An evaluation of missing data techniques for connected digit recognition in noise
    • J. Barker, M. Cooke, and P. Green, "Robust ASR based on clean speech models: An evaluation of missing data techniques for connected digit recognition in noise," in European Conf. Speech Communication and Technology (EuroSpeech), 2001, pp. 213-216.
    • (2001) European Conf. Speech Communication and Technology (EuroSpeech) , pp. 213-216
    • Barker, J.1    Cooke, M.2    Green, P.3
  • 34
    • 85009265586
    • Frontend post-processing and backend model enhancement on the Aurora 2.0/3.0 databases
    • C.-P. Chen, K. Filali, and J. Bilmes, "Frontend post-processing and backend model enhancement on the Aurora 2.0/3.0 databases," in Proc. Int. Conf. Spoken Lang. Process. (ICSLP), 2002, pp. 241-244.
    • (2002) Proc. Int. Conf. Spoken Lang. Process. (ICSLP) , pp. 241-244
    • Chen, C.-P.1    Filali, K.2    Bilmes, J.3
  • 36
    • 0027465491
    • The Lombard reflex and its role on human listeners and automatic speech recognizers
    • J. C. Junqua, "The Lombard reflex and its role on human listeners and automatic speech recognizers," J. Acoust. Soc. Amer. (JASA), vol. 91, no. 1, pp. 510-524, Jan. 1993.
    • (1993) J. Acoust. Soc. Amer. (JASA) , vol.91 , Issue.1 , pp. 510-524
    • Junqua, J.C.1
  • 37
    • 0032676337
    • On the relative importance of various components of the modulation spectrum for automatic speech recognition
    • N. Kanedera, T. Arai, H. Hermansky, and M. Pavel, "On the relative importance of various components of the modulation spectrum for automatic speech recognition," Speech Commun., vol. 28, no. 1, pp. 43-55, 1999.
    • (1999) Speech Commun , vol.28 , Issue.1 , pp. 43-55
    • Kanedera, N.1    Arai, T.2    Hermansky, H.3    Pavel, M.4
  • 40
    • 0027957839
    • Effect of temporal envelope smearing on speech reception
    • R. Drullman, J. M. Festen, and R. Plomp, "Effect of temporal envelope smearing on speech reception," J. Acoust. Soc. Amer. (JASA), vol. 95, no. 2, pp. 1053-1064, Feb. 1994.
    • (1994) J. Acoust. Soc. Amer. (JASA) , vol.95 , Issue.2 , pp. 1053-1064
    • Drullman, R.1    Festen, J.M.2    Plomp, R.3
  • 43
    • 0002788784
    • Signal processing for robust speech recognition
    • R. M. Stern, A. Acero, F.-H. Liu, and Y. Ohshima, "Signal processing for robust speech recognition," in Speech Recognit., C.-H. Lee and F. Soong, Eds. Boston, MA: Kluwer, 1996, pp. 351-378.
    • (1996) Speech Recognit , pp. 351-378
    • Stern, R.M.1    Acero, A.2    Liu, F.-H.3    Ohshima, Y.4
  • 44
    • 0003434858
    • Perceptually inspired signal-processing strategies for robust speech recognition in reverberant environments
    • B. E. D. Kingsbury, "Perceptually inspired signal-processing strategies for robust speech recognition in reverberant environments," Ph.D. dissertation, Univ. California, Berkeley, 1998.
    • (1998)
    • Kingsbury, B.E.D.1
  • 46
    • 84873312246
    • A review of the MTF concept in room acoustics and its use for estimating speech intelligibility
    • T. Houtgast and H. J. M. Steeneken, "A review of the MTF concept in room acoustics and its use for estimating speech intelligibility," J. Acoust. Soc. Amer. (JASA), vol. 77, no. 3, pp. 1069-1077, Mar. 1985.
    • (1985) J. Acoust. Soc. Amer. (JASA) , vol.77 , Issue.3 , pp. 1069-1077
    • Houtgast, T.1    Steeneken, H.J.M.2
  • 47
    • 0038669544
    • The AURORA experimental framework for the performance evaluations of speech recognition systems under noisy conditions
    • H. G. Hirsch and D. Pearce, "The AURORA experimental framework for the performance evaluations of speech recognition systems under noisy conditions," in ISCA ITRW ASR 2000, Sep. 2000.
    • (2000) ISCA ITRW ASR 2000
    • Hirsch, H.G.1    Pearce, D.2
  • 48
    • 64149119352
    • Motorola Au/374/01, Small Vocabulary Evaluation: Baseline mel-cepstrum Performances With Speech Endpoints, Oct. 2001.
  • 49
    • 64149116336
    • Blind MVA speech feature processing on Aurora 2.0
    • C.-P. Chen, J. Bilmes, and D. Ellis, Blind MVA speech feature processing on Aurora 2.0, Dept. Elect. Eng., Univ. Washington, Seattle, WA, Tech. Rep. UWEETR-2004-0017, 2004 [Online]. Available: http://www.ee.washington.edu/techsite/papers
    • (2004)
    • Chen, C.-P.1    Bilmes, J.2    Ellis, D.3
  • 50
    • 64149109407
    • MVA processing of speech features
    • C.-P. Chen and J. Bilmes, MVA processing of speech features, Dept. Elect. Eng., Univ. Washington, Seattle, WA, Tech. Rep. UWEETR-2003-0024, 2003 [Online]. Available: http://www.ee.washington.edu/techsite/papers
    • (2003)
    • Chen, C.-P.1    Bilmes, J.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.