SCOPUS Information Search Platform

Speech Communication

Volume 50, Issue 2, 2008, Pages 142-152

A new perceptually motivated MVDR-based acoustic front-end (PMVDR) for robust automatic speech recognition

Authors (2): Yapanel, Umit H. (a); Hansen, John H. L. (a)

(a) The University of Texas at Dallas (United States)

Author keywords

Acoustic feature extraction; Noise robustness analysis; Robust speech recognition

Indexed keywords

ALGORITHMS; ERROR ANALYSIS; FEATURE EXTRACTION; ROBUST CONTROL; SENSORY PERCEPTION; VOCABULARY CONTROL;

AUTOMATIC SPEECH RECOGNITION (ASR); MINIMUM VARIANCE DISTORTIONLESS RESPONSE (MVDR); WORD ERROR RATE (WER);

SPEECH RECOGNITION;

EID: 37649022051 PISSN: 01676393 EISSN: None Source Type: Journal
DOI: 10.1016/j.specom.2007.07.006 Document Type: Article

Times cited : (87)

References (36)

1
- 0030165438
- Language accent classification in American English
- Arslan L.M., and Hansen J.H.L. Language accent classification in American English. Speech Comm. 18 4 (1996) 353-367
- (1996) Speech Comm. , vol.18 , Issue.4 , pp. 353-367
- Arslan, L.M.¹ Hansen, J.H.L.²

2
- 0034229795
- A comparative study of traditional and newly proposed features for recognition of speech under stress
- Bou-Ghazale S.E., and Hansen J.H.L. A comparative study of traditional and newly proposed features for recognition of speech under stress. IEEE Trans. Speech and Audio Processing 8 (2000) 429-442
- (2000) IEEE Trans. Speech and Audio Processing , vol.8 , pp. 429-442
- Bou-Ghazale, S.E.¹ Hansen, J.H.L.²

3
- 37649016122
- CSLR, 2004. http://cslr.colorado.edu.

4
- 37649005890
- CU-Move, 2004. http://cumove.colorado.edu (also at http://crss.utdallas.edu).

5
- 0019053271
- Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences
- Davis S.B., and Mermelstein P. Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences. IEEE Trans. Acoustic Speech and Signal Processing 28 (1980) 357-366
- (1980) IEEE Trans. Acoustic Speech and Signal Processing , vol.28 , pp. 357-366
- Davis, S.B.¹ Mermelstein, P.²

6
- 0034842452
- Dharanipragada, S., Rao, B.D., 2001. MVDR-based Feature Extraction for Robust Speech Recognition, IEEE ICASSP-01: Inter. Conf. Acoust. Speech, Sig. Proc., pp. 3009-12, Salt Lake City, Utah.

7
- 0026106454
- Discrete all-pole modeling
- El-Jaroudi A., and Makhoul J. Discrete all-pole modeling. IEEE Trans. Signal Process. 39 (1991) 411-423
- (1991) IEEE Trans. Signal Process. , vol.39 , pp. 411-423
- El-Jaroudi, A.¹ Makhoul, J.²

8
- 85009085047
- Gu, L., Rose, K., 2001. Split-band perceptual harmonic cepstral coefficients as acoustic features for speech recognition, ISCA Interspeech-01/EUROSPEECH-01, Aalborg, Denmark, pp. 583-586.

9
- 0030283741
- Analysis and compensation of speech under stress and noise for environmental robustness in speech recognition
- Hansen J.H.L. Analysis and compensation of speech under stress and noise for environmental robustness in speech recognition. Speech Communication, Special Issue on Speech Under Stress 20 2 (1996) 151-170
- (1996) Speech Communication, Special Issue on Speech Under Stress , vol.20 , Issue.2 , pp. 151-170
- Hansen, J.H.L.¹

10
- 33745220319
- Hansen, J.H.L., Angkititrakul, P., Plucienkowski, J., Gallant, S., Yapanel, U., Pellom, B., Ward, W., Cole, R., 2001a. CU-Move: Analysis & Corpus Development for Interactive In-vehicle Speech Systems, Interspeech-01/EUROSPEECH-01, Vol. 3, Aalborg, Denmark, pp. 2023-2026.

11
- 85009104903
- Hansen, J.H.L., Sarikaya, R., Yapanel, U., Pellom, B., 2001b. Robust Speech Recognition in Noise: An Evaluation using the SPINE Corpus, Interspeech-01/EUROSPEECH-01, Vol. 2, Aalborg, Denmark, pp. 905-908.

12
- 37649006484
- Hansen, J.H.L., Bou-Ghazale, S.E., 1997. Getting Started with SUSAS: A Speech Under Simulated and Actual Stress database, ISCA EUROSPEECH-97, Rhodes, Greece, pp. 1743-1746.

13
- 0003807773
- Prentice-Hall, Englewood Cliffs, NJ
- Haykin S. Adaptive Filter Theory (1991), Prentice-Hall, Englewood Cliffs, NJ
- (1991) Adaptive Filter Theory
- Haykin, S.¹

14
- 0025041264
- Perceptual Linear Prediction (PLP) Analysis of Speech
- Hermansky H. Perceptual Linear Prediction (PLP) Analysis of Speech. J. Acoustic. Soc. Am. 87 4 (1990) 1738-1752
- (1990) J. Acoustic. Soc. Am. , vol.87 , Issue.4 , pp. 1738-1752
- Hermansky, H.¹

15
- 0004056285
- Prentice-Hall PTR, Upper Saddle River, New Jersey
- Huang X., Acero A., and Hon H. Spoken Language Processing: A Guide to Theory, Algorithm and System Development (2001), Prentice-Hall PTR, Upper Saddle River, New Jersey
- (2001) Spoken Language Processing: A Guide to Theory, Algorithm and System Development
- Huang, X.¹ Acero, A.² Hon, H.³

16
- 37649025617
- Keystone, Colorado pp. 17-26
- Hunt M.J. Spectral Signal Processing for ASR Vol. 1 (1999), Keystone, Colorado pp. 17-26
- (1999) Spectral Signal Processing for ASR , vol.1
- Hunt, M.J.¹

17
- 0032677440
- Frequency-domain Spectral Envelope Estimation for Low Rate Coding of Speech
- Phoenix, Arizona
- Jelinek M., and Adoul J.P. Frequency-domain Spectral Envelope Estimation for Low Rate Coding of Speech. IEEE ICASSP-99: Inter. Conf. Acoust. Speech, Sig. Proc. (1999), Phoenix, Arizona 1818-1821
- (1999) IEEE ICASSP-99: Inter. Conf. Acoust. Speech, Sig. Proc. , pp. 1818-1821
- Jelinek, M.¹ Adoul, J.P.²

18
- 37649019116
- LDC-SUSAS, 2004, http://wave.ldc.upenn.edu/Catalog/CatalogEntry.jsp?catalogId=LDC99S78.

19
- 37649015862
- LDC-WSJ, 2004, http://wave.ldc.upenn.edu/Catalog/CatalogEntry.jsp?catalogId=LDC93S6A.

20
- 0016495091
- Linear prediction: a tutorial review
- Makhoul J. Linear prediction: a tutorial review. Proceedings of the IEEE 63 (1975) 561-580
- (1975) Proceedings of the IEEE , vol.63 , pp. 561-580
- Makhoul, J.¹

21
- 37649021734
- McDonough, J., Byrne, W., Luo, X., 1998. Speaker Normalization with All-pass Transforms, ISCA ICSLP-98: Internat. Conf. Spoken Lang. Proc., Sydney, Australia.

22
- 0000473547
- All-pole modeling of speech based on the minimum variance distortionless response spectrum
- Murthi M.N., and Rao B.D. All-pole modeling of speech based on the minimum variance distortionless response spectrum. IEEE Trans. Speech Audio Process. 8 3 (2000) 221-239
- (2000) IEEE Trans. Speech Audio Process. , vol.8 , Issue.3 , pp. 221-239
- Murthi, M.N.¹ Rao, B.D.²

23
- 0022136506
- Fast MLM power spectrum estimation from uniformly spaced correlations
- Musicus B.R. Fast MLM power spectrum estimation from uniformly spaced correlations. IEEE Trans. Acoustics Speech Signal Process. 33 (1985) 133-135
- (1985) IEEE Trans. Acoustics Speech Signal Process. , vol.33 , pp. 133-135
- Musicus, B.R.¹

24
- 37649019833
- NIST SPHERE Software Package, 2004, www.nist.gov.

25
- 0003513556
- Prentice-Hall, Englewood Cliffs, NJ
- Oppenheim A.V., and Schafer R.W. Discrete-time Signal Processing (1989), Prentice-Hall, Englewood Cliffs, NJ
- (1989) Discrete-time Signal Processing
- Oppenheim, A.V.¹ Schafer, R.W.²

26
- 37649004058
- Pellom, B., 2001. SONIC: The University of Colorado Continuous Speech Recognizer, TR-CSLR-2001-01, Boulder, Colorado.

27
- 0141591620
- Pellom, B., Hacioglu, K., 2003. Recent Improvements in the CU SONIC ASR System for Noisy Speech: The SPINE Task, IEEE ICASSP-03: Inter. Conf. Acoust. Speech, Sig. Proc., Hong Kong, pp. 4-7.

28
- 0001481529
- Bark and ERB bilinear transforms
- Smith J.O., and Abel J.S. Bark and ERB bilinear transforms. IEEE Trans. Speech Audio Process. 7 6 (1999) 697-708
- (1999) IEEE Trans. Speech Audio Process. , vol.7 , Issue.6 , pp. 697-708
- Smith, J.O.¹ Abel, J.S.²

29
- 37649002753
- Tokuda, K., Masuko, T., Kobayashi, T., Imai, S., 1994. Mel-generalized Cepstral Analysis-A Unified Approach to Speech Spectral Estimation, ISCA ICSLP-94: Inter. Conf. Spoken Lang. Proc., Yokohama, Japan, pp. 1043-1046.

30
- 85009198067
- Wolfel, M., McDonough, J., Waibel, A., 2003. Minimum Variance Distortionless Response on a Warped Frequency Scale, ISCA Interspeech-03/EUROSPEECH-03, Geneva, Switzerland, pp. 1021-1024.

31
- 37649017653
- Yapanel, U., 2005. Acoustic Modeling and Speaker Normalization Strategies with Application to Robust In-Vehicle Speech Recognition and Dialect Classification, PhD Thesis, Robust Speech Processing Group - CSLR, University of Colorado at Boulder.

32
- 85009266810
- Yapanel, U., Zhang, X., Hansen, J.H.L., 2002. High Performance Digit Recognition in Real Car Environments, ISCA Interspeech-02/ICSLP-02, Denver, Colorado, pp. 793-796.

33
- 85009164449
- Yapanel, U.H., Hansen, J.H.L., 2003. A New Perspective on Feature Extraction for Robust In-Vehicle Speech Recognition, ISCA Interspeech-03/EUROSPEECH-03, Geneva, Switzerland, pp. 1281-1284.

34
- 33646765432
- Yapanel, U.H., Hansen, J.H.L., 2005. Towards an Intelligent Acoustic Front-end for Automatic Speech Recognition: Built-in Speaker Normalization (BISN), IEEE ICASSP-05: Internat. Conf. Acoust. Speech, Sig. Proc., Philadelphia, USA.

35
- 0141702336
- Yapanel, U.H., Dharanipragada, S., 2003. Perceptual MVDR-Based Cepstral Coefficients (PMCCs) for Noise Robust Speech Recognition, IEEE ICASSP-03: Internat. Conf. Acoust. Speech, Sig. Proc., Hong Kong, pp. 644-647.

36
- 85009160132
- Yapanel, U.H., Dharanipragada, S., Hansen, J.H.L., 2003. Perceptual MVDR-Based Cepstral Coefficients (PMCCs) for High Accuracy Speech Recognition, ISCA Interspeech-03/EUROSPEECH-03, Geneva, Switzerland, pp. 1829-1832.

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.