Volume 49, Issue 6, 2007, Pages 514-525

Using multiple acoustic feature sets for speech recognition

Author keywords

Acoustic feature extraction; Articulatory features; Auditory features; Discriminative model combination; Linear discriminant analysis; Spectrum derivative feature; Voicing

Indexed keywords

DISCRIMINANT ANALYSIS; FEATURE EXTRACTION; SPECTRUM ANALYSIS; SPEECH RECOGNITION;

EID: 34250015828     PISSN: 01676393     EISSN: None     Source Type: Journal    
DOI: 10.1016/j.specom.2007.04.005     Document Type: Article
Times cited : (25)

References (29)
  • 1
    • 0016071180
    • A spectral-flatness measure for studying the autocorrelation method of linear prediction of speech analysis
    • Gray A.H. Jr., and Markel J.D. A spectral-flatness measure for studying the autocorrelation method of linear prediction of speech analysis. IEEE Trans. Acoust. Speech Signal Process. 22 3 (1974) 207-217
    • (1974) IEEE Trans. Acoust. Speech Signal Process. , vol.22 , Issue.3 , pp. 207-217
    • Gray Jr., A.H.1    Markel, J.D.2
  • 2
    • 0016962193
    • A pattern recognition approach to voiced-unvoiced-silence classification with application to speech recognition
    • Atal B.S., and Rabiner L.R. A pattern recognition approach to voiced-unvoiced-silence classification with application to speech recognition. IEEE Trans. Acoust. Speech Signal Process. 24 3 (1976) 201-212
    • (1976) IEEE Trans. Acoust. Speech Signal Process. , vol.24 , Issue.3 , pp. 201-212
    • Atal, B.S.1    Rabiner, L.R.2
  • 3
    • 0030635572
    • Beyerlein, P., 1997. Discriminative model combination. In: Proc. IEEE Automatic Speech Recognition and Understanding Workshop. Santa Barbara, CA, pp. 238-245.
  • 4
    • 0031625499
    • Beyerlein, P., 1998. Discriminative model combination. In: Proc. IEEE Int. Conf. on Acoustics, Speech, and Signal Processing, Vol. 1. Seattle, WA, pp. 481-484.
  • 5
    • 34249983733
    • Beyerlein, P., 2000. Diskriminative Modellkombination in Spracherkennungssystemen mit grossem Wortschatz. Ph.D. thesis, RWTH Aachen University.
  • 6
    • 0019053271
    • Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences
    • Davis S., and Mermelstein P. Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences. IEEE Trans. Acoust. Speech Signal Process. 28 4 (1980) 357-366
    • (1980) IEEE Trans. Acoust. Speech Signal Process. , vol.28 , Issue.4 , pp. 357-366
    • Davis, S.1    Mermelstein, P.2
  • 7
    • 4544351495
    • Graciarena, M., Franco, H., Zheng, J., Vergyri, D., Stolcke, A., 2004. Voicing feature integration in SRI's Decipher LVCSR system. In: Proc. IEEE Int. Conf. on Acoustics, Speech, and Signal Processing, Vol. 1. Montreal, Canada, pp. 921-924.
  • 8
    • 0034841228
    • Gu, L., Rose, K., 2001. Perceptual harmonic cepstral coefficients for speech recognition in noisy environment. In: Proc. IEEE Int. Conf. on Acoustics, Speech, and Signal Processing. Salt Lake City, UT, pp. 125-128.
  • 9
    • 34249996907
    • Häb-Umbach, R., Loog, M., 1999. An investigation of cepstral parameterisations for large vocabulary speech recognition. In: Proc. European Conf. on Speech Communication and Technology, Vol. 3. Budapest, Hungary, pp. 1323-1326.
  • 10
    • 85017287487
    • Häb-Umbach, R., Ney, H., 1992. Linear discriminant analysis for improved large vocabulary continuous speech recognition. In: Proc. IEEE Int. Conf. on Acoustics, Speech, and Signal Processing, Vol. 1. San Francisco, CA, pp. 13-16.
  • 11
    • 33646756506
    • Hegde, R.M., Murthy, H.A., Gadde, V., 2005. Speech processing using joint features derived from the modified group delay function. In: Proc. IEEE Int. Conf. on Acoustics, Speech, and Signal Processing, Vol. 1. Philadelphia, PA, pp. 541-544.
  • 12
    • 0025041264
    • Perceptual linear predictive (PLP) analysis of speech
    • Hermansky H. Perceptual linear predictive (PLP) analysis of speech. J. Acoust. Soc. Amer. 87 4 (1990) 1738-1752
    • (1990) J. Acoust. Soc. Amer. , vol.87 , Issue.4 , pp. 1738-1752
    • Hermansky, H.1
  • 13
    • 34249988683
    • Holmes, J.N., Holmes, W.J., Garner, P.N., 1997. Using formant frequencies in speech recognition. In: Proc. European Conf. on Speech Communication and Technology, Vol. 4. Rhodes, Greece, pp. 2083-2086.
  • 14
    • 4544250680
    • Ishizuka, K., Miyazaki, N., 2004. Speech feature extraction method representing periodicity and aperiodicity in sub bands for robust speech recognition. In: Proc. IEEE Int. Conf. on Acoustics, Speech, and Signal Processing, Vol. 1. Montreal, Canada, pp. 141-144.
  • 15
    • 33745194599
    • Kocharov, D., Zolnay, A., Schlüter, R., Ney, H., 2005. Articulatory motivated acoustic features for speech recognition. In: Proc. European Conf. on Speech Communication and Technology, Vol. 2. Lisboa, Portugal, pp. 1101-1104.
  • 16
    • 0029747183
    • Lee, L., Rose, R., 1996. Speaker normalization using efficient frequency warping procedures. In: Proc. IEEE Int. Conf. on Acoustics, Speech, and Signal Processing, Vol. 1. Atlanta, GA, pp. 353-356.
  • 17
    • 0034817674
    • Frequency and time filtering of filter-bank energies for robust HMM speech recognition
    • Nadeu C., Macho D., and Hernando J. Frequency and time filtering of filter-bank energies for robust HMM speech recognition. Speech Comm. 34 (2001) 93-114
    • (2001) Speech Comm. , vol.34 , pp. 93-114
    • Nadeu, C.1    Macho, D.2    Hernando, J.3
  • 18
    • 34250016428
    • Paliwal, K., 1999. Decorrelated and liftered filter-bank energies for robust speech recognition. In: Proc. European Conf. on Speech Communication and Technology. Budapest, Hungary, pp. 85-88.
  • 19
    • 13544259544
    • On the usefulness of STFT phase spectrum in human listening tests
    • Paliwal K., and Alsteris L. On the usefulness of STFT phase spectrum in human listening tests. Speech Comm. 45 (2005) 153-170
    • (2005) Speech Comm. , vol.45 , pp. 153-170
    • Paliwal, K.1    Alsteris, L.2
  • 20
    • 34249996594
    • Rabiner, L.R., Schafer, R.W., 1979. Digital Processing of Speech Signals, Prentice-Hall Signal Processing Series, Englewood Cliffs, NJ.
  • 21
    • 0034843163
    • Schlüter, R., Ney, H., 2001. Using phase spectrum information for improved speech recognition performance. In: Proc. IEEE Int. Conf. on Acoustics, Speech, and Signal Processing, Vol. 1. Salt Lake City, UT, pp. 133-136.
  • 22
    • 0031643033
    • Thomson, D., Chengalvarayan, R., 1998. Use of periodicity and jitter as speech recognition feature. In: Proc. IEEE Int. Conf. on Acoustics, Speech, and Signal Processing, Vol. 1. Seattle, WA, pp. 21-24.
  • 23
    • 17544404143
    • Tolba, H., Selouani, S.A., O'Shaughnessy, D., 2002. Auditory-based acoustic distinctive features and spectral cues for automatic speech recognition using a multi-stream paradigm. In: Proc. IEEE Int. Conf. on Acoustics, Speech, and Signal Processing, Vol. 1. Orlando, FL, pp. 837-840.
  • 24
    • 0017482612
    • Wakita, H., 1977. Normalization of vowels by vocal tract length and its application to vowel identification. In: Proc. IEEE Int. Conf. on Acoustics, Speech, and Signal Processing, Vol. ASSP-25. pp. 183-192.
  • 25
    • 0029746535
    • Welling, L., Ney, H., 1996. A model for efficient formant estimation. In: Proc. IEEE Int. Conf. on Acoustics, Speech, and Signal Processing, Vol. 2. Atlanta, GA, pp. 797-800.
  • 26
    • 0036753897
    • Speaker adaptive modeling by vocal tract normalization
    • Welling L., Kanthak S., and Ney H. Speaker adaptive modeling by vocal tract normalization. IEEE Trans. Speech Audio Process. 10 6 (2002) 415-426
    • (2002) IEEE Trans. Speech Audio Process. , vol.10 , Issue.6 , pp. 415-426
    • Welling, L.1    Kanthak, S.2    Ney, H.3
  • 27
    • 0030643667
    • Woodland, P., Gales, M., Pye, D., Young, S., 1997. Broadcast news transcription using HTK. In: Proc. IEEE Int. Conf. on Acoustics, Speech, and Signal Processing, Vol. 2. Munich, Germany, pp. 719-722.
  • 28
    • 85009247781
    • Zolnay, A., Schlüter, R., Ney, H., 2002. Robust speech recognition using a voiced-unvoiced feature. In: Proc. Int. Conf. on Spoken Language Processing, Vol. 2. Denver, CO, pp. 1065-1068.
  • 29
    • 85009188485
    • Zolnay, A., Schlüter, R., Ney, H., 2003. Extraction methods of voicing feature for robust speech recognition. In: Proc. European Conf. on Speech Communication and Technology, Vol. 1. Geneva, Switzerland, pp. 497-500.


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.