



Volume 48, Issue 11, 2006, Pages 1447-1457

A feature extraction method using subband based periodicity and aperiodicity decomposition with noise robust frontend processing for automatic speech recognition

Author keywords

Aperiodicity; Noise robust frontend; Periodicity; Speech feature; Subband

Indexed keywords

DATABASE SYSTEMS; FREQUENCY DOMAIN ANALYSIS; PARAMETER ESTIMATION; ROBUSTNESS (CONTROL SYSTEMS); SPEECH ANALYSIS; SPEECH RECOGNITION

EID: 33750352847     PISSN: 01676393     EISSN: None     Source Type: Journal    
DOI: 10.1016/j.specom.2006.06.008     Document Type: Article
Times cited: 17

References (39)
  • 2
    • 0030037151
    • Cepstral representation of speech motivated by time-frequency masking: an application to speech recognition
    • Aikawa K., Singer H., Kawahara H., and Tohkura Y. Cepstral representation of speech motivated by time-frequency masking: an application to speech recognition. J. Acoust. Soc. Am. 100 (1996) 603-614
    • (1996) J. Acoust. Soc. Am. , vol.100 , pp. 603-614
    • Aikawa, K.1    Singer, H.2    Kawahara, H.3    Tohkura, Y.4
  • 3
    • 0036649309
    • Robust auditory-based speech processing using the average localized synchrony detection
    • Ali A.M., Spiegel J.V., and Mueller P. Robust auditory-based speech processing using the average localized synchrony detection. IEEE Trans. Speech Audio Process. 10 (2002) 279-292
    • (2002) IEEE Trans. Speech Audio Process. , vol.10 , pp. 279-292
    • Ali, A.M.1    Spiegel, J.V.2    Mueller, P.3
  • 4
    • 0016067897
    • Effectiveness of linear prediction characteristics of the speech wave for automatic speaker identification and verification
    • Atal B.S. Effectiveness of linear prediction characteristics of the speech wave for automatic speaker identification and verification. J. Acoust. Soc. Am. 55 (1974) 1304-1312
    • (1974) J. Acoust. Soc. Am. , vol.55 , pp. 1304-1312
    • Atal, B.S.1
  • 5
    • 4544310318
    • Can back-ends be more robust than front-ends? Investigation over the Aurora-2 database
    • Bernard A., Gong Y., and Cui X. Can back-ends be more robust than front-ends? Investigation over the Aurora-2 database. Proc. ICASSP 1 (2004) 1025-1028
    • (2004) Proc. ICASSP , vol.1 , pp. 1025-1028
    • Bernard, A.1    Gong, Y.2    Cui, X.3
  • 6
    • 0018320733
    • Enhancement of speech corrupted by acoustical noise
    • Berouti M., Schwartz R., and Makhoul J. Enhancement of speech corrupted by acoustical noise. Proc. ICASSP (1979) 208-211
    • (1979) Proc. ICASSP , pp. 208-211
    • Berouti, M.1    Schwartz, R.2    Makhoul, J.3
  • 7
    • 0018455310
    • Suppression of acoustic noise in speech using spectral subtraction
    • Boll S. Suppression of acoustic noise in speech using spectral subtraction. IEEE Trans. Acoust. Speech Signal Process. ASSP-27 (1979) 113-120
    • (1979) IEEE Trans. Acoust. Speech Signal Process. , vol.ASSP-27 , pp. 113-120
    • Boll, S.1
  • 8
    • 85009265586
    • Frontend post-processing and backend model enhancement on the AURORA 2.0/3.0 databases
    • Chen C.P., Filali K., and Bilmes J.A. Frontend post-processing and backend model enhancement on the AURORA 2.0/3.0 databases. Proc. ICSLP (2002) 241-244
    • (2002) Proc. ICSLP , pp. 241-244
    • Chen, C.P.1    Filali, K.2    Bilmes, J.A.3
  • 9
    • 0019053271
    • Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences
    • Davis S.B., and Mermelstein P. Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences. IEEE Trans. Acoust. Speech Signal Process. ASSP-28 (1980) 357-366
    • (1980) IEEE Trans. Acoust. Speech Signal Process. , vol.ASSP-28 , pp. 357-366
    • Davis, S.B.1    Mermelstein, P.2
  • 10
    • 0030962746
    • Concurrent vowel identification. II. Effects of phase, harmonicity, and task
    • de Cheveigné A., McAdams S., and Marin C.M.H. Concurrent vowel identification. II. Effects of phase, harmonicity, and task. J. Acoust. Soc. Am. 101 (1997) 2848-2856
    • (1997) J. Acoust. Soc. Am. , vol.101 , pp. 2848-2856
    • de Cheveigné, A.1    McAdams, S.2    Marin, C.M.H.3
  • 11
    • 33750327305
    • ETSI standard document, 2003. Speech processing, transmission and quality aspects (STQ); Distributed speech recognition; Advanced front-end feature extraction algorithm; Compression algorithms. ETSI ES 202 050 v1.1.3.
  • 12
    • 85135375893
    • HMM recognition in noise using parallel model combination
    • Gales M., and Young S. HMM recognition in noise using parallel model combination. Proc. Eurospeech (1993) 837-840
    • (1993) Proc. Eurospeech , pp. 837-840
    • Gales, M.1    Young, S.2
  • 14
    • 84928838192
    • Temporal non-place information in the auditory nerve firing patterns as a front-end for speech recognition in a noisy environment
    • Ghitza O. Temporal non-place information in the auditory nerve firing patterns as a front-end for speech recognition in a noisy environment. J. Phonetics 16 (1988) 109-124
    • (1988) J. Phonetics , vol.16 , pp. 109-124
    • Ghitza, O.1
  • 15
    • 0029288202
    • Speech recognition in noisy environments: a survey
    • Gong Y. Speech recognition in noisy environments: a survey. Speech Comm. 16 (1995) 261-291
    • (1995) Speech Comm. , vol.16 , pp. 261-291
    • Gong, Y.1
  • 16
    • 33745742672
    • Speech processing in the auditory system: an overview
    • Greenberg S., Ainsworth W.A., Popper A.N., and Fay R.R. (Eds), Springer-Verlag, New York
    • Greenberg S. Speech processing in the auditory system: an overview. In: Greenberg S., Ainsworth W.A., Popper A.N., and Fay R.R. (Eds). Speech Processing in the Auditory System (2004), Springer-Verlag, New York
    • (2004) Speech Processing in the Auditory System
    • Greenberg, S.1
  • 17
    • 0025041264
    • Perceptual linear predictive (PLP) analysis of speech
    • Hermansky H. Perceptual linear predictive (PLP) analysis of speech. J. Acoust. Soc. Am. 87 (1990) 1738-1752
    • (1990) J. Acoust. Soc. Am. , vol.87 , pp. 1738-1752
    • Hermansky, H.1
  • 19
    • 33750311358
    • Hirsch, H.G., Pearce, D., 2000. The AURORA experimental framework for the performance evaluation of speech recognition systems under noisy conditions. In: Proc. of the ISCA Tutorial and Research Workshop on Automatic Speech Recognition (ISCA ITRW ASR), pp. 181-188.
  • 20
    • 4544250680
    • Speech feature extraction method representing periodicity and aperiodicity in sub bands for robust speech recognition
    • Ishizuka K., and Miyazaki N. Speech feature extraction method representing periodicity and aperiodicity in sub bands for robust speech recognition. Proc. ICASSP 1 (2004) 141-144
    • (2004) Proc. ICASSP , vol.1 , pp. 141-144
    • Ishizuka, K.1    Miyazaki, N.2
  • 21
    • 33750352886
    • Improvement in robustness of speech feature extraction method using sub-band based periodicity and aperiodicity decomposition
    • Ishizuka K., Miyazaki N., Nakatani T., and Minami Y. Improvement in robustness of speech feature extraction method using sub-band based periodicity and aperiodicity decomposition. Proc. ICSLP 2 (2004) 937-940
    • (2004) Proc. ICSLP , vol.2 , pp. 937-940
    • Ishizuka, K.1    Miyazaki, N.2    Nakatani, T.3    Minami, Y.4
  • 22
    • 0016467604
    • Minimum prediction residual principle applied to speech recognition
    • Itakura F. Minimum prediction residual principle applied to speech recognition. IEEE Trans. Acoust. Speech Signal Process. ASSP-23 (1975) 67-72
    • (1975) IEEE Trans. Acoust. Speech Signal Process. , vol.ASSP-23 , pp. 67-72
    • Itakura, F.1
  • 24
    • 0028996914
    • Robust feature extraction using SBCOR analysis
    • Kajita S., and Itakura F. Robust feature extraction using SBCOR analysis. Proc. ICASSP (1995) 421-424
    • (1995) Proc. ICASSP , pp. 421-424
    • Kajita, S.1    Itakura, F.2
  • 25
    • 0032785783
    • Auditory processing of speech signals for robust speech recognition in real-world noisy environments
    • Kim D.S., Lee S.Y., and Kil R.M. Auditory processing of speech signals for robust speech recognition in real-world noisy environments. IEEE Trans. Speech Audio Process. 7 (1999) 55-69
    • (1999) IEEE Trans. Speech Audio Process. , vol.7 , pp. 55-69
    • Kim, D.S.1    Lee, S.Y.2    Kil, R.M.3
  • 26
    • 85009115888
    • An auditory system-based feature for robust speech recognition
    • Li Q., Soong F.K., and Siohan O. An auditory system-based feature for robust speech recognition. Proc. Eurospeech (2001) 619-621
    • (2001) Proc. Eurospeech , pp. 619-621
    • Li, Q.1    Soong, F.K.2    Siohan, O.3
  • 28
    • 0026882842
    • Experiments with a nonlinear spectral subtractor (NSS), hidden Markov models and the projection, for robust speech recognition in cars
    • Lockwood P., and Boudy J. Experiments with a nonlinear spectral subtractor (NSS), hidden Markov models and the projection, for robust speech recognition in cars. Speech Comm. 11 (1992) 215-228
    • (1992) Speech Comm. , vol.11 , pp. 215-228
    • Lockwood, P.1    Boudy, J.2
  • 30
    • 4544222091
    • Blind equalization in the cepstral domain for robust telephone based speech recognition
    • Mauuary L. Blind equalization in the cepstral domain for robust telephone based speech recognition. Proc. EUSIPCO 1 (1998) 359-362
    • (1998) Proc. EUSIPCO , vol.1 , pp. 359-362
    • Mauuary, L.1
  • 31
    • 0028996915
    • A maximum likelihood procedure for a universal adaptation method
    • Minami Y., and Furui S. A maximum likelihood procedure for a universal adaptation method. Proc. ICASSP (1995) 129-132
    • (1995) Proc. ICASSP , pp. 129-132
    • Minami, Y.1    Furui, S.2
  • 32
    • 0020816083
    • Suggested formula for calculating auditory-filter bandwidths and excitation patterns
    • Moore B.C.J., and Glasberg B.R. Suggested formula for calculating auditory-filter bandwidths and excitation patterns. J. Acoust. Soc. Am. 74 (1983) 750-753
    • (1983) J. Acoust. Soc. Am. , vol.74 , pp. 750-753
    • Moore, B.C.J.1    Glasberg, B.R.2
  • 33
    • 33646799204
    • Nakamura, S., Yamamoto, K., Takeda, K., Kuroiwa, S., Kitaoka, N., Yamada, T., Mizumachi, M., Nishiura, T., Fujimoto, M., Saso, A., Endo, T., 2003. Data collection and evaluation of AURORA-2 Japanese corpus. In: Proc. of the 8th IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU), pp. 619-623.
  • 35
    • 0016938506
    • Auditory filter shapes derived with noise stimuli
    • Patterson R.D. Auditory filter shapes derived with noise stimuli. J. Acoust. Soc. Am. 59 (1976) 640-654
    • (1976) J. Acoust. Soc. Am. , vol.59 , pp. 640-654
    • Patterson, R.D.1
  • 36
    • 0001050571
    • Auditory filters and excitation patterns as representations of frequency resolution
    • Moore B.C.J. (Ed), Academic Press, London
    • Patterson R.D., and Moore B.C.J. Auditory filters and excitation patterns as representations of frequency resolution. In: Moore B.C.J. (Ed). Frequency Selectivity in Hearing (1986), Academic Press, London 23-177
    • (1986) Frequency Selectivity in Hearing , pp. 23-177
    • Patterson, R.D.1    Moore, B.C.J.2
  • 37
    • 84987702417
    • The AURORA experimental framework for the performance evaluation of speech recognition systems under noise conditions
    • Pearce D., and Hirsch H.G. The AURORA experimental framework for the performance evaluation of speech recognition systems under noise conditions. Proc. ICSLP 4 (2000) 29-32
    • (2000) Proc. ICSLP , vol.4 , pp. 29-32
    • Pearce, D.1    Hirsch, H.G.2
  • 38
    • 0017367712
    • On the use of autocorrelation analysis for pitch detection
    • Rabiner L.R. On the use of autocorrelation analysis for pitch detection. IEEE Trans. Acoust. Speech Signal Process. 25 (1977) 24-33
    • (1977) IEEE Trans. Acoust. Speech Signal Process. , vol.25 , pp. 24-33
    • Rabiner, L.R.1
  • 39
    • 84928837806
    • A joint synchrony/mean-rate model of auditory speech processing
    • Seneff S. A joint synchrony/mean-rate model of auditory speech processing. J. Phonetics 16 (1988) 55-76
    • (1988) J. Phonetics , vol.16 , pp. 55-76
    • Seneff, S.1


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.