SCOPUS Information Retrieval Platform

Volume 48, Issue 11, 2006, Pages 1447-1457

A feature extraction method using subband based periodicity and aperiodicity decomposition with noise robust frontend processing for automatic speech recognition

(2) Ishizuka, Kentaro ᵃ; Nakatani, Tomohiro ᵃ

ᵃ Japan (Japan)

Author keywords

Aperiodicity; Noise robust frontend; Periodicity; Speech feature; Subband

Indexed keywords

DATABASE SYSTEMS; FREQUENCY DOMAIN ANALYSIS; PARAMETER ESTIMATION; ROBUSTNESS (CONTROL SYSTEMS); SPEECH ANALYSIS; SPEECH RECOGNITION;

APERIODICITY; NOISE ROBUST FRONTEND; PERIODICITY; SPEECH FEATURE; SPEECH PERCEPTION; SUBBAND;

FEATURE EXTRACTION;

EID: 33750352847 PISSN: 01676393 EISSN: None Source Type: Journal
DOI: 10.1016/j.specom.2006.06.008 Document Type: Article

Times cited (17)

References (39)

1
- 85009231870
- QUALCOMM-ICSI-OGI features for ASR
- Adami A., Burget L., Dupont S., Garudadri H., Grezl F., Hermansky H., Jain P., Kajarekar S., Morgan N., and Sivadas S. QUALCOMM-ICSI-OGI features for ASR. Proc. ICSLP (2002) 21-24
- (2002) Proc. ICSLP , pp. 21-24
- Adami, A.¹ Burget, L.² Dupont, S.³ Garudadri, H.⁴ Grezl, F.⁵ Hermansky, H.⁶ Jain, P.⁷ Kajarekar, S.⁸ Morgan, N.⁹ Sivadas, S.¹⁰

2
- 0030037151
- Cepstral representation of speech motivated by time-frequency masking: an application to speech recognition
- Aikawa K., Singer H., Kawahara H., and Tohkura Y. Cepstral representation of speech motivated by time-frequency masking: an application to speech recognition. J. Acoust. Soc. Am. 100 (1996) 603-614
- (1996) J. Acoust. Soc. Am. , vol.100 , pp. 603-614
- Aikawa, K.¹ Singer, H.² Kawahara, H.³ Tohkura, Y.⁴

3
- 0036649309
- Robust auditory-based speech processing using the average localized synchrony detection
- Ali A.M., Spiegel J.V., and Mueller P. Robust auditory-based speech processing using the average localized synchrony detection. IEEE Trans. Speech Audio Process. 10 (2002) 279-292
- (2002) IEEE Trans. Speech Audio Process. , vol.10 , pp. 279-292
- Ali, A.M.¹ Spiegel, J.V.² Mueller, P.³

4
- 0016067897
- Effectiveness of linear prediction characteristics of the speech wave for automatic speaker identification and verification
- Atal B.S. Effectiveness of linear prediction characteristics of the speech wave for automatic speaker identification and verification. J. Acoust. Soc. Am. 55 (1974) 1304-1312
- (1974) J. Acoust. Soc. Am. , vol.55 , pp. 1304-1312
- Atal, B.S.¹

5
- 4544310318
- Can back-ends be more robust than front-ends? Investigation over the Aurora-2 database
- Bernard A., Gong Y., and Cui X. Can back-ends be more robust than front-ends? Investigation over the Aurora-2 database. Proc. ICASSP 1 (2004) 1025-1028
- (2004) Proc. ICASSP , vol.1 , pp. 1025-1028
- Bernard, A.¹ Gong, Y.² Cui, X.³

6
- 0018320733
- Enhancement of speech corrupted by acoustical noise
- Berouti M., Schwartz R., and Makhoul J. Enhancement of speech corrupted by acoustical noise. Proc. ICASSP (1979) 208-211
- (1979) Proc. ICASSP , pp. 208-211
- Berouti, M.¹ Schwartz, R.² Makhoul, J.³

7
- 0018455310
- Suppression of acoustic noise in speech using spectral subtraction
- Boll S. Suppression of acoustic noise in speech using spectral subtraction. IEEE Trans. Acoust. Speech Signal Process. ASSP-27 (1979) 113-120
- (1979) IEEE Trans. Acoust. Speech Signal Process. , vol.ASSP-27 , pp. 113-120
- Boll, S.¹

8
- 85009265586
- Frontend post-processing and backend model enhancement on the AURORA 2.0/3.0 databases
- Chen C.P., Filali K., and Bilmes J.A. Frontend post-processing and backend model enhancement on the AURORA 2.0/3.0 databases. Proc. ICSLP (2002) 241-244
- (2002) Proc. ICSLP , pp. 241-244
- Chen, C.P.¹ Filali, K.² Bilmes, J.A.³

9
- 0019053271
- Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences
- Davis S.B., and Mermelstein P. Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences. IEEE Trans. Acoust. Speech Signal Process. ASSP-28 (1980) 357-366
- (1980) IEEE Trans. Acoust. Speech Signal Process. , vol.ASSP-28 , pp. 357-366
- Davis, S.B.¹ Mermelstein, P.²

10
- 0030962746
- Concurrent vowel identification. II. Effects of phase, harmonicity, and task
- de Cheveigné A., McAdams S., and Marin C.M.H. Concurrent vowel identification. II. Effects of phase, harmonicity, and task. J. Acoust. Soc. Am. 101 (1997) 2848-2856
- (1997) J. Acoust. Soc. Am. , vol.101 , pp. 2848-2856
- de Cheveigné, A.¹ McAdams, S.² Marin, C.M.H.³

11
- 33750327305
- ETSI standard document, 2003. Speech processing, transmission and quality aspects (STQ); Distributed speech recognition; Advanced front-end feature extraction algorithm; Compression algorithms. ETSI ES 202 050 v1.1.3.

12
- 85135375893
- HMM recognition in noise using parallel model combination
- Gales M., and Young S. HMM recognition in noise using parallel model combination. Proc. Eurospeech (1993) 837-840
- (1993) Proc. Eurospeech , pp. 837-840
- Gales, M.¹ Young, S.²

13
- 84871624012
- Auditory model based speech processing
- Gao Y., Huang T., Chen S., and Haton J.-P. Auditory model based speech processing. Proc. ICSLP (1992) 73-76
- (1992) Proc. ICSLP , pp. 73-76
- Gao, Y.¹ Huang, T.² Chen, S.³ Haton, J.-P.⁴

14
- 84928838192
- Temporal non-place information in the auditory nerve firing patterns as a front-end for speech recognition in a noisy environment
- Ghitza O. Temporal non-place information in the auditory nerve firing patterns as a front-end for speech recognition in a noisy environment. J. Phonetics 16 (1988) 109-124
- (1988) J. Phonetics , vol.16 , pp. 109-124
- Ghitza, O.¹

15
- 0029288202
- Speech recognition in noisy environments: a survey
- Gong Y. Speech recognition in noisy environments: a survey. Speech Comm. 16 (1995) 261-291
- (1995) Speech Comm. , vol.16 , pp. 261-291
- Gong, Y.¹

16
- 33745742672
- Speech processing in the auditory system: an overview
- Greenberg S., Ainsworth W.A., Popper A.N., and Fay R.R. (Eds), Springer-Verlag, New York
- Greenberg S. Speech processing in the auditory system: an overview. In: Greenberg S., Ainsworth W.A., Popper A.N., and Fay R.R. (Eds). Speech Processing in the Auditory System (2004), Springer-Verlag, New York
- (2004) Speech Processing in the Auditory System
- Greenberg, S.¹

17
- 0025041264
- Perceptual linear predictive (PLP) analysis of speech
- Hermansky H. Perceptual linear predictive (PLP) analysis of speech. J. Acoust. Soc. Am. 87 (1990) 1738-1752
- (1990) J. Acoust. Soc. Am. , vol.87 , pp. 1738-1752
- Hermansky, H.¹

18
- 0003391579
- Springer-Verlag, New York
- Hess W. Pitch Determination of Speech Signals (1983), Springer-Verlag, New York
- (1983) Pitch Determination of Speech Signals
- Hess, W.¹

19
- 33750311358
- Hirsch, H.G., Pearce, D., 2000. The AURORA experimental framework for the performance evaluation of speech recognition systems under noisy conditions. In: Proc. of the ISCA Tutorial and Research Workshop on Automatic Speech Recognition (ISCA ITRW ASR), pp. 181-188.

20
- 4544250680
- Speech feature extraction method representing periodicity and aperiodicity in sub bands for robust speech recognition
- Ishizuka K., and Miyazaki N. Speech feature extraction method representing periodicity and aperiodicity in sub bands for robust speech recognition. Proc. ICASSP 1 (2004) 141-144
- (2004) Proc. ICASSP , vol.1 , pp. 141-144
- Ishizuka, K.¹ Miyazaki, N.²

21
- 33750352886
- Improvement in robustness of speech feature extraction method using sub-band based periodicity and aperiodicity decomposition
- Ishizuka K., Miyazaki N., Nakatani T., and Minami Y. Improvement in robustness of speech feature extraction method using sub-band based periodicity and aperiodicity decomposition. Proc. ICSLP 2 (2004) 937-940
- (2004) Proc. ICSLP , vol.2 , pp. 937-940
- Ishizuka, K.¹ Miyazaki, N.² Nakatani, T.³ Minami, Y.⁴

22
- 0016467604
- Minimum prediction residual principle applied to speech recognition
- Itakura F. Minimum prediction residual principle applied to speech recognition. IEEE Trans. Acoust. Speech Signal Process. ASSP-23 (1975) 67-72
- (1975) IEEE Trans. Acoust. Speech Signal Process. , vol.ASSP-23 , pp. 67-72
- Itakura, F.¹

23
- 85009168054
- Covariation and weighting of harmonically decomposed streams for ASR
- Jackson P.J.B., Moreno D.M., Russell M.J., and Hernando J. Covariation and weighting of harmonically decomposed streams for ASR. Proc. Eurospeech (2003) 2321-2324
- (2003) Proc. Eurospeech , pp. 2321-2324
- Jackson, P.J.B.¹ Moreno, D.M.² Russell, M.J.³ Hernando, J.⁴

24
- 0028996914
- Robust feature extraction using SBCOR analysis
- Kajita S., and Itakura F. Robust feature extraction using SBCOR analysis. Proc. ICASSP (1995) 421-424
- (1995) Proc. ICASSP , pp. 421-424
- Kajita, S.¹ Itakura, F.²

25
- 0032785783
- Auditory processing of speech signals for robust speech recognition in real-world noisy environments
- Kim D.S., Lee S.Y., and Kil R.M. Auditory processing of speech signals for robust speech recognition in real-world noisy environments. IEEE Trans. Speech Audio Process. 7 (1999) 55-69
- (1999) IEEE Trans. Speech Audio Process. , vol.7 , pp. 55-69
- Kim, D.S.¹ Lee, S.Y.² Kil, R.M.³

26
- 85009115888
- An auditory system-based feature for robust speech recognition
- Li Q., Soong F.K., and Siohan O. An auditory system-based feature for robust speech recognition. Proc. Eurospeech (2001) 619-621
- (2001) Proc. Eurospeech , pp. 619-621
- Li, Q.¹ Soong, F.K.² Siohan, O.³

27
- 0017980972
- All-pole modeling of degraded speech
- Lim J., and Oppenheim A. All-pole modeling of degraded speech. IEEE Trans. Acoust. Speech Signal Process. ASSP-26 (1978) 197-210
- (1978) IEEE Trans. Acoust. Speech Signal Process. , vol.ASSP-26 , pp. 197-210
- Lim, J.¹ Oppenheim, A.²

28
- 0026882842
- Experiments with a nonlinear spectral subtractor (NSS), hidden Markov models and the projection, for robust speech recognition in cars
- Lockwood P., and Boudy J. Experiments with a nonlinear spectral subtractor (NSS), hidden Markov models and the projection, for robust speech recognition in cars. Speech Comm. 11 (1992) 215-228
- (1992) Speech Comm. , vol.11 , pp. 215-228
- Lockwood, P.¹ Boudy, J.²

29
- 85009242725
- Evaluation of a noise-robust DSR front-end on AURORA databases
- Macho D., Mauuary L., Noé B., Cheng Y.M., Ealey D., Jouvet D., Kelleher H., Pearce D., and Saadoun F. Evaluation of a noise-robust DSR front-end on AURORA databases. Proc. ICSLP (2002) 17-20
- (2002) Proc. ICSLP , pp. 17-20
- Macho, D.¹ Mauuary, L.² Noé, B.³ Cheng, Y.M.⁴ Ealey, D.⁵ Jouvet, D.⁶ Kelleher, H.⁷ Pearce, D.⁸ Saadoun, F.⁹

30
- 4544222091
- Blind equalization in the cepstral domain for robust telephone based speech recognition
- Mauuary L. Blind equalization in the cepstral domain for robust telephone based speech recognition. Proc. EUSIPCO 1 (1998) 359-362
- (1998) Proc. EUSIPCO , vol.1 , pp. 359-362
- Mauuary, L.¹

31
- 0028996915
- A maximum likelihood procedure for a universal adaptation method
- Minami Y., and Furui S. A maximum likelihood procedure for a universal adaptation method. Proc. ICASSP (1995) 129-132
- (1995) Proc. ICASSP , pp. 129-132
- Minami, Y.¹ Furui, S.²

32
- 0020816083
- Suggested formula for calculating auditory-filter bandwidths and excitation patterns
- Moore B.C.J., and Glasberg B.R. Suggested formula for calculating auditory-filter bandwidths and excitation patterns. J. Acoust. Soc. Am. 74 (1983) 750-753
- (1983) J. Acoust. Soc. Am. , vol.74 , pp. 750-753
- Moore, B.C.J.¹ Glasberg, B.R.²

33
- 33646799204
- Nakamura, S., Yamamoto, K., Takeda, K., Kuroiwa, S., Kitaoka, N., Yamada, T., Mizumachi, M., Nishiura, T., Fujimoto, M., Sasou, A., Endo, T., 2003. Data collection and evaluation of AURORA-2 Japanese corpus. In: Proc. of the 8th IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU), pp. 619-623.

34
- 24144494616
- AURORA-2J: an evaluation framework for Japanese noisy speech recognition
- Nakamura S., Takeda K., Yamamoto K., Yamada T., Kuroiwa S., Kitaoka N., Nishiura T., Sasou A., Mizumachi M., Miyajima C., Fujimoto M., and Endo T. AURORA-2J: an evaluation framework for Japanese noisy speech recognition. IEICE Trans. Inform. Systems E88-D (2005) 535-544
- (2005) IEICE Trans. Inform. Systems , vol.E88-D , pp. 535-544
- Nakamura, S.¹ Takeda, K.² Yamamoto, K.³ Yamada, T.⁴ Kuroiwa, S.⁵ Kitaoka, N.⁶ Nishiura, T.⁷ Sasou, A.⁸ Mizumachi, M.⁹ Miyajima, C.¹⁰ Fujimoto, M.¹¹ Endo, T.¹²

35
- 0016938506
- Auditory filter shapes derived with noise stimuli
- Patterson R.D. Auditory filter shapes derived with noise stimuli. J. Acoust. Soc. Am. 59 (1976) 640-654
- (1976) J. Acoust. Soc. Am. , vol.59 , pp. 640-654
- Patterson, R.D.¹

36
- 0001050571
- Auditory filters and excitation patterns as representations of frequency resolution
- Moore B.C.J. (Ed), Academic Press, London
- Patterson R.D., and Moore B.C.J. Auditory filters and excitation patterns as representations of frequency resolution. In: Moore B.C.J. (Ed). Frequency Selectivity in Hearing (1986), Academic Press, London 23-177
- (1986) Frequency Selectivity in Hearing , pp. 23-177
- Patterson, R.D.¹ Moore, B.C.J.²

37
- 84987702417
- The AURORA experimental framework for the performance evaluation of speech recognition systems under noise conditions
- Pearce D., and Hirsch H.G. The AURORA experimental framework for the performance evaluation of speech recognition systems under noise conditions. Proc. ICSLP 4 (2000) 29-32
- (2000) Proc. ICSLP , vol.4 , pp. 29-32
- Pearce, D.¹ Hirsch, H.G.²

38
- 0017367712
- On the use of autocorrelation analysis for pitch detection
- Rabiner L.R. On the use of autocorrelation analysis for pitch detection. IEEE Trans. Acoust. Speech Signal Process. ASSP-25 (1977) 24-33
- (1977) IEEE Trans. Acoust. Speech Signal Process. , vol.ASSP-25 , pp. 24-33
- Rabiner, L.R.¹

39
- 84928837806
- A joint synchrony/mean-rate model of auditory speech processing
- Seneff S. A joint synchrony/mean-rate model of auditory speech processing. J. Phonetics 16 (1988) 55-76
- (1988) J. Phonetics , vol.16 , pp. 55-76
- Seneff, S.¹

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.