Volume 53, Issue 5, 2011, Pages 753-767

Robustness of spectro-temporal features against intrinsic and extrinsic variations in automatic speech recognition

Author keywords

Automatic speech recognition; Intrinsic variability; Robustness; Spectro-temporal feature extraction

Indexed keywords

AUTOMATIC SPEECH RECOGNITION; BIO-INSPIRED; COMBINED INFORMATIONS; DIGIT CLASSIFICATION; EXTRINSIC FACTORS; FEATURE TYPES; GABOR FEATURE; INTRINSIC VARIABILITIES; MEL-FREQUENCY CEPSTRAL COEFFICIENTS; NOISE SOURCE; PHONEME RECOGNITION; ROBUSTNESS; SPEAKING RATE; SPEAKING STYLES; SPECTRAL FEATURE; SPECTRO-TEMPORAL FEATURE EXTRACTION; TEMPORAL FEATURES; TEMPORAL PROCESSING; TEST CONDITION; TRANSMISSION CHANNELS; WORD ERROR RATE

EID: 79953659090     PISSN: 01676393     EISSN: None     Source Type: Journal    
DOI: 10.1016/j.specom.2010.07.002     Document Type: Article
Times cited: 52

References (36)
  • 1
    • 34247568840
    • Modelling speaker intelligibility in noise
    • J. Barker, and M. Cooke Modelling speaker intelligibility in noise Speech Comm. 2007
    • (2007) Speech Comm.
    • Barker, J.1    Cooke, M.2
  • 2
    • 0019053271
    • Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences
    • S. Davis, and P. Mermelstein Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences IEEE Trans. Acoust. Speech Signal Process. 28 1980 357 366 (Pubitemid 11464930)
    • (1980) IEEE Transactions on Acoustics, Speech, and Signal Processing , vol.ASSP-28 , Issue.4 , pp. 357-366
    • Davis Steven, B.1    Mermelstein Paul2
  • 4
    • 0035097825
    • Spectro-temporal response field characterization with dynamic ripples in ferret primary auditory cortex
    • D. Depireux, J. Simon, D. Klein, and S. Shamma Spectro-temporal response field characterization with dynamic ripples in ferret primary auditory cortex J. Neurophysiol. 85 2001 1220 1234 (Pubitemid 32209608)
    • (2001) Journal of Neurophysiology , vol.85 , Issue.3 , pp. 1220-1234
    • Depireux, D.A.1    Simon, J.Z.2    Klein, D.J.3    Shamma, S.A.4
  • 5
  • 6
    • 79953646083
    • Artificial noise signals with speechlike spectral and temporal properties for hearing instrument assessment
    • W.A.H.V. Dreschler, C. Ludvigson, and S. Westermann Artificial noise signals with speechlike spectral and temporal properties for hearing instrument assessment J. Acoust. Soc. Amer. 105 1999 1296
    • (1999) J. Acoust. Soc. Amer. , vol.105 , pp. 1296
    • Dreschler, W.A.H.V.1    Ludvigson, C.2    Westermann, S.3
  • 7
    • 0034920512
    • ICRA noises: Artificial noise signals with speech-like spectral and temporal properties for hearing instrument assessment
    • W.A.H.V. Dreschler, C. Ludvigson, and S. Westermann ICRA noises: artificial noise signals with speech-like spectral and temporal properties for hearing instrument assessment Audiology 40 2001 148 157 (Pubitemid 32674146)
    • (2001) Audiology , vol.40 , Issue.3 , pp. 148-157
    • Dreschler, W.A.1    Verschuure, H.2    Ludvigsen, C.3    Westermann, S.4
  • 9
    • 67651044226
    • Spectro-temporal analysis of speech using 2-D Gabor filters
    • Ezzat, T., Bouvrie, J., Poggio, T., 2007. Spectro-temporal analysis of speech using 2-D Gabor filters. In: Proc. Interspeech.
    • (2007) Proc. Interspeech
    • Ezzat, T.1    Bouvrie, J.2    Poggio, T.3
  • 10
    • 0026368274
    • Fast algorithms to find invariant features for a word recognizing neural net
    • Gramss, T., 1991. Fast algorithms to find invariant features for a word recognizing neural net. In: Proc. IEEE 2nd Internat. Conf. Artificial Neural Networks, pp. 180-184.
    • (1991) Proc. IEEE 2nd Internat. Conf. Artificial Neural Networks , pp. 180-184
    • Gramss, T.1
  • 11
    • 0025383284
    • Recognition of isolated words based on psychoacoustics and neurobiology
    • DOI 10.1016/0167-6393(90)90043-9
    • T. Gramss, and H.W. Strube Recognition of isolated words based on psychoacoustics and neurobiology Speech Comm. 9 1990 35 40 (Pubitemid 20717751)
    • (1990) Speech Communication , vol.9 , Issue.1 , pp. 35-40
    • Gramss Tino1    Strube Hans Werner2
  • 12
    • 84867196898
    • Predictability of STRFs in auditory cortex neurons depends on stimulus class
    • Happel, M., Müller, S., Anemueller, J., Ohl, F., 2008. Predictability of STRFs in auditory cortex neurons depends on stimulus class. In: Proc. Interspeech, p. 670.
    • (2008) Proc. Interspeech , pp. 670
    • Happel, M.1
  • 13
    • 79953669371
    • A closer look on hierarchical spectro-temporal features (HIST)
    • Heckmann, M., Domont, X., Joublin, F., Goerick, C., 2008. A closer look on hierarchical spectro-temporal features (HIST). In: Proc. Interspeech, pp. 4417-4420.
    • (2008) Proc. Interspeech , pp. 4417-4420
    • Heckmann, M.1    Domont, X.2    Joublin, F.3    Goerick, C.4
  • 14
    • 0032139768
    • Should recognizers have ears?
    • H. Hermansky Should recognizers have ears? Speech Comm. 25 1998 3 24
    • (1998) Speech Comm. , vol.25 , pp. 3-24
    • Hermansky, H.1
  • 15
    • 0033709098
    • Tandem connectionist feature extraction for conventional HMM systems
    • Hermansky, H., Ellis, D., Sharma, S., 2000. Tandem connectionist feature extraction for conventional HMM systems. In: Proc. ICASSP, pp. 1635-1638.
    • (2000) Proc. ICASSP , pp. 1635-1638
    • Hermansky, H.1    Ellis, D.2    Sharma, S.3
  • 17
    • 0032658253
    • Temporal patterns (TRAPS) in ASR of noisy speech
    • Hermansky, H., Sharma, S., 1999. Temporal patterns (TRAPS) in ASR of noisy speech. In: Proc. ICASSP, pp. 289-292.
    • (1999) Proc. ICASSP , pp. 289-292
    • Hermansky, H.1    Sharma, S.2
  • 18
    • 0038669544
    • The AURORA experimental framework for the performance evaluations of speech recognition systems under noisy conditions
    • Hirsch, H., Pearce, D., 2000. The AURORA experimental framework for the performance evaluations of speech recognition systems under noisy conditions. In: Proc. ISCA ITRW ASR, pp. 2697-2702.
    • (2000) Proc. ISCA ITRW ASR , pp. 2697-2702
    • Hirsch, H.1    Pearce, D.2
  • 19
    • 0027465491
    • The Lombard reflex and its role on human listeners and automatic speech recognizers
    • J.-C. Junqua The Lombard reflex and its role on human listeners and automatic speech recognizers J. Acoust. Soc. Amer. 93 1993 510 524 (Pubitemid 23117065)
    • (1993) Journal of the Acoustical Society of America , vol.93 , Issue.1 , pp. 510-524
    • Junqua, J.-C.1
  • 21
    • 14244272507
    • Methods for capturing spectro-temporal modulations in automatic speech recognition
    • M. Kleinschmidt Methods for capturing spectro-temporal modulations in automatic speech recognition Acta Acoust. 2002 416 422 (Pubitemid 34732124)
    • (2002) Acta Acustica united with Acustica , vol.88 , Issue.3 , pp. 416-422
    • Kleinschmidt, M.1
  • 23
    • 85009233038
    • Improving word accuracy with Gabor feature extraction
    • Kleinschmidt, M., Gelbart, D., 2002. Improving word accuracy with Gabor feature extraction. In: Proc. ICSLP.
    • (2002) Proc. ICSLP
    • Kleinschmidt, M.1    Gelbart, D.2
  • 24
    • 0002560960
    • A database for speaker-independent digit recognition
    • Leonard, R., 1984. A database for speaker-independent digit recognition. In: Proc. ICASSP, pp. 328-331 (vol. IX).
    • (1984) Proc. ICASSP , vol.9 , pp. 328-331
    • Leonard, R.1
  • 25
    • 85009252964
    • Progress with the Philips continuous ASR system on the Aurora 2 noisy digits database
    • Lieb, M., Fischer, A., 2002. Progress with the Philips continuous ASR system on the Aurora 2 noisy digits database. In: Proc. ICSLP, pp. 449-452.
    • (2002) Proc. ICSLP , pp. 449-452
    • Lieb, M.1    Fischer, A.2
  • 26
    • 0031187171
    • Speech recognition by machines and humans
    • PII S0167639397000216
    • R. Lippmann Speech recognition by machines and humans Speech Comm. 22 1997 1 15 (Pubitemid 127403436)
    • (1997) Speech Communication , vol.22 , Issue.1 , pp. 1-15
    • Lippmann, R.P.1
  • 27
    • 34547509128
    • Representation of phonemes in primary auditory cortex: How the brain analyzes speech
    • Mesgarani, N., David, S., Shamma, S., 2007. Representation of phonemes in primary auditory cortex: how the brain analyzes speech. In: Proc. Interspeech.
    • (2007) Proc. Interspeech
    • Mesgarani, N.1    David, S.2    Shamma, S.3
  • 28
    • 84867224940
    • Optimization and evaluation of Gabor feature sets for ASR
    • Meyer, B., Kollmeier, B., 2008. Optimization and evaluation of Gabor feature sets for ASR. In: Proc. Interspeech.
    • (2008) Proc. Interspeech
    • Meyer, B.1    Kollmeier, B.2
  • 29
    • 45549092825
    • Phoneme confusions in human and automatic speech recognition
    • Meyer, B., Waechter, M., Brand, T., Kollmeier, B., 2007. Phoneme confusions in human and automatic speech recognition. In: Proc. Interspeech, pp. 1485-1488.
    • (2007) Proc. Interspeech , pp. 1485-1488
    • Meyer, B.1    Waechter, M.2    Brand, T.3    Kollmeier, B.4
  • 30
    • 56149102452
    • A human-machine comparison in speech recognition based on a logatome corpus
    • Meyer, B., Wesker, T., 2006. A human-machine comparison in speech recognition based on a logatome corpus. In: Workshop on Speech-intrinsic Variation, pp. 95-100.
    • (2006) Workshop on Speech-intrinsic Variation , pp. 95-100
    • Meyer, B.1    Wesker, T.2
  • 31
    • 0037824480
    • Gabor analysis of auditory midbrain receptive fields: Spectro-temporal and binaural composition
    • A. Qiu, C. Schreiner, and M. Escabi Gabor analysis of auditory midbrain receptive fields: spectro-temporal and binaural composition J. Neurophysiol. 90 2003 456 476 (Pubitemid 36859193)
    • (2003) Journal of Neurophysiology , vol.90 , Issue.1 , pp. 456-476
    • Qiu, A.1    Schreiner, C.E.2    Escabi, M.A.3
  • 32
    • 34247580087
    • Reaching over the gap: A review of efforts to link human and automatic speech recognition research
    • DOI 10.1016/j.specom.2007.01.009, PII S0167639307000106, Bridging the Gap between Human and Automatic Speech Recognition
    • O. Scharenborg Reaching over the gap: a review of efforts to link human and automatic speech recognition research Speech Comm. 49 2007 336 347 (Pubitemid 46670364)
    • (2007) Speech Communication , vol.49 , Issue.5 , pp. 336-347
    • Scharenborg, O.1
  • 33
    • 0032828464
    • A model of auditory perception as front end for automatic speech recognition
    • J. Tchorz, and B. Kollmeier A model of auditory perception as front end for automatic speech recognition J. Acoust. Soc. Amer. 106 1999 2040
    • (1999) J. Acoust. Soc. Amer. , vol.106 , pp. 2040
    • Tchorz, J.1    Kollmeier, B.2
  • 36
    • 84867220821
    • Multi-stream spectro-temporal features for robust speech recognition
    • Zhao, S., Morgan, N., 2008. Multi-stream spectro-temporal features for robust speech recognition. In: Proc. Interspeech, pp. 898-901.
    • (2008) Proc. Interspeech , pp. 898-901
    • Zhao, S.1    Morgan, N.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.