



Volume 25, Issue 1-3, 1998, Pages 3-27

Should recognizers have ears?

Author keywords

Auditory modeling; Automatic speech recognition; Human like processing; Modulation frequency

Indexed keywords

OPTIMIZATION; RANDOM PROCESSES; SENSORY PERCEPTION; SPECTRUM ANALYSIS; SPEECH;

EID: 0032139768     PISSN: 01676393     EISSN: None     Source Type: Journal    
DOI: 10.1016/S0167-6393(98)00027-2     Document Type: Article
Times cited: 124

References (77)
  • 2
    • 0028516073
    • How do humans process and recognize speech?
    • Allen, J.B., 1994. How do humans process and recognize speech?. IEEE Trans. Speech Audio Process. 2 (4), 567-577.
    • (1994) IEEE Trans. Speech Audio Process. , vol.2 , Issue.4 , pp. 567-577
    • Allen, J.B.1
  • 4
    • 84898992685
    • Coding of naturalistic stimuli by auditory midbrain neurons
    • Morgan Kaufmann, Los Altos, CA.
    • Attias, H., Schreiner, C.E., 1998. Coding of naturalistic stimuli by auditory midbrain neurons. In: Advances in Neural Information Processing Systems, Vol. 10. Morgan Kaufmann, Los Altos, CA.
    • (1998) Advances in Neural Information Processing Systems , vol.10
    • Attias, H.1    Schreiner, C.E.2
  • 7
    • 0020932333
    • Two-formant models of vowel perception: Shortcomings and enhancements
    • Bladon, A., 1983. Two-formant models of vowel perception: Shortcomings and enhancements. Speech Communication 2, 305-313.
    • (1983) Speech Communication , vol.2 , pp. 305-313
    • Bladon, A.1
  • 11
    • 0347387977
    • An experimental automatic word recognition system
    • Joint Speech Research Unit, Ruislip, England
    • Bridle, J.S., Brown, M.D., 1974. An experimental automatic word recognition system. JSRU Report No. 1003, Joint Speech Research Unit, Ruislip, England.
    • (1974) JSRU Report No. 1003 , vol.1003
    • Bridle, J.S.1    Brown, M.D.2
  • 12
    • 25044464569
    • The front cavity/F2' hypothesis tested by data on tongue movements
    • Broad, D., Hermansky, H., 1989. The front cavity/F2' hypothesis tested by data on tongue movements. J. Acoust. Soc. Amer. 86 (Suppl. 1), S13-S14.
    • (1989) J. Acoust. Soc. Amer. , vol.86 , Issue.1 SUPPL.
    • Broad, D.1    Hermansky, H.2
  • 14
    • 0024392496
    • Application of an auditory model to speech recognition
    • Cohen, J.R., 1989. Application of an auditory model to speech recognition. J. Acoust. Soc. Amer. 85 (6), 2623-2629.
    • (1989) J. Acoust. Soc. Amer. , vol.85 , Issue.6 , pp. 2623-2629
    • Cohen, J.R.1
  • 17
    • 0021906779
    • Central auditory processing of peripheral vowel spectra
    • Chistovich, L.A., 1985. Central auditory processing of peripheral vowel spectra. J. Acoust. Soc. Amer. 77, 789-805.
    • (1985) J. Acoust. Soc. Amer. , vol.77 , pp. 789-805
    • Chistovich, L.A.1
  • 18
    • 0019053271
    • Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences
    • Davis, S.B., Mermelstein, P., 1980. Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences. IEEE Trans. Acoust. Speech Signal Process. 28 (4), 357-366.
    • (1980) IEEE Trans. Acoust. Speech Signal Process. , vol.28 , Issue.4 , pp. 357-366
    • Davis, S.B.1    Mermelstein, P.2
  • 19
    • 0347387973
    • Sound feature decomposition by the primary auditory cortex
    • Breckenridge, Colorado (submitted to Science, also unpublished technical memo)
    • deCharms, C.R., Blake, D., Merzenich, M.M., 1997. Sound feature decomposition by the primary auditory cortex. In: 1997 Workshop on Advances in Neural Information Processing, Breckenridge, Colorado (submitted to Science, also unpublished technical memo).
    • (1997) 1997 Workshop on Advances in Neural Information Processing
    • DeCharms, C.R.1    Blake, D.2    Merzenich, M.M.3
  • 20
    • 0027957839
    • Effect of temporal envelope smearing on speech reception
    • Drullman, R., Festen, J.M., Plomp, R., 1994. Effect of temporal envelope smearing on speech reception. J. Acoust. Soc. Amer. 95, 1053-1064.
    • (1994) J. Acoust. Soc. Amer. , vol.95 , pp. 1053-1064
    • Drullman, R.1    Festen, J.M.2    Plomp, R.3
  • 21
    • 0028287770
    • Effect of reducing slow temporal modulations on speech reception
    • Drullman, R., Festen, J.M., Plomp, R., 1994. Effect of reducing slow temporal modulations on speech reception. J. Acoust. Soc. Amer. 95, 2670-2680.
    • (1994) J. Acoust. Soc. Amer. , vol.95 , pp. 2670-2680
    • Drullman, R.1    Festen, J.M.2    Plomp, R.3
  • 22
    • 84885525095
    • Auditory matching of vowels with two formant synthetic sounds
    • Speech Transmission Laboratory, Royal Institute of Technology, Stockholm
    • Fant, G., Risberg, A., 1962. Auditory matching of vowels with two formant synthetic sounds. Quarterly Progress and Status Report 4, Speech Transmission Laboratory, Royal Institute of Technology, Stockholm.
    • (1962) Quarterly Progress and Status Report , vol.4
    • Fant, G.1    Risberg, A.2
  • 23
    • 0038747568
    • Acoustic description and classification of phonetic units
    • Fant, G., 1965. Acoustic description and classification of phonetic units. Ericsson Technics, No. 1, reprinted in: Fant, G., 1973. Speech Sounds and Features. MIT Press, Cambridge, MA.
    • (1965) Ericsson Technics , vol.1
    • Fant, G.1
  • 24
    • 0004110342
    • reprinted MIT Press, Cambridge, MA.
    • Fant, G., 1965. Acoustic description and classification of phonetic units. Ericsson Technics, No. 1, reprinted in: Fant, G., 1973. Speech Sounds and Features. MIT Press, Cambridge, MA.
    • (1973) Speech Sounds and Features
    • Fant, G.1
  • 27
    • 0014113409
    • On the second spectral peak of front vowels: A perceptual study of the role of the second and third formants
    • Fujimura, O., 1964. On the second spectral peak of front vowels: A perceptual study of the role of the second and third formants. Language and Speech 10, 181-193.
    • (1964) Language and Speech , vol.10 , pp. 181-193
    • Fujimura, O.1
  • 28
    • 0019555090
    • Cepstral analysis technique for automatic speaker verification
    • Furui, S., 1981. Cepstral analysis technique for automatic speaker verification. IEEE Trans. Acoust. Speech Signal Process. 29, 254-272.
    • (1981) IEEE Trans. Acoust. Speech Signal Process. , vol.29 , pp. 254-272
    • Furui, S.1
  • 29
    • 0001942829
    • Neural networks and the bias/variance dilemma
    • Geman, S., Bienenstock, E., Doursat, R., 1992. Neural networks and the bias/variance dilemma. Neural Computation 4 (1), 1-58.
    • (1992) Neural Computation , vol.4 , Issue.1 , pp. 1-58
    • Geman, S.1    Bienenstock, E.2    Doursat, R.3
  • 32
    • 0141629798
    • Spectral dynamics for speech recognition under adverse conditions
    • Lee, C.H., Soong, F.K., Paliwal, K.K. (Eds.), Kluwer Academic Publishers, Dordrecht
    • Hanson, B.A., Applebaum, T.H., Junqua, J.C., 1996. Spectral dynamics for speech recognition under adverse conditions. In: Lee, C.H., Soong, F.K., Paliwal, K.K. (Eds.), Automatic Speech and Speaker Recognition. Kluwer Academic Publishers, Dordrecht.
    • (1996) Automatic Speech and Speaker Recognition
    • Hanson, B.A.1    Applebaum, T.H.2    Junqua, J.C.3
  • 33
    • 0021122763
    • The harmonic magnitude suppression (HMS) technique for intelligibility enhancement in the presence of interfering speech
    • Hanson, B., Wong, D., 1984. The harmonic magnitude suppression (HMS) technique for intelligibility enhancement in the presence of interfering speech. In: Proceedings of International Conference on Acoust. Speech and Signal Processing, pp. 18.A.5.1-18.A.5.4.
    • (1984) Proceedings of International Conference on Acoust. Speech and Signal Processing
    • Hanson, B.1    Wong, D.2
  • 35
    • 0025041264
    • Perceptual linear predictive (PLP) analysis of speech
    • Hermansky, H., 1990. Perceptual linear predictive (PLP) analysis of speech. J. Acoust. Soc. Amer. 87 (4), 1738-1752.
    • (1990) J. Acoust. Soc. Amer. , vol.87 , Issue.4 , pp. 1738-1752
    • Hermansky, H.1
  • 36
    • 0348018654
    • Exploring temporal domain for robustness in speech recognition
    • Trondheim, Norway
    • Hermansky, H., 1995. Exploring temporal domain for robustness in speech recognition. In: Proceedings of the 15th International Congress on Acoustics, Vol. II, Trondheim, Norway, pp. 61-64.
    • (1995) Proceedings of the 15th International Congress on Acoustics , vol.2 , pp. 61-64
    • Hermansky, H.1
  • 42
    • 85135377175
    • Compensation for the effect of the communication channel in auditory-like analysis of speech (RASTA-PLP)
    • Genova, Italy
    • Hermansky, H., Morgan, N., Bayya, A., Kohn, P., 1991. Compensation for the effect of the communication channel in auditory-like analysis of speech (RASTA-PLP). In: Proceedings of Eurospeech'91, Genova, Italy, pp. 1367-1371.
    • (1991) Proceedings of Eurospeech'91 , pp. 1367-1371
    • Hermansky, H.1    Morgan, N.2    Bayya, A.3    Kohn, P.4
  • 44
    • 3543081154
    • Modulation spectrum in speech processing
    • Prochazka, A., Uhlir, J., Rayner, P.J.W., Kingsbury, N.G. (Eds.), Birkhauser, Boston
    • Hermansky, H., 1988. Modulation spectrum in speech processing. In: Prochazka, A., Uhlir, J., Rayner, P.J.W., Kingsbury, N.G. (Eds.), Signal Analysis and Prediction. Birkhauser, Boston.
    • (1988) Signal Analysis and Prediction
    • Hermansky, H.1
  • 45
    • 0011823639
    • Improved speech recognition using high-pass filtering of subband envelopes
    • Genova, Italy
    • Hirsch, H.G., Meyer, P., Ruehl, H., 1991. Improved speech recognition using high-pass filtering of subband envelopes. In: Proceedings of Eurospeech'91, Genova, Italy, pp. 413-416.
    • (1991) Proceedings of Eurospeech'91 , pp. 413-416
    • Hirsch, H.G.1    Meyer, P.2    Ruehl, H.3
  • 46
    • 84873312246
    • A review of the MTF concept in room acoustics and its use for estimating speech intelligibility in auditoria
    • Houtgast, T., Steeneken, H.J.M., 1985. A review of the MTF concept in room acoustics and its use for estimating speech intelligibility in auditoria. J. Acoust. Soc. Amer. 77 (3), 1069-1077.
    • (1985) J. Acoust. Soc. Amer. , vol.77 , Issue.3 , pp. 1069-1077
    • Houtgast, T.1    Steeneken, H.J.M.2
  • 47
    • 0038133932
    • A statistical approach to metrics for word and syllable recognition
    • Hunt, M.J., 1979. A statistical approach to metrics for word and syllable recognition. J. Acoust. Soc. Amer. 66 (S1), S35(A).
    • (1979) J. Acoust. Soc. Amer. , vol.66 , Issue.S1
    • Hunt, M.J.1
  • 48
    • 0024905238
    • A comparison of several acoustic representations for speech recognition with degraded and undegraded speech
    • Glasgow, Scotland
    • Hunt, M., Lefebvre, C., 1989. A comparison of several acoustic representations for speech recognition with degraded and undegraded speech. In: Proceedings of International Conference on Acoust. Speech and Signal Processing, Glasgow, Scotland, pp. 262-265.
    • (1989) Proceedings of International Conference on Acoust. Speech and Signal Processing , pp. 262-265
    • Hunt, M.1    Lefebvre, C.2
  • 51
    • 0020117798
    • Forward masking as a function of frequency, masker level, and signal delay
    • Jesteadt, W., Bacon, S.P., Lehman, J.R., 1982. Forward masking as a function of frequency, masker level, and signal delay. J. Acoust. Soc. Amer. 950-962.
    • (1982) J. Acoust. Soc. Amer. , pp. 950-962
    • Jesteadt, W.1    Bacon, S.P.2    Lehman, J.R.3
  • 52
    • 84883097102
    • On the importance of various modulation frequencies for speech recognition
    • Rhodos, Greece
    • Kanedera, N., Arai, T., Hermansky, H., Pavel, M., 1997. On the importance of various modulation frequencies for speech recognition. In: Proceedings of Eurospeech'97, Rhodos, Greece, pp. 1079-1082.
    • (1997) Proceedings of Eurospeech'97 , pp. 1079-1082
    • Kanedera, N.1    Arai, T.2    Hermansky, H.3    Pavel, M.4
  • 54
    • 0001490199
    • Speech processing strategies based on auditory models
    • Carlson, R., Granstrom, B. (Eds.), Elsevier Biomedical Press, New York
    • Klatt, D.H., 1982. Speech processing strategies based on auditory models. In: Carlson, R., Granstrom, B. (Eds.), The Representation of Speech in The Peripheral Auditory System. Elsevier Biomedical Press, New York, pp. 181-202.
    • (1982) The Representation of Speech in the Peripheral Auditory System , pp. 181-202
    • Klatt, D.H.1
  • 57
    • 0018478297
    • Spectral root homomorphic deconvolution system
    • Lim, J.S., 1979. Spectral root homomorphic deconvolution system. IEEE Trans. Acoust. Speech Signal Process. 27 (3), 223-233.
    • (1979) IEEE Trans. Acoust. Speech Signal Process. , vol.27 , Issue.3 , pp. 223-233
    • Lim, J.S.1
  • 58
    • 0029754956
    • Accurate consonant perception without mid-frequency speech energy
    • Lippmann, R.P., 1995. Accurate consonant perception without mid-frequency speech energy. IEEE Trans. Speech and Audio 4 (1), 66-69.
    • (1995) IEEE Trans. Speech and Audio , vol.4 , Issue.1 , pp. 66-69
    • Lippmann, R.P.1
  • 60
  • 61
    • 0003834557
    • Freeman, San Francisco, CA.
    • Marr, D., 1982. Vision. Freeman, San Francisco, CA.
    • (1982) Vision
    • Marr, D.1
  • 62
    • 0038133939
    • Distance measures for speech recognition, psychological and instrumental
    • Chen, R.C.H. (Ed.), Academic Press, New York
    • Mermelstein, P., 1976. Distance measures for speech recognition, psychological and instrumental. In: Chen, R.C.H. (Ed.), Pattern Recognition and Artificial Intelligence. Academic Press, New York, pp. 374-388.
    • (1976) Pattern Recognition and Artificial Intelligence , pp. 374-388
    • Mermelstein, P.1
  • 65
    • 0007636578
    • Temporal masking in automatic speech recognition
    • Pavel, M., Hermansky, H., 1994. Temporal masking in automatic speech recognition. J. Acoust. Soc. Amer. A 95, 2876.
    • (1994) J. Acoust. Soc. Amer. A , vol.95 , pp. 2876
    • Pavel, M.1    Hermansky, H.2
  • 66
    • 0015129120
    • Real-time recognition of spoken words
    • Pols, L.C.W., 1971. Real-time recognition of spoken words. IEEE Trans. Comput. 20 (C) 972-978.
    • (1971) IEEE Trans. Comput. , vol.20 , Issue.C , pp. 972-978
    • Pols, L.C.W.1
  • 68
    • 84928837806
    • A joint synchrony/mean-rate model of auditory speech processing
    • Seneff, S., 1985. A joint synchrony/mean-rate model of auditory speech processing. J. Phonetics 16 (1), 55-76.
    • (1985) J. Phonetics , vol.16 , Issue.1 , pp. 55-76
    • Seneff, S.1
  • 69
    • 0011405405
    • Brightness and loudness as functions of stimulus duration
    • Stevens, J.C., Hall, J.W., 1966. Brightness and loudness as functions of stimulus duration. Perception and Psychophysics 1, 319-327.
    • (1966) Perception and Psychophysics , vol.1 , pp. 319-327
    • Stevens, J.C.1    Hall, J.W.2
  • 70
    • 0002220140
    • Applying phonetic knowledge to lexical access
    • Madrid, Spain
    • Stevens, K.N., 1996. Applying phonetic knowledge to lexical access. In: Proceedings of Eurospeech'95, Madrid, Spain, p. 3.
    • (1996) Proceedings of Eurospeech'95 , pp. 3
    • Stevens, K.N.1
  • 71
    • 85135190755
    • Multi-band and adaptation approaches to robust speech recognition
    • Rhodos, Greece
    • Tibrewala, S., Hermansky, H., 1997. Multi-band and adaptation approaches to robust speech recognition. In: Proceedings of Eurospeech'97, Rhodos, Greece, pp. 2619-2622.
    • (1997) Proceedings of Eurospeech'97 , pp. 2619-2622
    • Tibrewala, S.1    Hermansky, H.2
  • 72
    • 84947590142
    • Data-driven design of RASTA-like filters
    • Rhodos, Greece
    • van Vuuren, S., Hermansky, H., 1997. Data-driven design of RASTA-like filters. In: Proceedings of Eurospeech'97, Rhodos, Greece, pp. 409-412.
    • (1997) Proceedings of Eurospeech'97 , pp. 409-412
    • Van Vuuren, S.1    Hermansky, H.2
  • 74
    • 0029378080
    • Spectral shape analysis in the central auditory system
    • Wang, K., Shamma, S.S., 1995. Spectral shape analysis in the central auditory system. IEEE Trans. Speech Audio Process. 3 (5), 382-394.
    • (1995) IEEE Trans. Speech Audio Process. , vol.3 , Issue.5 , pp. 382-394
    • Wang, K.1    Shamma, S.S.2
  • 75
    • 0030028881
    • Some effects of filtered context on the perception of vowels and fricatives
    • Watkins, A.J., Makin, S.J., 1997. Some effects of filtered context on the perception of vowels and fricatives. J. Acoust. Soc. Amer. 99 (1), 588-594.
    • (1997) J. Acoust. Soc. Amer. , vol.99 , Issue.1 , pp. 588-594
    • Watkins, A.J.1    Makin, S.J.2
  • 77
    • 0039777029
    • Scaling
    • Keidel O., Neff W. (Eds.), Springer, Berlin
    • Zwicker, E., 1975. Scaling. In: Keidel O., Neff W. (Eds.), Handbook of Sensory Physiology, Vol. V.3. Springer, Berlin, pp. 401-448.
    • (1975) Handbook of Sensory Physiology , vol.3 , pp. 401-448
    • Zwicker, E.1


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.