



Volume 18, Issue 3, 1996, Pages 205-231

Towards increasing speech recognition error rates

Author keywords

[No Author keywords available]

Indexed keywords

AUDITION; DECODING; ERROR CORRECTION; FEATURE EXTRACTION; SPEECH PROCESSING;

EID: 0030142722     PISSN: 01676393     EISSN: None     Source Type: Journal    
DOI: 10.1016/0167-6393(96)00003-9     Document Type: Article
Times cited: 78

References (106)
  • 2
    • 0013864705
    • Click-evoked response patterns of single units in the medial geniculate body of the cat
    • L. Aitkin, C. Dunlop and W. Webster (1966), "Click-evoked response patterns of single units in the medial geniculate body of the cat", J. Neurophysiology, Vol. 29, pp. 109-123.
    • (1966) J. Neurophysiology , vol.29 , pp. 109-123
    • Aitkin, L.1    Dunlop, C.2    Webster, W.3
  • 4
    • 0028516073
    • How do humans process and recognize speech?
    • J.B. Allen (1994), "How do humans process and recognize speech?", IEEE Trans. Speech Audio Process., Vol. 2, No. 4, pp. 567-577.
    • (1994) IEEE Trans. Speech Audio Process. , vol.2 , Issue.4 , pp. 567-577
    • Allen, J.B.1
  • 8
    • 0012267743
    • Improving state-of-the-art continuous speech recognition systems using the N-best paradigm with neural networks
    • Morgan Kaufmann, Los Altos, CA
    • S. Austin, G. Zavaliagkos, J. Makhoul and J. Schwartz (1992), "Improving state-of-the-art continuous speech recognition systems using the N-best paradigm with neural networks", Proc. DARPA Speech and Natural Language Workshop, Harriman, NY (Morgan Kaufmann, Los Altos, CA), pp. 180-184.
    • (1992) Proc. DARPA Speech and Natural Language Workshop, Harriman, NY , pp. 180-184
    • Austin, S.1    Zavaliagkos, G.2    Makhoul, J.3    Schwartz, J.4
  • 10
  • 12
    • 0000971250
    • An input output HMM architecture
    • ed. by G. Tesauro, D. Touretzky and T. Leen, MIT Press, Cambridge, MA
    • Y. Bengio and P. Frasconi (1995), "An input output HMM architecture", in Advances in Neural Information Processing Systems, ed. by G. Tesauro, D. Touretzky and T. Leen, Vol. 7 (MIT Press, Cambridge, MA).
    • (1995) Advances in Neural Information Processing Systems , vol.7
    • Bengio, Y.1    Frasconi, P.2
  • 13
    • 30244436756
    • Personal communication
    • A. Bounds (1995), Personal communication.
    • (1995)
    • Bounds, A.1
  • 14
    • 0025547193
    • Links between Markov models and multilayer perceptrons
    • H. Bourlard and C.J. Wellekens (1990), "Links between Markov models and multilayer perceptrons", IEEE Trans. Pattern Anal. Machine Intell., Vol. 12, No. 12, pp. 1167-1178.
    • (1990) IEEE Trans. Pattern Anal. Machine Intell. , vol.12 , Issue.12 , pp. 1167-1178
    • Bourlard, H.1    Wellekens, C.J.2
  • 16
    • 0343975592
    • REMAP: Recursive estimation and maximization of a posteriori probabilities - Application to transition-based connectionist speech recognition
    • Internat. Computer Science Institute, CA
    • H. Bourlard, Y. Konig and N. Morgan (1994), REMAP: Recursive estimation and maximization of a posteriori probabilities - Application to transition-based connectionist speech recognition, ICSI Technical Report TR94-064, Internat. Computer Science Institute, CA.
    • (1994) ICSI Technical Report TR94-064
    • Bourlard, H.1    Konig, Y.2    Morgan, N.3
  • 17
    • 85102488792
    • REMAP: Recursive estimation and maximization of a posteriori probabilities in connectionist speech recognition
    • H. Bourlard, Y. Konig and N. Morgan (1995), "REMAP: recursive estimation and maximization of a posteriori probabilities in connectionist speech recognition", Proc. Eurospeech '95, Madrid, Spain.
    • (1995) Proc. Eurospeech '95, Madrid, Spain
    • Bourlard, H.1    Konig, Y.2    Morgan, N.3
  • 18
    • 0347387977
    • An experimental automatic word recognition system
    • Ruislip, England: Joint Speech Research Unit
    • J.S. Bridle and M.D. Brown (1974), An experimental automatic word recognition system, JSRU Report No. 1003, Ruislip, England: Joint Speech Research Unit.
    • (1974) JSRU Report No. 1003 , vol.1003
    • Bridle, J.S.1    Brown, M.D.2
  • 19
    • 30244499021
    • Personal communication
    • J.S. Bridle (1995), Personal communication.
    • (1995)
    • Bridle, J.S.1
  • 21
    • 0024392496
    • Application of an auditory model to speech recognition
    • J.R. Cohen (1989), "Application of an auditory model to speech recognition", J. Acoust. Soc. Amer., Vol. 85, No. 6, pp. 2623-2629.
    • (1989) J. Acoust. Soc. Amer. , vol.85 , Issue.6 , pp. 2623-2629
    • Cohen, J.R.1
  • 22
    • 30244466999
    • Informal communication
    • J.R. Cohen (1995), Informal communication.
    • (1995)
    • Cohen, J.R.1
  • 24
    • 0021906779
    • Central auditory processing of peripheral vowel spectra
    • L.A. Chistovich (1985), "Central auditory processing of peripheral vowel spectra", J. Acoust. Soc. Amer., Vol. 77, pp. 789-805.
    • (1985) J. Acoust. Soc. Amer. , vol.77 , pp. 789-805
    • Chistovich, L.A.1
  • 25
    • 0019053271
    • Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences
    • S.B. Davis and P. Mermelstein (1980), "Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences", IEEE Trans. Acoust. Speech Signal Process., Vol. 28, No. 4, pp. 357-366.
    • (1980) IEEE Trans. Acoust. Speech Signal Process. , vol.28 , Issue.4 , pp. 357-366
    • Davis, S.B.1    Mermelstein, P.2
  • 26
    • 30244481757
    • Incorporating the time correlation between successive observations in an acoustic-phonetic hidden Markov model for continuous speech recognition
    • P. de La Noue, S. Levinson and M. Sondhi (1989), Incorporating the time correlation between successive observations in an acoustic-phonetic hidden Markov model for continuous speech recognition, AT&T Technical Memorandum No. 11226.
    • (1989) AT&T Technical Memorandum No. 11226
    • De La Noue, P.1    Levinson, S.2    Sondhi, M.3
  • 28
    • 0002629270
    • Maximum likelihood from incomplete data via the EM algorithm
    • A.P. Dempster, N.M. Laird and D.B. Rubin (1977), "Maximum likelihood from incomplete data via the EM algorithm", J. Roy. Statist. Soc., Vol. 39, pp. 1-38.
    • (1977) J. Roy. Statist. Soc. , vol.39 , pp. 1-38
    • Dempster, A.P.1    Laird, N.M.2    Rubin, D.B.3
  • 29
    • 0028516022
    • Speech recognition using hidden Markov models with polynomial regression functions as nonstationary states
    • L. Deng, M. Aksmanovic, X. Sun and C. Wu (1994), "Speech recognition using hidden Markov models with polynomial regression functions as nonstationary states", IEEE Trans. Speech Audio Process., Vol. 2, No. 4, pp. 507-520.
    • (1994) IEEE Trans. Speech Audio Process. , vol.2 , Issue.4 , pp. 507-520
    • Deng, L.1    Aksmanovic, M.2    Sun, X.3    Wu, C.4
  • 32
    • 0003772896
    • Effects of emphasizing transitional or stationary parts of the speech signal in a discrete utterance recognition system
    • K. Elenius and M. Blomberg (1982), "Effects of emphasizing transitional or stationary parts of the speech signal in a discrete utterance recognition system", Proc. IEEE Internat. Conf. Acoust. Speech Signal Process., Paris, France, pp. 535-537.
    • (1982) Proc. IEEE Internat. Conf. Acoust. Speech Signal Process., Paris, France , pp. 535-537
    • Elenius, K.1    Blomberg, M.2
  • 36
    • 0019555090
    • Cepstral analysis technique for automatic speaker verification
    • S. Furui (1981), "Cepstral analysis technique for automatic speaker verification", IEEE Trans. Acoust. Speech Signal Process., Vol. 29, pp. 254-272.
    • (1981) IEEE Trans. Acoust. Speech Signal Process. , vol.29 , pp. 254-272
    • Furui, S.1
  • 37
    • 0022667694
    • Speaker independent isolated word recognizer using dynamic features of speech spectrum
    • S. Furui (1986), "Speaker independent isolated word recognizer using dynamic features of speech spectrum", IEEE Trans. Acoust. Speech Signal Process., Vol. 34, No. 1, pp. 52-59.
    • (1986) IEEE Trans. Acoust. Speech Signal Process. , vol.34 , Issue.1 , pp. 52-59
    • Furui, S.1
  • 38
    • 0027578207
    • Hidden Markov-models with templates as non-stationary states: An application to speech recognition
    • O. Ghitza and M.M. Sondhi (1993), "Hidden Markov-models with templates as non-stationary states: An application to speech recognition", Computer Speech and Language, Vol. 2, pp. 101-119.
    • (1993) Computer Speech and Language , vol.2 , pp. 101-119
    • Ghitza, O.1    Sondhi, M.M.2
  • 40
    • 0001373576
    • Maximum entropy for hypothesis formulation, especially for multidimensional contingency tables
    • I.J. Good (1963), "Maximum entropy for hypothesis formulation, especially for multidimensional contingency tables", Ann. Math. Statist., Vol. 34, pp. 911-934.
    • (1963) Ann. Math. Statist. , vol.34 , pp. 911-934
    • Good, I.J.1
  • 42
    • 28844440746
    • The representation of speech in the auditory periphery
    • S. Greenberg (1988), "The representation of speech in the auditory periphery", J. Phonetics, Vol. 16, pp. 1-151.
    • (1988) J. Phonetics , vol.16 , pp. 1-151
    • Greenberg, S.1
  • 47
    • 0021122763
    • The harmonic magnitude suppression (HMS) technique for intelligibility enhancement in the presence of interfering speech
    • B. Hanson and D. Wong (1984), "The harmonic magnitude suppression (HMS) technique for intelligibility enhancement in the presence of interfering speech", Proc. IEEE Internat. Conf. Acoust. Speech Signal Process., pp. 18.A.5.1-18.A.5.4.
    • (1984) Proc. IEEE Internat. Conf. Acoust. Speech Signal Process.
    • Hanson, B.1    Wong, D.2
  • 48
    • 0023167028
    • An efficient speaker-independent automatic speech recognition by simulation of some properties of human auditory perception
    • H. Hermansky (1987), "An efficient speaker-independent automatic speech recognition by simulation of some properties of human auditory perception", Proc. IEEE Internat. Conf. Acoust. Speech Signal Process., Dallas, TX, pp. 1159-1162.
    • (1987) Proc. IEEE Internat. Conf. Acoust. Speech Signal Process., Dallas, TX , pp. 1159-1162
    • Hermansky, H.1
  • 49
    • 0025041264
    • Perceptual linear predictive (PLP) analysis of speech
    • H. Hermansky (1990), "Perceptual linear predictive (PLP) analysis of speech", J. Acoust. Soc. Amer., Vol. 87, No. 4, pp. 1738-1752.
    • (1990) J. Acoust. Soc. Amer. , vol.87 , Issue.4 , pp. 1738-1752
    • Hermansky, H.1
  • 55
    • 0024905238
    • A comparison of several acoustic representations for speech recognition with degraded and undegraded speech
    • M. Hunt and C. Lefebvre (1989), "A comparison of several acoustic representations for speech recognition with degraded and undegraded speech", Internat. Conf. Acoust. Speech Signal Process., Glasgow, Scotland, pp. 262-265.
    • (1989) Internat. Conf. Acoust. Speech Signal Process., Glasgow, Scotland , pp. 262-265
    • Hunt, M.1    Lefebvre, C.2
  • 58
    • 0016507833
    • Design of a linguistic statistical decoder for the recognition of continuous speech
    • F. Jelinek, L.R. Bahl and R.L. Mercer (1975), "Design of a linguistic statistical decoder for the recognition of continuous speech", IEEE Trans. Information Theory, Vol. IT-21, pp. 250-256.
    • (1975) IEEE Trans. Information Theory , vol.IT-21 , pp. 250-256
    • Jelinek, F.1    Bahl, L.R.2    Mercer, R.L.3
  • 59
    • 0016939124
    • Continuous speech recognition by statistical methods
    • F. Jelinek (1976), "Continuous speech recognition by statistical methods", IEEE Proc., Vol. 64, No. 4, pp. 532-556.
    • (1976) IEEE Proc. , vol.64 , Issue.4 , pp. 532-556
    • Jelinek, F.1
  • 61
    • 0000262562
    • Hierarchical mixtures of experts and the EM algorithm
    • M.I. Jordan and R.A. Jacobs (1994), "Hierarchical mixtures of experts and the EM algorithm", Neural Computation, Vol. 6, pp. 181-214.
    • (1994) Neural Computation , vol.6 , pp. 181-214
    • Jordan, M.I.1    Jacobs, R.A.2
  • 62
    • 0022270364
    • Mixture autoregressive hidden Markov models for speech signals
    • B.H. Juang and L.R. Rabiner (1985), "Mixture autoregressive hidden Markov models for speech signals", IEEE Trans. Acoust. Speech Signal Process., Vol. 33, No. 6, pp. 1404-1413.
    • (1985) IEEE Trans. Acoust. Speech Signal Process. , vol.33 , Issue.6 , pp. 1404-1413
    • Juang, B.H.1    Rabiner, L.R.2
  • 64
    • 0026271562
    • New discriminative training algorithms based on the generalized probabilistic descent method
    • edited by B.H. Juang, S.Y. Kung and C.A. Kamm (Morgan Kaufmann, Los Altos, CA)
    • S. Katagiri, C.H. Lee and B.H. Juang (1991), "New discriminative training algorithms based on the generalized probabilistic descent method", in Proc. IEEE Workshop on Neural Networks for Signal Process., edited by B.H. Juang, S.Y. Kung and C.A. Kamm (Morgan Kaufmann, Los Altos, CA), pp. 299-308.
    • (1991) Proc. IEEE Workshop on Neural Networks for Signal Process. , pp. 299-308
    • Katagiri, S.1    Lee, C.H.2    Juang, B.H.3
  • 65
    • 0001490199
    • Speech processing strategies based on auditory models
    • ed. by R. Carlson and B. Granstrom (Elsevier - Biomedical Press, New York)
    • D.H. Klatt (1982), "Speech processing strategies based on auditory models", in The Representation of Speech in the Peripheral Auditory System, ed. by R. Carlson and B. Granstrom (Elsevier - Biomedical Press, New York), pp. 181-202.
    • (1982) The Representation of Speech in the Peripheral Auditory System , pp. 181-202
    • Klatt, D.H.1
  • 66
    • 0026142334
    • A study on speaker adaptation of the parameters of continuous density hidden Markov models
    • C-H. Lee, C-H. Lin and B-H. Juang (1991), "A study on speaker adaptation of the parameters of continuous density hidden Markov models", IEEE Trans. Signal Process., Vol. 39, No. 4, pp. 806-814.
    • (1991) IEEE Trans. Signal Process. , vol.39 , Issue.4 , pp. 806-814
    • Lee, C.-H.1    Lin, C.-H.2    Juang, B.-H.3
  • 67
    • 0027269171
    • Hidden control neural architecture modeling of nonlinear time varying systems and its applications
    • E. Levin (1993), "Hidden control neural architecture modeling of nonlinear time varying systems and its applications", IEEE Trans. Neural Networks, Vol. 4, No. 1, pp. 109-116.
    • (1993) IEEE Trans. Neural Networks , vol.4 , Issue.1 , pp. 109-116
    • Levin, E.1
  • 68
    • 0018478297
    • Spectral root homomorphic deconvolution system
    • J.S. Lim (1979), "Spectral root homomorphic deconvolution system", IEEE Trans. Acoust. Speech Signal Process., Vol. 27, No. 3, pp. 223-233.
    • (1979) IEEE Trans. Acoust. Speech Signal Process. , vol.27 , Issue.3 , pp. 223-233
    • Lim, J.S.1
  • 69
    • 0020180460
    • Maximum likelihood estimation for multivariate observations of Markov sources
    • L.A. Liporace (1982), "Maximum likelihood estimation for multivariate observations of Markov sources", IEEE Trans. Information Theory, Vol. IT-28, No. 5, pp. 729-734.
    • (1982) IEEE Trans. Information Theory , vol.IT-28 , Issue.5 , pp. 729-734
    • Liporace, L.A.1
  • 72
    • 0038133939
    • Distance measures for speech recognition, psychological and instrumental
    • ed. by R.C.H. Chen (Academic Press, New York)
    • P. Mermelstein (1976), "Distance measures for speech recognition, psychological and instrumental", in Pattern Recognition and Artificial Intelligence, ed. by R.C.H. Chen (Academic Press, New York), pp. 374-388.
    • (1976) Pattern Recognition and Artificial Intelligence , pp. 374-388
    • Mermelstein, P.1
  • 74
    • 0029308753
    • Neural networks for statistical recognition of continuous speech
    • N. Morgan and H. Bourlard (1995), "Neural networks for statistical recognition of continuous speech", Proc. IEEE, Vol. 83, No. 5, pp. 741-770.
    • (1995) Proc. IEEE , vol.83 , Issue.5 , pp. 741-770
    • Morgan, N.1    Bourlard, H.2
  • 80
    • 0007636578
    • Temporal masking in automatic speech recognition
    • M. Pavel and H. Hermansky (1994), "Temporal masking in automatic speech recognition", J. Acoust. Soc. Amer., Vol. 95, No. 5, p. 2876.
    • (1994) J. Acoust. Soc. Amer. , vol.95 , Issue.5 , p. 2876
    • Pavel, M.1    Hermansky, H.2
  • 81
  • 82
    • 0015129120
    • Real-time recognition of spoken words
    • L.C.W. Pols (1971), "Real-time recognition of spoken words", IEEE Trans. Computers, Vol. 20(C), pp. 972-978.
    • (1971) IEEE Trans. Computers , vol.20 , Issue.C , pp. 972-978
    • Pols, L.C.W.1
  • 86
    • 0024610919
    • A tutorial on hidden Markov models and selected applications in speech recognition
    • L.R. Rabiner (1989), "A tutorial on hidden Markov models and selected applications in speech recognition", Proc. IEEE, Vol. 77, No. 2, pp. 257-285.
    • (1989) Proc. IEEE , vol.77 , Issue.2 , pp. 257-285
    • Rabiner, L.R.1
  • 88
    • 0001595997
    • Neural network classifiers estimate Bayesian a posteriori probabilities
    • M.D. Richard and R.P. Lippmann (1991), "Neural network classifiers estimate Bayesian a posteriori probabilities", Neural Computation, Vol. 3, pp. 461-483.
    • (1991) Neural Computation , vol.3 , pp. 461-483
    • Richard, M.D.1    Lippmann, R.P.2
  • 90
    • 0000329355
    • A recurrent error propagation network speech recognition system
    • T. Robinson and F. Fallside (1991), "A recurrent error propagation network speech recognition system", Computer Speech and Language, Vol. 5, pp. 259-274.
    • (1991) Computer Speech and Language , vol.5 , pp. 259-274
    • Robinson, T.1    Fallside, F.2
  • 94
    • 84928837806
    • A joint synchrony/mean-rate model of auditory speech processing
    • S. Seneff (1985), "A joint synchrony/mean-rate model of auditory speech processing", J. Phonetics, Vol. 16, No. 1, pp. 55-76.
    • (1985) J. Phonetics , vol.16 , Issue.1 , pp. 55-76
    • Seneff, S.1
  • 96
    • 0039670390
    • Multilingual assessment of speaker independent large vocabulary speech-recognition systems: The SQALE project (speech recognition quality assessment for language engineering)
    • J.M. Steeneken and D.A. Van Leeuwen (1995), "Multilingual assessment of speaker independent large vocabulary speech-recognition systems: The SQALE project (speech recognition quality assessment for language engineering)", Proc. Eurospeech'95, Madrid, Spain, pp. 1271-1274.
    • (1995) Proc. Eurospeech'95, Madrid, Spain , pp. 1271-1274
    • Steeneken, J.M.1    Van Leeuwen, D.A.2
  • 97
    • 34447546202
    • On the psychophysical law
    • S.S. Stevens (1957), "On the psychophysical law", Psychol. Rev., Vol. 64, No. 1, pp. 153-181.
    • (1957) Psychol. Rev. , vol.64 , Issue.1 , pp. 153-181
    • Stevens, S.S.1
  • 98
    • 0011405405
    • Brightness and loudness as functions of stimulus duration
    • J.C. Stevens and J.W. Hall (1966), "Brightness and loudness as functions of stimulus duration", Perception and Psychophysics, pp. 319-327.
    • (1966) Perception and Psychophysics , pp. 319-327
    • Stevens, J.C.1    Hall, J.W.2
  • 99
    • 0016495712
    • Blind deconvolution through digital signal processing
    • T. Stockham, T. Cannon and R. Ingebretsen (1975), "Blind deconvolution through digital signal processing", Proc. IEEE, Vol. 63, pp. 678-692.
    • (1975) Proc. IEEE , vol.63 , pp. 678-692
    • Stockham, T.1    Cannon, T.2    Ingebretsen, R.3
  • 100
    • 0018195604
    • Memory and time improvements in a dynamic programming algorithm for matching speech patterns
    • C.C. Tappert and S.K. Das (1978), "Memory and time improvements in a dynamic programming algorithm for matching speech patterns", IEEE Trans. Acoust. Speech Signal Process., Vol. 26, pp. 583-586.
    • (1978) IEEE Trans. Acoust. Speech Signal Process. , vol.26 , pp. 583-586
    • Tappert, C.C.1    Das, S.K.2
  • 104
    • 0039777029
    • Scaling
    • ed. by Keidel and Neff (Springer, Berlin)
    • E. Zwicker (1975), "Scaling", in Handbook of Sensory Physiology, ed. by Keidel and Neff (Springer, Berlin, Vol. 3), pp. 401-448.
    • (1975) Handbook of Sensory Physiology , vol.3 , pp. 401-448
    • Zwicker, E.1
  • 105
    • 0022151324
    • The use of speech knowledge in automatic speech recognition
    • V. Zue (1985), "The use of speech knowledge in automatic speech recognition", Proc. IEEE, Vol. 73, No. 11, pp. 1602-1615.
    • (1985) Proc. IEEE , vol.73 , Issue.11 , pp. 1602-1615
    • Zue, V.1


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.