2010, Pages 339-366

An overview of modern speech recognition

Author keywords

[No Author keywords available]

Indexed keywords

COMPUTER AIDED INSTRUCTION; HIDDEN MARKOV MODELS; REVIEWS; SPEECH; SPEECH COMMUNICATION

EID: 85019175281     PISSN: None     EISSN: None     Source Type: Book    
DOI: None     Document Type: Chapter
Times cited: 34

References (67)
  • 3
    • 0040856612
    • Stochastic modeling for automatic speech recognition
    • D. R. Reddy, (ed.), Academic Press, New York
    • Baker, J. (1975). Stochastic modeling for automatic speech recognition, in D. R. Reddy, (ed.), Speech Recognition, Academic Press, New York.
    • (1975) Speech Recognition
    • Baker, J.1
  • 7
    • 0001862769
    • An inequality and associated maximization technique occurring in statistical estimation for probabilistic functions of a Markov process
    • Baum, L. (1972). An inequality and associated maximization technique occurring in statistical estimation for probabilistic functions of a Markov process, Inequalities, III, 1-8.
    • (1972) Inequalities , vol.3 , pp. 1-8
    • Baum, L.1
  • 10
    • 0019053271
    • Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences
    • Davis, S. and P. Mermelstein (1980). Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences, IEEE Transactions on Acoustics, Speech, and Signal Processing, 28(4), 357-366.
    • (1980) IEEE Transactions on Acoustics, Speech, and Signal Processing , vol.28 , Issue.4 , pp. 357-366
    • Davis, S.1    Mermelstein, P.2
  • 12
    • 0027678649
    • A stochastic model of speech incorporating hierarchical nonstationarity
    • Deng, L. (1993). A stochastic model of speech incorporating hierarchical nonstationarity, IEEE Transactions on Speech and Audio Processing, 1(4), 471-475.
    • (1993) IEEE Transactions on Speech and Audio Processing , vol.1 , Issue.4 , pp. 471-475
    • Deng, L.1
  • 13
    • 4243109553
    • Challenges in adopting speech recognition
    • Deng, L. and X. D. Huang (2004). Challenges in adopting speech recognition, Communications of the ACM, 47(1), 11-13.
    • (2004) Communications of the ACM , vol.47 , Issue.1 , pp. 11-13
    • Deng, L.1    Huang, X.D.2
  • 15
    • 0030190520
    • Transitional speech units and their representation by the regressive Markov states: Applications to speech recognition
    • Deng, L. and H. Sameti (1996). Transitional speech units and their representation by the regressive Markov states: Applications to speech recognition, IEEE Transactions on Speech and Audio Processing, 4(4), 301-306.
    • (1996) IEEE Transactions on Speech and Audio Processing , vol.4 , Issue.4 , pp. 301-306
    • Deng, L.1    Sameti, H.2
  • 17
    • 0028234947
    • A statistical approach to automatic speech recognition using the atomic speech units constructed from overlapping articulatory features
    • Deng, L. and D. Sun (1994). A statistical approach to automatic speech recognition using the atomic speech units constructed from overlapping articulatory features, Journal of the Acoustical Society of America, 95(5), 2702-2719.
    • (1994) Journal of the Acoustical Society of America , vol.95 , Issue.5 , pp. 2702-2719
    • Deng, L.1    Sun, D.2
  • 18
    • 0028516022
    • Speech recognition using hidden Markov models with polynomial regression functions as nonstationary states
    • Deng, L., M. Aksmanovic, D. Sun, and J. Wu (1994). Speech recognition using hidden Markov models with polynomial regression functions as nonstationary states, IEEE Transactions on Speech and Audio Processing, 2, 507-520.
    • (1994) IEEE Transactions on Speech and Audio Processing , vol.2 , pp. 507-520
    • Deng, L.1    Aksmanovic, M.2    Sun, D.3    Wu, J.4
  • 19
    • 2442551863
    • Estimating cepstrum of speech under the presence of noise using a joint prior of static and dynamic features
    • Deng, L., J. Droppo, and A. Acero (2004). Estimating cepstrum of speech under the presence of noise using a joint prior of static and dynamic features, IEEE Transactions on Speech and Audio Processing, 12(3), 218-233.
    • (2004) IEEE Transactions on Speech and Audio Processing , vol.12 , Issue.3 , pp. 218-233
    • Deng, L.1    Droppo, J.2    Acero, A.3
  • 21
    • 84901773892
    • Environmental robustness
    • Springer-Verlag, Berlin, Germany
    • Droppo, J. and A. Acero (2008). Environmental robustness, in Handbook of Speech Processing, pp. 653-680, Springer-Verlag, Berlin, Germany.
    • (2008) Handbook of Speech Processing , pp. 653-680
    • Droppo, J.1    Acero, A.2
  • 23
    • 0030638031
    • A post-processing system to yield reduced word error rates: Recognizer output voting error reduction (ROVER)
    • Santa Barbara, CA
    • Fiscus, J. (1997). A post-processing system to yield reduced word error rates: Recognizer output voting error reduction (ROVER), IEEE Automatic Speech Recognition and Understanding Workshop, pp. 3477-3482, Santa Barbara, CA.
    • (1997) IEEE Automatic Speech Recognition and Understanding Workshop , pp. 3477-3482
    • Fiscus, J.1
  • 26
    • 34047246149
    • Maximum entropy direct models for speech recognition
    • Gao, Y. and J. Kuo (2006). Maximum entropy direct models for speech recognition, IEEE Transactions on Speech and Audio Processing, 14(3), 873-881.
    • (2006) IEEE Transactions on Speech and Audio Processing , vol.14 , Issue.3 , pp. 873-881
    • Gao, Y.1    Kuo, J.2
  • 27
    • 85032775863
    • Maximum a posteriori estimation for multivariate Gaussian mixture observations of Markov chains
    • Gauvain, J.-L. and C.-H. Lee (1997). Maximum a posteriori estimation for multivariate Gaussian mixture observations of Markov chains, IEEE Transactions on Speech and Audio Processing, 7, 711-720.
    • (1997) IEEE Transactions on Speech and Audio Processing , vol.7 , pp. 711-720
    • Gauvain, J.-L.1    Lee, C.-H.2
  • 30
    • 85009119467
    • Discriminative speaker adaptation with conditional maximum likelihood linear regression
    • Aalborg, Denmark
    • Gunawardana, A. and W. Byrne (2001). Discriminative speaker adaptation with conditional maximum likelihood linear regression, Proceedings of the EUROSPEECH, Aalborg, Denmark.
    • (2001) Proceedings of the EUROSPEECH
    • Gunawardana, A.1    Byrne, W.2
  • 32
    • 85032750905
    • Discriminative learning in sequential pattern recognition
    • He, X., L. Deng, and C. Wu (2008). Discriminative learning in sequential pattern recognition, IEEE Signal Processing Magazine, 25(5), 14-36.
    • (2008) IEEE Signal Processing Magazine , vol.25 , Issue.5 , pp. 14-36
    • He, X.1    Deng, L.2    Wu, C.3
  • 33
    • 0025041264
    • Perceptual linear predictive analysis of speech
    • Hermansky, H. (1990). Perceptual linear predictive analysis of speech, Journal of the Acoustical Society of America, 87(4), 1738-1752.
    • (1990) Journal of the Acoustical Society of America , vol.87 , Issue.4 , pp. 1738-1752
    • Hermansky, H.1
  • 35
    • 85054456426
    • Leading a start-up in an enterprise: Lessons learned in creating Microsoft response point
    • Huang, X. D. (2009). Leading a start-up in an enterprise: Lessons learned in creating Microsoft response point, IEEE Signal Processing Magazine, 26(2), 135-138.
    • (2009) IEEE Signal Processing Magazine , vol.26 , Issue.2 , pp. 135-138
    • Huang, X.D.1
  • 36
    • 0027578837
    • On speaker-independent, speaker-dependent and speaker adaptive speech recognition
    • Huang, X. D. and K.-F. Lee (1993). On speaker-independent, speaker-dependent and speaker adaptive speech recognition, IEEE Transactions on Speech and Audio Processing, 1(2), 150-157.
    • (1993) IEEE Transactions on Speech and Audio Processing , vol.1 , Issue.2 , pp. 150-157
    • Huang, X.D.1    Lee, K.-F.2
  • 38
    • 0014602879
    • A fast sequential decoding algorithm using a stack
    • Jelinek, F. (1969). A fast sequential decoding algorithm using a stack, IBM Journal of Research and Development, 13, 675-685.
    • (1969) IBM Journal of Research and Development , vol.13 , pp. 675-685
    • Jelinek, F.1
  • 39
    • 0016939124
    • Continuous speech recognition by statistical methods
    • Jelinek, F. (1976). Continuous speech recognition by statistical methods, Proceedings of the IEEE, 64(4), 532-557.
    • (1976) Proceedings of the IEEE , vol.64 , Issue.4 , pp. 532-557
    • Jelinek, F.1
  • 43
    • 0032289099
    • Heteroscedastic analysis and reduced rank HMMs for improved speech recognition
    • Kumar, N. and A. Andreou (1998). Heteroscedastic analysis and reduced rank HMMs for improved speech recognition, Speech Communication, 26, 283-297.
    • (1998) Speech Communication , vol.26 , pp. 283-297
    • Kumar, N.1    Andreou, A.2
  • 46
    • 0029288633
    • Maximum likelihood linear regression for speaker adaptation of continuous density hidden Markov models
    • Leggetter, C. and P. Woodland (1995). Maximum likelihood linear regression for speaker adaptation of continuous density hidden Markov models, Computer Speech and Language, 9, 171-185.
    • (1995) Computer Speech and Language , vol.9 , pp. 171-185
    • Leggetter, C.1    Woodland, P.2
  • 47
    • 0023331258
    • An introduction to computing with neural nets
    • Lippman, R. (1987). An introduction to computing with neural nets, IEEE ASSP Magazine, 4(2), 4-22.
    • (1987) IEEE ASSP Magazine , vol.4 , Issue.2 , pp. 4-22
    • Lippman, R.1
  • 48
    • 33745208000
    • Investigations on error minimizing training criteria for discriminative training in automatic speech recognition
    • Lisbon, Portugal
    • Macherey, M., L. Haferkamp, R. Schlüter, and H. Ney (2005). Investigations on error minimizing training criteria for discriminative training in automatic speech recognition, in Proceedings of Interspeech, pp. 2133-2136, Lisbon, Portugal.
    • (2005) Proceedings of Interspeech , pp. 2133-2136
    • Macherey, M.1    Haferkamp, L.2    Schlüter, R.3    Ney, H.4
  • 50
    • 0021406359
    • The use of a one-stage dynamic programming algorithm for connected word recognition
    • Ney, H. (1984). The use of a one-stage dynamic programming algorithm for connected word recognition, IEEE Transactions on ASSP, 32, 263-271.
    • (1984) IEEE Transactions on ASSP , vol.32 , pp. 263-271
    • Ney, H.1
  • 57
    • 0005670423
    • A dynamic programming approach to continuous speech recognition
    • Budapest, Hungary
    • Sakoe, S. and S. Chiba (1971). A dynamic programming approach to continuous speech recognition, in Proceedings of the 7th International Congress on Acoustics, Vol. 3, pp. 65-69, Budapest, Hungary.
    • (1971) Proceedings of the 7th International Congress on Acoustics , vol.3 , pp. 65-69
    • Sakoe, S.1    Chiba, S.2
  • 58
    • 0010727514
    • Speech discrimination by dynamic programming
    • Vintsyuk, T. (1968). Speech discrimination by dynamic programming, Kibernetika, 4(2), 81-88.
    • (1968) Kibernetika , vol.4 , Issue.2 , pp. 81-88
    • Vintsyuk, T.1
  • 59
    • 84935113569
    • Error bounds for convolutional codes and an asymptotically optimum decoding algorithm
    • Viterbi, A. (1967). Error bounds for convolutional codes and an asymptotically optimum decoding algorithm, IEEE Transactions on Information Theory, IT-13(2), 260-269.
    • (1967) IEEE Transactions on Information Theory , vol.IT-13 , Issue.2 , pp. 260-269
    • Viterbi, A.1
  • 61
    • 0036165806
    • An overlapping-feature based phonological model incorporating linguistic constraints: Applications to speech recognition
    • Sun, J. and L. Deng (2002). An overlapping-feature based phonological model incorporating linguistic constraints: Applications to speech recognition, Journal of the Acoustical Society of America, 111(2), 1086-1101.
    • (2002) Journal of the Acoustical Society of America , vol.111 , Issue.2 , pp. 1086-1101
    • Sun, J.1    Deng, L.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.