Volume 15, Issue 1, 2007, Pages 246-256

Speech recognition using linear dynamic models

Author keywords

Automatic speech recognition (ASR); Linear dynamic models (LDMs); Stack decoding

Indexed keywords

ASYNCHRONOUS DECODING; AUTOMATIC SPEECH RECOGNITION (ASR); AUTOMATIC SPEECH RECOGNITION SYSTEMS; COVARIANCE MATRICES; DERIVATIVE INFORMATIONS; DYNAMIC STATE; FEATURE VECTORS; FIRST ORDERS; GAUSSIAN MIXTURES; LINEAR DYNAMIC MODELS (LDMS); LINEAR STATE-SPACE MODELS; OUTPUT DISTRIBUTIONS; PHONE RECOGNITION; SEGMENT MODELS; SPATIAL CORRELATIONS; STACK DECODING; STATIC MODELS; UNDERLYING DYNAMICS;

EID: 34547549792     PISSN: 15587916     EISSN: None     Source Type: Journal    
DOI: 10.1109/TASL.2006.876766     Document Type: Article
Times cited: 26

References (28)
  • 1
    • 33846687633
    • Linear Dynamic Models for Automatic Speech Recognition
    • Ph.D. dissertation, The Centre for Speech Technology Research, Univ. of Edinburgh, Edinburgh, U.K
    • J. Frankel, "Linear Dynamic Models for Automatic Speech Recognition," Ph.D. dissertation, The Centre for Speech Technology Research, Univ. of Edinburgh, Edinburgh, U.K., 2003.
    • (2003)
    • Frankel, J.1
  • 2
    • 0003938589
    • Segment-Based Stochastic Models of Spectral Dynamics for Continuous Speech Recognition
    • Ph.D. dissertation, Boston Univ. Graduate School, Boston, MA
    • V. Digalakis, "Segment-Based Stochastic Models of Spectral Dynamics for Continuous Speech Recognition," Ph.D. dissertation, Boston Univ. Graduate School, Boston, MA, 1992.
    • (1992)
    • Digalakis, V.1
  • 3
    • 0033556862
    • A unifying review of linear Gaussian models
    • S. Roweis and Z. Ghahramani, "A unifying review of linear Gaussian models," Neural Comput., vol. 11, no. 2, 1999.
    • (1999) Neural Comput , vol.11 , Issue.2
    • Roweis, S.1    Ghahramani, Z.2
  • 4
    • 64149125858
    • Generalised Linear Gaussian Models
    • Cambridge Univ. Engineering, Cambridge, U.K., Tech. Rep. CUED/F-INFENG/TR.420
    • A. Rosti and M. Gales, "Generalised Linear Gaussian Models," Cambridge Univ. Engineering, Cambridge, U.K., Tech. Rep. CUED/F-INFENG/TR.420, 2001.
    • (2001)
    • Rosti, A.1    Gales, M.2
  • 5
    • 85024429815
    • A new approach to linear filtering and prediction problems
    • Mar
    • R. Kalman, "A new approach to linear filtering and prediction problems," J. Basic Eng., vol. 82, pp. 35-44, Mar. 1960.
    • (1960) J. Basic Eng , vol.82 , pp. 35-44
    • Kalman, R.1
  • 6
    • 84937741903
    • Solutions to the linear smoothing problem
    • H. E. Rauch, "Solutions to the linear smoothing problem," IEEE Trans. Automat. Contr., vol. 8, pp. 371-372, 1963.
    • (1963) IEEE Trans. Automat. Contr , vol.8 , pp. 371-372
    • Rauch, H.E.1
  • 7
    • 0027681974
    • ML estimation of a stochastic linear system with the EM algorithm and its application to speech recognition
    • Oct
    • V. Digalakis, J. Rohlicek, and M. Ostendorf, "ML estimation of a stochastic linear system with the EM algorithm and its application to speech recognition," IEEE Trans. Speech Audio Process., vol. 1, no. 4, pp. 431-442, Oct. 1993.
    • (1993) IEEE Trans. Speech Audio Process , vol.1 , Issue.4 , pp. 431-442
    • Digalakis, V.1    Rohlicek, J.2    Ostendorf, M.3
  • 8
    • 0027261926
    • Speech recognition using dynamical model of speech production
    • Minneapolis, MN
    • K. Iso, "Speech recognition using dynamical model of speech production," in Proc. Int. Conf. Acoustics, Speech, and Signal Processing, Minneapolis, MN, 1993, vol. 2, pp. 283-286.
    • (1993) Proc. Int. Conf. Acoustics, Speech, and Signal Processing , vol.2 , pp. 283-286
    • Iso, K.1
  • 11
    • 0001523807
    • A path-stack algorithm for optimizing dynamic regimes in a statistical hidden dynamic model of speech
    • J. Ma and L. Deng, "A path-stack algorithm for optimizing dynamic regimes in a statistical hidden dynamic model of speech," Comput. Speech and Lang., vol. 14, no. 2, pp. 101-114, 2000.
    • (2000) Comput. Speech and Lang , vol.14 , Issue.2 , pp. 101-114
    • Ma, J.1    Deng, L.2
  • 12
    • 0033623527
    • Spontaneous speech recognition using a statistical coarticulatory model for the vocal-tract-resonance dynamics
    • Dec
    • L. Deng and J. Ma, "Spontaneous speech recognition using a statistical coarticulatory model for the vocal-tract-resonance dynamics," J. Acoust. Soc. Amer., vol. 108, no. 6, pp. 3036-3048, Dec. 2000.
    • (2000) J. Acoust. Soc. Amer , vol.108 , Issue.6 , pp. 3036-3048
    • Deng, L.1    Ma, J.2
  • 13
    • 0742307392
    • Target-directed mixture linear dynamic models for spontaneous speech recognition
    • Jan
    • J. Ma and L. Deng, "Target-directed mixture linear dynamic models for spontaneous speech recognition," IEEE Trans. Speech Audio Process., vol. 12, no. 1, pp. 47-58, Jan. 2004.
    • (2004) IEEE Trans. Speech Audio Process , vol.12 , Issue.1 , pp. 47-58
    • Ma, J.1    Deng, L.2
  • 14
    • 0347761233
    • A mixed-level switching dynamic system for continuous speech recognition
    • J. Ma and L. Deng, "A mixed-level switching dynamic system for continuous speech recognition," Comput. Speech Lang., vol. 18, pp. 49-65, 2004.
    • (2004) Comput. Speech Lang , vol.18 , pp. 49-65
    • Ma, J.1    Deng, L.2
  • 15
    • 0141702226
    • Coarticulation modeling by embedding a target-directed hidden trajectory model into HMM-MAP decoding and evaluation
    • Hong Kong, China
    • F. Seide, J. Zhou, and L. Deng, "Coarticulation modeling by embedding a target-directed hidden trajectory model into HMM-MAP decoding and evaluation," in Proc. Int. Conf. Acoustics, Speech, and Signal Processing, Hong Kong, China, 2003, vol. 1, pp. 748-751.
    • (2003) Proc. Int. Conf. Acoustics, Speech, and Signal Processing , vol.1 , pp. 748-751
    • Seide, F.1    Zhou, J.2    Deng, L.3
  • 16
    • 15844394960
    • Linear Gaussian Models for Speech Recognition
    • Ph.D. dissertation, Engineering Department, Cambridge Univ, Cambridge, U.K
    • A.-V. I. Rosti, "Linear Gaussian Models for Speech Recognition," Ph.D. dissertation, Engineering Department, Cambridge Univ., Cambridge, U.K., 2004.
    • (2004)
    • Rosti, A.-V.I.1
  • 17
    • 0002583871
    • Speech database development: Design and analysis of the acoustic-phonetic corpus
    • Palo Alto, CA, Feb
    • L. Lamel, R. Kassel, and S. Seneff, "Speech database development: design and analysis of the acoustic-phonetic corpus," in Proc. Speech Recognition Workshop, Palo Alto, CA, Feb. 1986, pp. 100-109.
    • (1986) Proc. Speech Recognition Workshop , pp. 100-109
    • Lamel, L.1    Kassel, R.2    Seneff, S.3
  • 18
    • 0024768209
    • Speaker-independent phone recognition using hidden Markov models
    • Nov
    • K. Lee and H. Hon, "Speaker-independent phone recognition using hidden Markov models," IEEE Trans. Acoust., Speech, Signal Process., vol. 37, no. 11, pp. 1641-1648, Nov. 1989.
    • (1989) IEEE Trans. Acoust., Speech, Signal Process , vol.37 , Issue.11 , pp. 1641-1648
    • Lee, K.1    Hon, H.2
  • 20
    • 84935113569
    • A. Viterbi, "Error bounds for convolutional codes and an asymptotically optimum decoding algorithm," IEEE Trans. Inform. Theory, vol. 13, pp. 260-269, 1967.
  • 21
    • 64149088172
    • S. Young, G. Evermann, D. Kershaw, G. Moore, J. Odell, D. Ollason, D. Povey, V. Valtchev, and P. Woodland, The HTK Book (for HTK Version 3.2). Cambridge, U.K.: Cambridge Univ., 2002.
  • 23
    • 85017287102
    • An efficient A* stack decoder algorithm for continuous speech recognition with a stochastic language model
    • San Francisco, CA
    • D. Paul, "An efficient A* stack decoder algorithm for continuous speech recognition with a stochastic language model," in Proc. ICASSP, San Francisco, CA, 1992, vol. 1, pp. 25-28.
    • (1992) Proc. ICASSP , vol.1 , pp. 25-28
    • Paul, D.1
  • 25
    • 64149121404
    • S. Renals and M. Hochberg, "Decoder Technology for Connectionist Large Vocabulary Speech Recognition," Dept. Comput. Sci., Univ. Sheffield, Sheffield, U.K., Tech. Rep. +CS-95-17, 1995.
  • 26
    • 0003712010
    • A General Method for Approximating Nonlinear Transformations of Probability Distributions
    • Dept. Eng. Sci., Univ. Oxford, Oxford, U.K., Tech. Rep.
    • S. Julier and J. Uhlmann, "A General Method for Approximating Nonlinear Transformations of Probability Distributions," Dept. Eng. Sci., Univ. Oxford, Oxford, U.K., Tech. Rep., 1996.
    • (1996)
    • Julier, S.1    Uhlmann, J.2
  • 27
    • 0030263447
    • Mean and variance adaptation within the MLLR framework
    • M. Gales and P. Woodland, "Mean and variance adaptation within the MLLR framework," Comput., Speech and Lang., vol. 10, pp. 249-264, 1996.
    • (1996) Comput., Speech and Lang , vol.10 , pp. 249-264
    • Gales, M.1    Woodland, P.2
  • 28
    • 0033677172
    • Factored sparse inverse covariance matrices
    • J. Bilmes, "Factored sparse inverse covariance matrices," in Proc. ICASSP 2000.
    • Proc. ICASSP 2000
    • Bilmes, J.1


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.