Volume 11, Issue 6, 2003, Pages 558-567

Bayesian Learning of Speech Duration Models

Author keywords

Adaptive duration model; Conjugate prior; Gamma distribution; Quasi Bayes estimate; Sequential learning; Speaking rate; Speech recognition

Indexed keywords

ACOUSTIC NOISE; ALGORITHMS; BROADCASTING; MARKOV PROCESSES; PROBABILITY; SPEECH ANALYSIS;

EID: 0347968278     PISSN: 10636676     EISSN: None     Source Type: Journal    
DOI: 10.1109/TSA.2003.818114     Document Type: Article
Times cited: 26

References (36)
  • 3
    • 0030149810
    • Robust parametric modeling of durations in hidden Markov models
    • May
    • D. Burshtein, "Robust parametric modeling of durations in hidden Markov models," IEEE Trans. Speech Audio Processing, vol. 4, pp. 240-242, May 1996.
    • (1996) IEEE Trans. Speech Audio Processing , vol.4 , pp. 240-242
    • Burshtein, D.1
  • 4
    • 0001365754
    • Online hierarchical transformation of hidden Markov models for speech recognition
    • J.-T. Chien, "Online hierarchical transformation of hidden Markov models for speech recognition," IEEE Trans. Speech Audio Processing, vol. 7, no. 6, pp. 656-667, 1999.
    • (1999) IEEE Trans. Speech Audio Processing , vol.7 , Issue.6 , pp. 656-667
    • Chien, J.-T.1
  • 5
    • 0035340701
    • Transformation-based Bayesian predictive classification using online prior evolution
    • J.-T. Chien and G.-H. Liao, "Transformation-based Bayesian predictive classification using online prior evolution," IEEE Trans. Speech Audio Processing, vol. 9, no. 4, pp. 399-410, 2001.
    • (2001) IEEE Trans. Speech Audio Processing , vol.9 , Issue.4 , pp. 399-410
    • Chien, J.-T.1    Liao, G.-H.2
  • 7
    • 0002629270
    • Maximum likelihood from incomplete data via the EM algorithm
    • A. P. Dempster, N. M. Laird, and D. B. Rubin, "Maximum likelihood from incomplete data via the EM algorithm," J. R. Statist. Soc. B, vol. 39, pp. 1-38, 1977.
    • (1977) J. R. Statist. Soc. B , vol.39 , pp. 1-38
    • Dempster, A.P.1    Laird, N.M.2    Rubin, D.B.3
  • 8
    • 85009274797
    • One use of duration modeling for continuous digits speech recognition
    • R. Dong and J. Zhu, "One use of duration modeling for continuous digits speech recognition," in Proc. Int. Conf. Spoken Language Processing (ICSLP), 2002, pp. 385-388.
    • (2002) Proc. Int. Conf. Spoken Language Processing (ICSLP) , pp. 385-388
    • Dong, R.1    Zhu, J.2
  • 9
    • 85009154899
    • Analysis of N-best output hypotheses for fast speech in large vocabulary continuous speech recognition
    • T. Fabian, T. Pfau, and G. Ruske, "Analysis of N-best output hypotheses for fast speech in large vocabulary continuous speech recognition," in Proc. EUROSPEECH, vol. 4, 2001, pp. 2535-2538.
    • (2001) Proc. EUROSPEECH , vol.4 , pp. 2535-2538
    • Fabian, T.1    Pfau, T.2    Ruske, G.3
  • 10
  • 12
    • 0028419019
    • Maximum a posteriori estimation for multivariate Gaussian mixture observations of Markov chains
    • J. L. Gauvain and C.-H. Lee, "Maximum a posteriori estimation for multivariate Gaussian mixture observations of Markov chains," IEEE Trans. Speech Audio Processing, vol. 2, pp. 291-298, 1994.
    • (1994) IEEE Trans. Speech Audio Processing , vol.2 , pp. 291-298
    • Gauvain, J.L.1    Lee, C.-H.2
  • 13
    • 0026203445
    • Isolated-utterance speech recognition using hidden Markov models with bounded state durations
    • Aug.
    • H. Gu, C. Tseng, and L. Lee, "Isolated-utterance speech recognition using hidden Markov models with bounded state durations," IEEE Trans. Signal Processing, vol. 39, pp. 1743-1752, Aug. 1991.
    • (1991) IEEE Trans. Signal Processing , vol.39 , pp. 1743-1752
    • Gu, H.1    Tseng, C.2    Lee, L.3
  • 14
    • 0347523163
    • HMM retraining based on state duration alignment for noisy speech recognition
    • W.-W. Hung and H.-C. Wang, "HMM retraining based on state duration alignment for noisy speech recognition," in Proc. EUROSPEECH, vol. 3, 1997, pp. 1519-1522.
    • (1997) Proc. EUROSPEECH , vol.3 , pp. 1519-1522
    • Hung, W.-W.1    Wang, H.-C.2
  • 15
    • 0031103160
    • On-line adaptive learning of the continuous density hidden Markov model based on approximate recursive Bayes estimate
    • Mar.
    • Q. Huo and C.-H. Lee, "On-line adaptive learning of the continuous density hidden Markov model based on approximate recursive Bayes estimate," IEEE Trans. Speech Audio Processing, vol. 5, pp. 161-172, Mar. 1997.
    • (1997) IEEE Trans. Speech Audio Processing , vol.5 , pp. 161-172
    • Huo, Q.1    Lee, C.-H.2
  • 17
    • 85095028711
    • Acoustic and perceptual properties of phonemes in continuous speech as a function of speaking rate
    • H. Kuwabara, "Acoustic and perceptual properties of phonemes in continuous speech as a function of speaking rate," in Proc. EUROSPEECH, 1997, pp. 1003-1006.
    • (1997) Proc. EUROSPEECH , pp. 1003-1006
    • Kuwabara, H.1
  • 19
    • 0022685753
    • Continuously variable duration hidden Markov models for automatic speech recognition
    • S. E. Levinson, "Continuously variable duration hidden Markov models for automatic speech recognition," Comput. Speech Lang., vol. 1, pp. 29-45, 1986.
    • (1986) Comput. Speech Lang. , vol.1 , pp. 29-45
    • Levinson, S.E.1
  • 21
    • 85135173867
    • Speech recognition using on-line estimation of speaking rate
    • N. Morgan, E. Fosler, and N. Mirghafori, "Speech recognition using on-line estimation of speaking rate," in Proc. EUROSPEECH, 1997, pp. 2079-2082.
    • (1997) Proc. EUROSPEECH , pp. 2079-2082
    • Morgan, N.1    Fosler, E.2    Mirghafori, N.3
  • 24
    • 0022097713
    • Recognition of isolated digits using hidden Markov models with continuous mixture densities
    • L. R. Rabiner, B.-H. Juang, S. E. Levinson, and M. M. Sondhi, "Recognition of isolated digits using hidden Markov models with continuous mixture densities," AT&T Tech. J., vol. 64, no. 6, pp. 1211-1234, 1985.
    • (1985) AT&T Tech. J. , vol.64 , Issue.6 , pp. 1211-1234
    • Rabiner, L.R.1    Juang, B.-H.2    Levinson, S.E.3    Sondhi, M.M.4
  • 25
    • 33646820149
    • Improvement on speech recognition for fast talkers
    • M. Richardson, M. Hwang, A. Acero, and X. D. Huang, "Improvement on speech recognition for fast talkers," in Proc. EUROSPEECH, vol. 1, 1999, pp. 411-414.
    • (1999) Proc. EUROSPEECH , vol.1 , pp. 411-414
    • Richardson, M.1    Hwang, M.2    Acero, A.3    Huang, X.D.4
  • 27
    • 0348153023
    • Modeling the rate of speech by Markov processes on curves
    • L. Saul and M. Rahim, "Modeling the rate of speech by Markov processes on curves," in Proc. EUROSPEECH, vol. 1, 1999, pp. 415-418.
    • (1999) Proc. EUROSPEECH , vol.1 , pp. 415-418
    • Saul, L.1    Rahim, M.2
  • 30
    • 0019622592
    • On articulatory rate and perceptual constancy in phonetic perception
    • Q. Summerfield, "On articulatory rate and perceptual constancy in phonetic perception," J. Exper. Psychol. Hum. Perform., vol. 7, pp. 1074-1095, 1981.
    • (1981) J. Exper. Psychol. Hum. Perform. , vol.7 , pp. 1074-1095
    • Summerfield, Q.1
  • 31
    • 0038533324
    • Modeling speaking rate using a between frame distance metric
    • A. Tuerk and S. Young, "Modeling speaking rate using a between frame distance metric," in Proc. EUROSPEECH, vol. 1, 1999, pp. 419-422.
    • (1999) Proc. EUROSPEECH , vol.1 , pp. 419-422
    • Tuerk, A.1    Young, S.2
  • 35
    • 0035249864
    • On including temporal constraints in Viterbi alignment for speech recognition in noise
    • Feb.
    • N. B. Yoma, F. R. McInnes, M. A. Jack, S. D. Stump, and L. L. Ling, "On including temporal constraints in Viterbi alignment for speech recognition in noise," IEEE Trans. Speech Audio Processing, vol. 9, pp. 179-182, Feb. 2001.
    • (2001) IEEE Trans. Speech Audio Processing , vol.9 , pp. 179-182
    • Yoma, N.B.1    McInnes, F.R.2    Jack, M.A.3    Stump, S.D.4    Ling, L.L.5
  • 36
    • 0036815682
    • MAP speaker adaptation of state duration distributions for speech recognition
    • Oct.
    • N. B. Yoma and J. S. Sanchez, "MAP speaker adaptation of state duration distributions for speech recognition," IEEE Trans. Speech Audio Processing, vol. 10, pp. 443-450, Oct. 2002.
    • (2002) IEEE Trans. Speech Audio Processing , vol.10 , pp. 443-450
    • Yoma, N.B.1    Sanchez, J.S.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.