SCOPUS Information Retrieval Platform

IEEE Transactions on Speech and Audio Processing

Volume 11, Issue 6, 2003, Pages 558-567

Bayesian Learning of Speech Duration Models

(2) Chien, Jen Tzung (a); Huang, Chih Hsien (a)

(a) National Cheng Kung University (Taiwan)

Author keywords

Adaptive duration model; Conjugate prior; Gamma distribution; Quasi Bayes estimate; Sequential learning; Speaking rate; Speech recognition

Indexed keywords

ACOUSTIC NOISE; ALGORITHMS; BROADCASTING; MARKOV PROCESSES; PROBABILITY; SPEECH ANALYSIS;

BAYESIAN LEARNING; GAMMA DISTRIBUTION; SEQUENTIAL LEARNING;

SPEECH RECOGNITION;

EID: 0347968278 PISSN: 10636676 EISSN: None Source Type: Journal
DOI: 10.1109/TSA.2003.818114 Document Type: Article

Times cited: (26)

References (36)

1
- 0004245694
- New York: Dover
- M. Abramowitz and I. A. Stegun, Handbook of Mathematical Functions, New York: Dover, 1965.
- (1965) Handbook of Mathematical Functions
- Abramowitz, M.¹ Stegun, I.A.²

2
- 0028996980
- Duration modeling in large vocabulary speech recognition
- A. Anastasakos, R. Schwartz, and H. Shu, "Duration modeling in large vocabulary speech recognition," Proc. IEEE Int. Conf. Acoustics, Speech, Signal Processing (ICASSP), pp. 628-631, 1995.
- (1995) Proc. IEEE Int. Conf. Acoustics, Speech, Signal Processing (ICASSP) , pp. 628-631
- Anastasakos, A.¹ Schwartz, R.² Shu, H.³

3
- 0030149810
- Robust parametric modeling of durations in hidden Markov models
- May
- D. Burshtein, "Robust parametric modeling of durations in hidden Markov models," IEEE Trans. Speech Audio Processing, vol. 4, pp. 240-242, May 1996.
- (1996) IEEE Trans. Speech Audio Processing , vol.4 , pp. 240-242
- Burshtein, D.¹

4
- 0001365754
- Online hierarchical transformation of hidden Markov models for speech recognition
- J.-T. Chien, "Online hierarchical transformation of hidden Markov models for speech recognition," IEEE Trans. Speech Audio Processing, vol. 7, no. 6, pp. 656-667, 1999.
- (1999) IEEE Trans. Speech Audio Processing , vol.7 , Issue.6 , pp. 656-667
- Chien, J.-T.¹

5
- 0035340701
- Transformation-based Bayesian predictive classification using online prior evolution
- J.-T. Chien and G.-H. Liao, "Transformation-based Bayesian predictive classification using online prior evolution," IEEE Trans. Speech Audio Processing, vol. 9, no. 4, pp. 399-410, 2001.
- (2001) IEEE Trans. Speech Audio Processing , vol.9 , Issue.4 , pp. 399-410
- Chien, J.-T.¹ Liao, G.-H.²

6
- 0003759417
- New York: McGraw-Hill
- M. H. DeGroot, Optimal Statistical Decisions. New York: McGraw-Hill, 1970.
- (1970) Optimal Statistical Decisions
- Degroot, M.H.¹

7
- 0002629270
- Maximum likelihood from incomplete data via the EM algorithm
- A. P. Dempster, N. M. Laird, and D. B. Rubin, "Maximum likelihood from incomplete data via the EM algorithm," J. R. Statist. Soc. B, vol. 39, pp. 1-38, 1977.
- (1977) J. R. Statist. Soc. B , vol.39 , pp. 1-38
- Dempster, A.P.¹ Laird, N.M.² Rubin, D.B.³

8
- 85009274797
- One use of duration modeling for continuous digits speech recognition
- R. Dong and J. Zhu, "One use of duration modeling for continuous digits speech recognition," in Proc. Int. Conf. Spoken Language Processing (ICSLP), 2002, pp. 385-388.
- (2002) Proc. Int. Conf. Spoken Language Processing (ICSLP) , pp. 385-388
- Dong, R.¹ Zhu, J.²

9
- 85009154899
- Analysis of N-best output hypotheses for fast speech in large vocabulary continuous speech recognition
- T. Fabian, T. Pfau, and G. Ruske, "Analysis of N-best output hypotheses for fast speech in large vocabulary continuous speech recognition," in Proc. EUROSPEECH, vol. 4, 2001, pp. 2535-2538.
- (2001) Proc. EUROSPEECH , vol.4 , pp. 2535-2538
- Fabian, T.¹ Pfau, T.² Ruske, G.³

10
- 85009278870
- Toward the question: Why has speaking rate such an impact on speech recognition performance
- R. Faltlhauser, G. Ruske, and M. Thomae, "Toward the question: Why has speaking rate such an impact on speech recognition performance," in Proc. Int. Conf. Spoken Language Processing (ICSLP), 2002, pp. 2429-2432.
- (2002) Proc. Int. Conf. Spoken Language Processing (ICSLP) , pp. 2429-2432
- Faltlhauser, R.¹ Ruske, G.² Thomae, M.³

11
- 0002585974
- Variable duration models for speech
- J. D. Ferguson, "Variable duration models for speech," in Proc. Symp. Application of Hidden Markov Models to Text and Speech, 1980, pp. 143-179.
- (1980) Proc. Symp. Application of Hidden Markov Models to Text and Speech , pp. 143-179
- Ferguson, J.D.¹

12
- 0028419019
- Maximum a posteriori estimation for multivariate Gaussian mixture observations of Markov chains
- J. L. Gauvain and C.-H. Lee, "Maximum a posteriori estimation for multivariate Gaussian mixture observations of Markov chains," IEEE Trans. Speech Audio Processing, vol. 2, pp. 291-298, 1994.
- (1994) IEEE Trans. Speech Audio Processing , vol.2 , pp. 291-298
- Gauvain, J.L.¹ Lee, C.-H.²

13
- 0026203445
- Isolated-utterance speech recognition using hidden Markov models with bounded state durations
- Aug.
- H. Gu, C. Tseng, and L. Lee, "Isolated-utterance speech recognition using hidden Markov models with bounded state durations," IEEE Trans. Signal Processing, vol. 39, pp. 1743-1752, Aug. 1991.
- (1991) IEEE Trans. Signal Processing , vol.39 , pp. 1743-1752
- Gu, H.¹ Tseng, C.² Lee, L.³

14
- 0347523163
- HMM retraining based on state duration alignment for noisy speech recognition
- W.-W. Hung and H.-C. Wang, "HMM retraining based on state duration alignment for noisy speech recognition," in Proc. EUROSPEECH, vol. 3, 1997, pp. 1519-1522.
- (1997) Proc. EUROSPEECH , vol.3 , pp. 1519-1522
- Hung, W.-W.¹ Wang, H.-C.²

15
- 0031103160
- On-line adaptive learning of the continuous density hidden Markov model based on approximate recursive Bayes estimate
- Mar.
- Q. Huo and C.-H. Lee, "On-line adaptive learning of the continuous density hidden Markov model based on approximate recursive Bayes estimate," IEEE Trans. Speech Audio Processing, vol. 5, pp. 161-172, Mar. 1997.
- (1997) IEEE Trans. Speech Audio Processing , vol.5 , pp. 161-172
- Huo, Q.¹ Lee, C.-H.²

16
- 0028996845
- Reducing word error rate on conversational speech from the switchboard corpus
- P. Jeanrenaud, E. Eide, U. Chaudhari, J. McDonough, K. Ng, M. Siu, and H. Gish, "Reducing word error rate on conversational speech from the switchboard corpus," IEEE Proc. Int. Conf. Acoustics, Speech, Signal Processing (ICASSP), pp. 53-56, 1995.
- (1995) IEEE Proc. Int. Conf. Acoustics, Speech, Signal Processing (ICASSP) , pp. 53-56
- Jeanrenaud, P.¹ Eide, E.² Chaudhari, U.³ McDonough, J.⁴ Ng, K.⁵ Siu, M.⁶ Gish, H.⁷

17
- 85095028711
- Acoustic and perceptual properties of phonemes in continuous speech as a function of speaking rate
- H. Kuwabara, "Acoustic and perceptual properties of phonemes in continuous speech as a function of speaking rate," in Proc. EUROSPEECH, 1997, pp. 1003-1006.
- (1997) Proc. EUROSPEECH , pp. 1003-1006
- Kuwabara, H.¹

18
- 0036296951
- Analysis of syllable duration models for Mandarin speech
- W.-H. Lai and S.-H. Chen, "Analysis of syllable duration models for Mandarin speech," in Proc. IEEE Int. Conf. Acoustics, Speech, Signal Processing (ICASSP), vol. 1, 2002, pp. 497-500.
- (2002) Proc. IEEE Int. Conf. Acoustics, Speech, Signal Processing (ICASSP) , vol.1 , pp. 497-500
- Lai, W.-H.¹ Chen, S.-H.²

19
- 0022685753
- Continuously variable duration hidden Markov models for automatic speech recognition
- S. E. Levinson, "Continuously variable duration hidden Markov models for automatic speech recognition," Comput. Speech Lang., vol. 1, pp. 29-45, 1986.
- (1986) Comput. Speech Lang. , vol.1 , pp. 29-45
- Levinson, S.E.¹

20
- 0029748337
- Toward robustness to fast speech in ASR
- N. Mirghafori, E. Fosler, and N. Morgan, "Toward robustness to fast speech in ASR," Proc. IEEE Int. Conf. Acoustics, Speech, Signal Processing (ICASSP), pp. 335-338, 1996.
- (1996) Proc. IEEE Int. Conf. Acoustics, Speech, Signal Processing (ICASSP) , pp. 335-338
- Mirghafori, N.¹ Fosler, E.² Morgan, N.³

21
- 85135173867
- Speech recognition using on-line estimation of speaking rate
- N. Morgan, E. Fosler, and N. Mirghafori, "Speech recognition using on-line estimation of speaking rate," in Proc. EUROSPEECH, 1997, pp. 2079-2082.
- (1997) Proc. EUROSPEECH , pp. 2079-2082
- Morgan, N.¹ Fosler, E.² Mirghafori, N.³

22
- 0030355725
- Duration modeling for improved connected digit recognition
- K. Power, "Duration modeling for improved connected digit recognition," in Proc. Int. Conf. Spoken Language Processing (ICSLP), 1996, pp. 885-888.
- (1996) Proc. Int. Conf. Spoken Language Processing (ICSLP) , pp. 885-888
- Power, K.¹

23
- 0004244302
- Englewood Cliffs, NJ: Prentice-Hall
- L. R. Rabiner and B.-H. Juang, Fundamentals of Speech Recognition. Englewood Cliffs, NJ: Prentice-Hall, 1993.
- (1993) Fundamentals of Speech Recognition
- Rabiner, L.R.¹ Juang, B.-H.²

24
- 0022097713
- Recognition of isolated digits using hidden Markov models with continuous mixture densities
- L. R. Rabiner, B.-H. Juang, S. E. Levinson, and M. M. Sondhi, "Recognition of isolated digits using hidden Markov models with continuous mixture densities," AT&T Tech. J., vol. 64, no. 6, pp. 1211-1234, 1985.
- (1985) AT&T Tech. J. , vol.64 , Issue.6 , pp. 1211-1234
- Rabiner, L.R.¹ Juang, B.-H.² Levinson, S.E.³ Sondhi, M.M.⁴

25
- 33646820149
- Improvement on speech recognition for fast talkers
- M. Richardson, M. Hwang, A. Acero, and X. D. Huang, "Improvement on speech recognition for fast talkers," in Proc. EUROSPEECH, vol. 1, 1999, pp. 411-414.
- (1999) Proc. EUROSPEECH , vol.1 , pp. 411-414
- Richardson, M.¹ Hwang, M.² Acero, A.³ Huang, X.D.⁴

26
- 0022234383
- Explicit modeling of state occupancy in hidden Markov models for automatic speech recognition
- M. J. Russell and R. K. Moore, "Explicit modeling of state occupancy in hidden Markov models for automatic speech recognition," Proc. IEEE Int. Conf. Acoustics, Speech, Signal Processing (ICASSP), pp. 5-8, 1985.
- (1985) Proc. IEEE Int. Conf. Acoustics, Speech, Signal Processing (ICASSP) , pp. 5-8
- Russell, M.J.¹ Moore, R.K.²

27
- 0348153023
- Modeling the rate of speech by Markov processes on curves
- L. Saul and M. Rahim, "Modeling the rate of speech by Markov processes on curves," in Proc. EUROSPEECH, vol. 1, 1999, pp. 415-418.
- (1999) Proc. EUROSPEECH , vol.1 , pp. 415-418
- Saul, L.¹ Rahim, M.²

28
- 0028996973
- On the effects of speech rate in large vocabulary speech recognition systems
- M. A. Siegler and R. M. Stern, "On the effects of speech rate in large vocabulary speech recognition systems," Proc. IEEE Int. Conf. Acoustics, Speech, Signal Processing (ICASSP), pp. 612-615, 1995.
- (1995) Proc. IEEE Int. Conf. Acoustics, Speech, Signal Processing (ICASSP) , pp. 612-615
- Siegler, M.A.¹ Stern, R.M.²

29
- 0348206263
- An efficient combination of acoustic and supra-segmental informations in a speech recognition system
- N. Suaudeau and R. Andre-Obrecht, "An efficient combination of acoustic and supra-segmental informations in a speech recognition system," Proc. IEEE Int. Conf. Acoustics, Speech, Signal Processing (ICASSP), vol. 1, pp. 65-68, 1994.
- (1994) Proc. IEEE Int. Conf. Acoustics, Speech, Signal Processing (ICASSP) , vol.1 , pp. 65-68
- Suaudeau, N.¹ Andre-Obrecht, R.²

30
- 0019622592
- On articulatory rate and perceptual constancy in phonetic perception
- Q. Summerfield, "On articulatory rate and perceptual constancy in phonetic perception," J. Exper. Psychol. Hum. Perform., vol. 7, pp. 1074-1095, 1981.
- (1981) J. Exper. Psychol. Hum. Perform. , vol.7 , pp. 1074-1095
- Summerfield, Q.¹

31
- 0038533324
- Modeling speaking rate using a between frame distance metric
- A. Tuerk and S. Young, "Modeling speaking rate using a between frame distance metric," in Proc. EUROSPEECH, vol. 1, 1999, pp. 419-422.
- (1999) Proc. EUROSPEECH , vol.1 , pp. 419-422
- Tuerk, A.¹ Young, S.²

32
- 0030376403
- A fast and reliable rate of speech detector
- J. P. Verhasselt and J.-P. Martens, "A fast and reliable rate of speech detector," in Proc. Int. Conf. Spoken Language Processing (ICSLP), 1996, pp. 2258-2261.
- (1996) Proc. Int. Conf. Spoken Language Processing (ICSLP) , pp. 2258-2261
- Verhasselt, J.P.¹ Martens, J.-P.²

33
- 0030366666
- Analysis of context-dependent segmental duration for automatic speech recognition
- X. Wang, L. C. W. Pols, and L. F. M. ten Bosch, "Analysis of context-dependent segmental duration for automatic speech recognition," in Proc. Int. Conf. Spoken Language Processing (ICSLP), 1996, pp. 1181-1184.
- (1996) Proc. Int. Conf. Spoken Language Processing (ICSLP) , pp. 1181-1184
- Wang, X.¹ Pols, L.C.W.² Ten Bosch, L.F.M.³

34
- 85009240194
- Automatic user-adaptive speaking rate selection for information delivery
- N. Ward and S. Nakagawa, "Automatic user-adaptive speaking rate selection for information delivery," in Proc. Int. Conf. Spoken Language Processing (ICSLP), 2002, pp. 549-552.
- (2002) Proc. Int. Conf. Spoken Language Processing (ICSLP) , pp. 549-552
- Ward, N.¹ Nakagawa, S.²

35
- 0035249864
- On including temporal constraints in Viterbi alignment for speech recognition in noise
- Feb.
- N. B. Yoma, F. R. McInnes, M. A. Jack, S. D. Stump, and L. L. Ling, "On including temporal constraints in Viterbi alignment for speech recognition in noise," IEEE Trans. Speech Audio Processing, vol. 9, pp. 179-182, Feb. 2001.
- (2001) IEEE Trans. Speech Audio Processing , vol.9 , pp. 179-182
- Yoma, N.B.¹ McInnes, F.R.² Jack, M.A.³ Stump, S.D.⁴ Ling, L.L.⁵

36
- 0036815682
- MAP speaker adaptation of state duration distributions for speech recognition
- Oct.
- N. B. Yoma and J. S. Sanchez, "MAP speaker adaptation of state duration distributions for speech recognition," IEEE Trans. Speech Audio Processing, vol. 10, pp. 443-450, Oct. 2002.
- (2002) IEEE Trans. Speech Audio Processing , vol.10 , pp. 443-450
- Yoma, N.B.¹ Sanchez, J.S.²

* This information was extracted by KISTI through analysis of Elsevier's SCOPUS database.