SCOPUS Information Retrieval Platform

IEEE Transactions on Audio, Speech and Language Processing

Volume 14, Issue 3, 2006, Pages 855-872

Automatic determination of acoustic model topology using variational Bayesian estimation and clustering for large vocabulary continuous speech recognition

(3) Watanabe, Shinji a Sako, Atsushi b Nakamura, Atsushi a

a NTT Communication Science Laboratories (Japan)

b KOBE UNIVERSITY (Japan)

Author keywords

Determination of acoustic model topologies; Speech recognition; Variational Bayes; Variational Bayesian estimation and clustering (VBEC)

Indexed keywords

ACOUSTIC SIGNAL PROCESSING; GAUSSIAN DISTRIBUTION; LEARNING SYSTEMS; MARKOV PROCESSES; PARAMETER ESTIMATION; SPEECH RECOGNITION;

DETERMINATION OF ACOUSTIC MODEL TOPOLOGIES; GAUSSIAN MIXTURE MODELS (GMM); VARIATIONAL BAYES; VARIATIONAL BAYESIAN ESTIMATION AND CLUSTERING (VBEC);

TOPOLOGY;

EID: 33646418145 PISSN: 15587916 EISSN: None Source Type: Journal
DOI: 10.1109/TSA.2005.857791 Document Type: Article

Times cited : (14)

References (23)

1
- 0003805597
- The Use of Context in Large Vocabulary Speech Recognition,
- Ph.D. dissertation, Cambridge Univ, Cambridge, U.K
- J. Odell, "The Use of Context in Large Vocabulary Speech Recognition," Ph.D. dissertation, Cambridge Univ., Cambridge, U.K., 1995.
- (1995)
- Odell, J.¹

2
- 0030715097
- HMM topology design using maximum likelihood successive state splitting
- M. Ostendorf and H. Singer, "HMM topology design using maximum likelihood successive state splitting," Comput. Speech Lang., vol. 11, pp. 17-41, 1997.
- (1997) Comput. Speech Lang , vol.11 , pp. 17-41
- Ostendorf, M.¹ Singer, H.²

3
- 0034849093
- Efficient mixture Gaussian synthesis for decision tree based state tying
- T. Kato, S. Kuroiwa, T. Shimizu, and N. Higuchi, "Efficient mixture Gaussian synthesis for decision tree based state tying," in Proc. IEEE Int. Conf. Acoustics, Speech, & Signal Processing (ICASSP2001), vol. 1, 2001, pp. 493-496.
- (2001) Proc. IEEE Int. Conf. Acoustics, Speech, & Signal Processing (ICASSP2001) , vol.1 , pp. 493-496
- Kato, T.¹ Kuroiwa, S.² Shimizu, T.³ Higuchi, N.⁴

4
- 85135145174
- Acoustic modeling based on the MDL criterion for speech recognition
- K. Shinoda and T. Watanabe, "Acoustic modeling based on the MDL criterion for speech recognition," in Proc. Eurospeech 1997, vol. 1, 1997, pp. 99-102.
- (1997) Proc. Eurospeech 1997 , vol.1 , pp. 99-102
- Shinoda, K.¹ Watanabe, T.²

5
- 0032658258
- Decision tree state tying based on penalized Bayesian information criterion
- W. Chou and W. Reichl, "Decision tree state tying based on penalized Bayesian information criterion," in Proc. IEEE Int. Conf. Acoustics, Speech, & Signal Processing (ICASSP1999), vol. 1, 1999, pp. 345-348.
- (1999) Proc. IEEE Int. Conf. Acoustics, Speech, & Signal Processing (ICASSP1999) , vol.1 , pp. 345-348
- Chou, W.¹ Reichl, W.²

6
- 0033906251
- MDL-based context-dependent subword modeling for speech recognition
- K. Shinoda and T. Watanabe, "MDL-based context-dependent subword modeling for speech recognition," J. Acoust. Soc. Jpn. E, vol. 21, pp. 79-86, 2000.
- (2000) J. Acoust. Soc. Jpn. E , vol.21 , pp. 79-86
- Shinoda, K.¹ Watanabe, T.²

7
- 0009685440
- Model selection in acoustic modeling
- S. Chen and R. Gopinath, "Model selection in acoustic modeling," in Proc. Eurospeech 1999, vol. 3, 1999, pp. 1087-1090.
- (1999) Proc. Eurospeech 1999 , vol.3 , pp. 1087-1090
- Chen, S.¹ Gopinath, R.²

8
- 0141703325
- Automatic complexity control for HLDA systems
- X. Liu, M. Gales, and P. Woodland, "Automatic complexity control for HLDA systems," in Proc. Int. Conf. Spoken Language Processing (ICASSP 2003), vol. 1, 2003, pp. 568-571.
- (2003) Proc. Int. Conf. Spoken Language Processing (ICASSP 2003) , vol.1 , pp. 568-571
- Liu, X.¹ Gales, M.² Woodland, P.³

9
- 34047275502
- Kyoto, Japan: MTT
- S. Watanabe, Y. Minami, A. Nakamura, and N. Ueda, Application of Variational Bayesian Approach to Speech Recognition, Neural Information Processing Systems (NIPS 2002). Kyoto, Japan: MTT, 2002.
- (2002) Application of Variational Bayesian Approach to Speech Recognition, Neural Information Processing Systems (NIPS 2002)
- Watanabe, S.¹ Minami, Y.² Nakamura, A.³ Ueda, N.⁴

10
- 3042741069
- Variational Bayesian estimation and clustering for speech recognition
- Jul
- S. Watanabe, Y. Minami, A. Nakamura, and N. Ueda, "Variational Bayesian estimation and clustering for speech recognition," IEEE Trans. Speech Audio Processing, vol. 12, pp. 365-381, Jul. 2004.
- (2004) IEEE Trans. Speech Audio Processing , vol.12 , pp. 365-381
- Watanabe, S.¹ Minami, Y.² Nakamura, A.³ Ueda, N.⁴

11
- 34047245710
- Cambridge, MA: MIT Press
- S. Waterhouse, D. MacKay, and T. Robinson, Bayesian Methods for Mixtures of Experts. Neural Information Processing Systems (NIPS 7). Cambridge, MA: MIT Press, 1995.
- (1995) Bayesian Methods for Mixtures of Experts. Neural Information Processing Systems (NIPS 7)
- Waterhouse, S.¹ MacKay, D.² Robinson, T.³

12
- 0003278032
- Inferring parameters and structure of latent variable models by variational Bayes
- H. Attias, "Inferring parameters and structure of latent variable models by variational Bayes," in Proc. Uncertainty in Artificial Intelligence (UAI 15), 1999.
- (1999) Proc. Uncertainty in Artificial Intelligence (UAI 15)
- Attias, H.¹

13
- 0036887504
- Bayesian model search for mixture models based on optimizing variational bounds
- N. Ueda and Z. Ghahramani, "Bayesian model search for mixture models based on optimizing variational bounds," Neural Networks, vol. 15, pp. 1223-1241, 2002.
- (2002) Neural Networks , vol.15 , pp. 1223-1241
- Ueda, N.¹ Ghahramani, Z.²

14
- 85009237883
- Speech modeling using variational Bayesian mixture of Gaussians
- P. Somervuo, "Speech modeling using variational Bayesian mixture of Gaussians," in Proc. Int. Conf. Spoken Language Processing (ICSLP 2002), vol. 2, 2002, pp. 1245-1248.
- (2002) Proc. Int. Conf. Spoken Language Processing (ICSLP 2002) , vol.2 , pp. 1245-1248
- Somervuo, P.¹

15
- 4544260276
- Bayesian modeling of the speech spectrum using mixture of Gaussians
- P. Zolfaghari, S. Watanabe, A. Nakamura, and S. Katagiri, "Bayesian modeling of the speech spectrum using mixture of Gaussians," in Proc. IEEE Int. Conf. Acoustics, Speech, & Signal Processing (ICASSP 2004), vol. 1, 2004, pp. 553-556.
- (2004) Proc. IEEE Int. Conf. Acoustics, Speech, & Signal Processing (ICASSP 2004) , vol.1 , pp. 553-556
- Zolfaghari, P.¹ Watanabe, S.² Nakamura, A.³ Katagiri, S.⁴

16
- 4544253566
- Automatic generation of nonuniform HMM structures based on variational Bayesian approach
- T. Jitsuhiro and S. Nakamura, "Automatic generation of nonuniform HMM structures based on variational Bayesian approach," in Proc. IEEE Int. Conf. Acoustics, Speech, & Signal Processing (ICASSP 2004), vol. 1, 2004, pp. 805-808.
- (2004) Proc. IEEE Int. Conf. Acoustics, Speech, & Signal Processing (ICASSP 2004) , vol.1 , pp. 805-808
- Jitsuhiro, T.¹ Nakamura, S.²

17
- 4544387676
- Automatic determination of acoustic model topology using variational Bayesian estimation and clustering
- S. Watanabe, A. Sako, and A. Nakamura, "Automatic determination of acoustic model topology using variational Bayesian estimation and clustering," in Proc. IEEE Int. Conf. Acoustics, Speech, & Signal Processing (ICASSP 2004), vol. 1, 2004, pp. 813-816.
- (2004) Proc. IEEE Int. Conf. Acoustics, Speech, & Signal Processing (ICASSP 2004) , vol.1 , pp. 813-816
- Watanabe, S.¹ Sako, A.² Nakamura, A.³

18
- 0028419019
- Maximum a posteriori estimation for multivariate Gaussian mixture observations of Markov chains
- Apr
- J.-L. Gauvain and C.-H. Lee, "Maximum a posteriori estimation for multivariate Gaussian mixture observations of Markov chains," IEEE Trans. Speech Audio Processing, vol. 2, pp. 291-298, Apr. 1994.
- (1994) IEEE Trans. Speech Audio Processing , vol.2 , pp. 291-298
- Gauvain, J.-L.¹ Lee, C.-H.²

19
- 0035279111
- A structural Bayes approach to speaker adaptation
- Mar
- K. Shinoda and C.-H. Lee, "A structural Bayes approach to speaker adaptation," IEEE Trans. Speech Audio Processing, vol. 9, pp. 276-287, Mar. 2001.
- (2001) IEEE Trans. Speech Audio Processing , vol.9 , pp. 276-287
- Shinoda, K.¹ Lee, C.-H.²

20
- 0022185407
- Context-dependent modeling for acoustic-phonetic recognition of continuous speech
- R. Schwartz, Y. Chow, O. Kimball, S. Roucos, M. Krasner, and J. Makhoul, "Context-dependent modeling for acoustic-phonetic recognition of continuous speech," in Proc. IEEE Int. Conf. Acoustics, Speech, & Signal Processing (ICASSP 1985), 1985, pp. 1205-1208.
- (1985) Proc. IEEE Int. Conf. Acoustics, Speech, & Signal Processing (ICASSP 1985) , pp. 1205-1208
- Schwartz, R.¹ Chow, Y.² Kimball, O.³ Roucos, S.⁴ Krasner, M.⁵ Makhoul, J.⁶

21
- 33645758265
- NTT Speech recognizer with outlook on the next generation: SOLON
- T. Hori, "NTT Speech recognizer with outlook on the next generation: SOLON," in Proc. NTT Workshop on Communication Scene Analysis, vol. 1, 2004.
- (2004) Proc. NTT Workshop on Communication Scene Analysis , vol.1
- Hori, T.¹

22
- 34047255331
- Japanese Dictation Toolkit, Free Software Repository for Automatic Speech Recognition
- K. Shikano et al., Japanese Dictation Toolkit - Free Software Repository for Automatic Speech Recognition, 1999.
- (1999)
- Shikano, K.¹

23
- 34047271693
- Robustness of acoustic model topology determined by variational Bayesian estimation and clustering for speech recognition for different speech data sets
- S. Watanabe and A. Nakamura, "Robustness of acoustic model topology determined by variational Bayesian estimation and clustering for speech recognition for different speech data sets," in Proc. IEICE Int. Workshop of Beyond HMM, SP2004-90, 2004, pp. 55-60.
- (2004) Proc. IEICE Int. Workshop of Beyond HMM, SP2004-90 , pp. 55-60
- Watanabe, S.¹ Nakamura, A.²

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.