SCOPUS Information Retrieval Platform

IEEE Transactions on Audio, Speech and Language Processing

Volume 20, Issue 8, 2012, Pages 2252-2264

Hidden Markov acoustic modeling with bootstrap and restructuring for low-resourced languages

(8) Cui, Xiaodong a; Xue, Jian a; Chen, Xin b; Olsen, Peder A a; Dognin, Pierre L a; Chaudhari, Upendra V a; Hershey, John R c; Zhou, Bowen a

a IBM T J WATSON RESEARCH CENTER (United States)

b PEARSON (United States)

c MITSUBISHI ELECTRIC RESEARCH LABORATORIES (United States)

Author keywords

Bagging; bootstrap and restructuring; hidden Markov model (HMM); large vocabulary continuous speech recognition (LVCSR); low resourced language

Indexed keywords

ACOUSTIC MODEL; ACOUSTIC MODELING; AUTOMATIC SPEECH RECOGNITION; BAGGING; BOOTSTRAP AND RESTRUCTURING; CLUSTERING CRITERIA; COVARIANCE MODELS; DATA SPARSITY; DECODING SPEED; GAUSSIANS; HIDDEN MARKOV MODELS (HMMS); LARGE VOCABULARY CONTINUOUS SPEECH RECOGNITION; LAST STAGE; LOW-RESOURCED LANGUAGE; MEMORY CONSUMPTION; MODEL REFINEMENT; MODEL SIZE; PREDICTION CAPABILITY; REAL-WORLD APPLICATION; RUNTIMES; SEQUENCE PREDICTION; STATISTICAL RELIABILITY; TRAINING DATA; TRAINING PROCEDURES;

CONTINUOUS SPEECH RECOGNITION; HIDDEN MARKOV MODELS; VOCABULARY CONTROL;

AGGREGATES;

EID: 84865265602 PISSN: 15587916 EISSN: None Source Type: Journal
DOI: 10.1109/TASL.2012.2199982 Document Type: Article

Times cited : (14)

References (44)

1
- 84865238845
- Automatic speech recognition for an under-resourced language - Amharic
- S. T. Abate and W. Menzel, "Automatic speech recognition for an under-resourced language - Amharic," in Proc. Interspeech, 2007, pp. 1541-1544.
- (2007) Proc. Interspeech , pp. 1541-1544
- Abate, S.T.¹ Menzel, W.²

2
- 69249083744
- Using phonetic features in unsupervised word decompounding for ASR with application to a less-represented language
- T. Pellegrini and L. Lamel, "Using phonetic features in unsupervised word decompounding for ASR with application to a less-represented language," in Proc. Interspeech, 2007, pp. 1797-1800.
- (2007) Proc. Interspeech , pp. 1797-1800
- Pellegrini, T.¹ Lamel, L.²

3
- 69249139569
- Automatic speech recognition for under-resourced languages: Application to Vietnamese language
- Nov.
- V.-B. Le and L. Besacier, "Automatic speech recognition for under-resourced languages: Application to Vietnamese language," IEEE Trans. Audio, Speech, Lang. Process., vol. 17, no. 8, pp. 1471-1482, Nov. 2009.
- (2009) IEEE Trans. Audio, Speech, Lang. Process , vol.17 , Issue.8 , pp. 1471-1482
- Le, V.-B.¹ Besacier, L.²

4
- 0035426931
- Language-independent and language-adaptive acoustic modeling for speech recognition
- DOI 10.1016/S0167-6393(00)00094-7, PII S0167639300000947
- T. Schultz and A. Waibel, "Language-independent and language-adaptive acoustic modeling for speech recognition," Speech Commun., vol. 35, pp. 31-51, 2001. (Pubitemid 32599645)
- (2001) Speech Communication , vol.35 , Issue.1-2 , pp. 31-51
- Schultz, T.¹ Waibel, A.²

5
- 0036722707
- Cross-language use of acoustic information for automatic speech recognition
- DOI 10.1016/S0167-6393(01)00046-2, PII S0167639301000462
- C. Nieuwoudt and E. C. Botha, "Cross-language use of acoustic information for automatic speech recognition," Speech Commun., vol. 38, pp. 101-113, 2002. (Pubitemid 34873601)
- (2002) Speech Communication , vol.38 , Issue.1-2 , pp. 101-113
- Nieuwoudt, C.¹ Botha, E.C.²

6
- 85133315126
- Pooling ASR data for closely related languages
- C.V. Heerden, N. Kleynhans, E. Barnard, and M. Davel, "Pooling ASR data for closely related languages," in Proc. Workshop Spoken Lang. Technol. for Under-Resourced Lang. (SLTU), 2010, pp. 17-23.
- (2010) Proc. Workshop Spoken Lang. Technol. for Under-resourced Lang. (SLTU) , pp. 17-23
- Heerden, C.V.¹ Kleynhans, N.² Barnard, E.³ Davel, M.⁴

7
- 85013700737
- New York: Elsevier, Academic
- T. Schultz and K. Kirchhoff, Multilingual Speech Processing. New York: Elsevier, Academic, 2006.
- (2006) Multilingual Speech Processing
- Schultz, T.¹ Kirchhoff, K.²

8
- 0002344794
- Bootstrap methods: Another look at the jackknife
- B. Efron, "Bootstrap methods: Another look at the jackknife," Ann. Statist., vol. 7, no. 1, pp. 1-26, 1979.
- (1979) Ann. Statist. , vol.7 , Issue.1 , pp. 1-26
- Efron, B.¹

9
- 0001077032
- Nonparametric estimates of standard error: The jackknife, the bootstrap and other methods
- B. Efron, "Nonparametric estimates of standard error: The jackknife, the bootstrap and other methods," Biometrika, vol. 68, no. 3, pp. 589-599, 1981.
- (1981) Biometrika , vol.68 , Issue.3 , pp. 589-599
- Efron, B.¹

10
- 0003991665
- Boca Raton, FL: Chapman & Hall/CRC
- B. Efron and R. Tibshirani, An Introduction to the Bootstrap. Boca Raton, FL: Chapman & Hall/CRC, 1993.
- (1993) An Introduction to the Bootstrap
- Efron, B.¹ Tibshirani, R.²

11
- 0003484780
- Cambridge, U.K.: Cambridge Univ. Press
- A. Davison, D. V. Hinkley, and A. Canty, Bootstrap Methods and Their Application. Cambridge, U.K.: Cambridge Univ. Press, 1999.
- (1999) Bootstrap Methods and their Application
- Davison, A.¹ Hinkley, D.V.² Canty, A.³

12
- 6344292088
- Cambridge, U.K.: Cambridge Univ. Press
- A. M. Zoubir and D. R. Iskander, Bootstrap Techniques for Signal Processing. Cambridge, U.K.: Cambridge Univ. Press, 2004.
- (2004) Bootstrap Techniques for Signal Processing
- Zoubir, A.M.¹ Iskander, D.R.²

13
- 0030211964
- Bagging predictors
- L. Breiman, "Bagging predictors," Mach. Learn., vol. 24, no. 2, pp. 123-140, 1996.
- (1996) Mach. Learn. , vol.24 , Issue.2 , pp. 123-140
- Breiman, L.¹

14
- 0030344230
- Heuristics of instability and stabilization in model selection
- L. Breiman, "Heuristics of instability and stabilization in model selection," Ann. Statist., vol. 24, no. 6, pp. 2350-2383, 1996.
- (1996) Ann. Statist. , vol.24 , Issue.6 , pp. 2350-2383
- Breiman, L.¹

15
- 79959843187
- Acoustic modeling with bootstrap and restructuring for low-resourced languages
- X. Cui, J. Xue, P. L. Dognin, U. V. Chaudhari, and B. Zhou, "Acoustic modeling with bootstrap and restructuring for low-resourced languages," in Proc. Interspeech, 2010, pp. 2974-2977.
- (2010) Proc. Interspeech , pp. 2974-2977
- Cui, X.¹ Xue, J.² Dognin, P.L.³ Chaudhari, U.V.⁴ Zhou, B.⁵

16
- 0020719320
- A maximum likelihood approach to continuous speech recognition
- L. R. Bahl, F. Jelinek, and R. L. Mercer, "A maximum likelihood approach to continuous speech recognition," IEEE Trans. Pattern Anal. Mach. Intell., vol. PAMI-5, no. 2, pp. 179-190, Feb. 1983. (Pubitemid 13555897)
- (1983) IEEE Transactions on Pattern Analysis and Machine Intelligence , vol.PAMI-5 , Issue.2 , pp. 179-190
- Bahl, L.R.¹ Jelinek, F.² Mercer, R.L.³

17
- 0003459132
- Ph.D. dissertation, McGill Univ., Montreal, QC, Canada
- Y. Normandin, "Hidden Markov models, maximum mutual information estimation and the speech recognition problem," Ph.D. dissertation, McGill Univ., Montreal, QC, Canada, 1991.
- (1991) Hidden Markov Models, Maximum Mutual Information Estimation and the Speech Recognition Problem
- Normandin, Y.¹

18
- 51449120120
- Boosted MMI for model and feature-space discriminative training
- D. Povey, D. Kanevsky, B. Kingsbury, B. Ramabhadran, G. Saon, and K. Visweswariah, "Boosted MMI for model and feature-space discriminative training," in Proc. Int. Conf. Acoust., Speech, Signal Process., 2008, pp. 4057-4060.
- (2008) Proc. Int. Conf. Acoust., Speech, Signal Process , pp. 4057-4060
- Povey, D.¹ Kanevsky, D.² Kingsbury, B.³ Ramabhadran, B.⁴ Saon, G.⁵ Visweswariah, K.⁶

19
- 4544265717
- Ph.D. dissertation, Univ. of Cambridge, Cambridge, U.K.
- D. Povey, "Discriminative training for large vocabulary speech recognition," Ph.D. dissertation, Univ. of Cambridge, Cambridge, U.K., 2003.
- (2003) Discriminative Training for Large Vocabulary Speech Recognition
- Povey, D.¹

20
- 0024610919
- A tutorial on hidden Markov models and selected applications in speech recognition
- Feb.
- L. R. Rabiner, "A tutorial on hidden Markov models and selected applications in speech recognition," Proc. IEEE, vol. 77, no. 2, pp. 257-286, Feb. 1989.
- (1989) Proc. IEEE , vol.77 , Issue.2 , pp. 257-286
- Rabiner, L.R.¹

21
- 0043289776
- Analyzing bagging
- DOI 10.1214/aos/1031689014
- P. Bühlmann and B. Yu, "Analyzing bagging," Ann. Statist., vol. 30, no. 4, pp. 927-961, 2002. (Pubitemid 37095335)
- (2002) Annals of Statistics , vol.30 , Issue.4 , pp. 927-961
- Bühlmann, P.¹ Yu, B.²

22
- 76249101406
- Effect of subsampling rate on subbagging and related ensembles of stable classifiers
- F. Zaman and H. Hirose, "Effect of subsampling rate on subbagging and related ensembles of stable classifiers," in Proc. Int. Conf. Pattern Recogn. Mach. Intell., 2009, pp. 44-49.
- (2009) Proc. Int. Conf. Pattern Recogn. Mach. Intell. , pp. 44-49
- Zaman, F.¹ Hirose, H.²

23
- 21844448886
- Stability of randomized learning algorithms
- A. Elisseeff, T. Evgeniou, and M. Pontil, "Stability of randomized learning algorithms," J. Mach. Learn. Res., vol. 6, pp. 55-79, 2005.
- (2005) J. Mach. Learn. Res. , vol.6 , pp. 55-79
- Elisseeff, A.¹ Evgeniou, T.² Pontil, M.³

24
- 64149085496
- Automatic model complexity control using marginalized discriminative growth functions
- May
- X. Liu and M. Gales, "Automatic model complexity control using marginalized discriminative growth functions," IEEE Trans. Audio, Speech, Lang. Process., vol. 15, no. 4, pp. 1414-1424, May 2007.
- (2007) IEEE Trans. Audio, Speech, Lang. Process , vol.15 , Issue.4 , pp. 1414-1424
- Liu, X.¹ Gales, M.²

25
- 0033884712
- Model complexity adaptation using a discriminant measure
- DOI 10.1109/89.824707
- M. Padmanabhan and L. R. Bahl, "Model complexity adaptation using a discriminant measure," IEEE Trans. Speech Audio Process., vol. 8, no. 2, pp. 205-208, Mar. 2000. (Pubitemid 30578375)
- (2000) IEEE Transactions on Speech and Audio Processing , vol.8 , Issue.2 , pp. 205-208
- Padmanabhan, M.¹ Bahl, L.R.²

26
- 33745205656
- Gaussian elimination algorithm for HMM complexity reduction in continuous speech recognition systems
- 9th European Conference on Speech Communication and Technology, Eurospeech Interspeech
- G. F. G. Yared, F. Violaro, and L. C. Sousa, "Gaussian elimination algorithm for HMM complexity reduction in continuous speech recognition systems," in Proc. Interspeech, 2005, pp. 377-380. (Pubitemid 43908078)
- (2005) 9th European Conference on Speech Communication and Technology , pp. 377-380
- Yared, G.F.G.¹ Violaro, F.² Sousa, L.C.³

27
- 0029747193
- Speaker adaptation with autonomous model complexity control by MDL principle
- K. Shinoda and T. Watanabe, "Speaker adaptation with autonomous model complexity control by MDL principle," in Proc. Int. Conf. Acoust., Speech, Signal Process., 1995, pp. 717-720.
- (1995) Proc. Int. Conf. Acoust., Speech, Signal Process , pp. 717-720
- Shinoda, K.¹ Watanabe, T.²

28
- 85009136681
- Model complexity optimization for nonnative English speakers
- X. He and Y. Zhao, "Model complexity optimization for nonnative English speakers," in Proc. Interspeech, 2001, pp. 1461-1464.
- (2001) Proc. Interspeech , pp. 1461-1464
- He, X.¹ Zhao, Y.²

29
- 77955091542
- Methods for merging Gaussian mixture components
- C. Hennig, "Methods for merging Gaussian mixture components," Adv. Data Anal. Classific., vol. 4, no. 1, pp. 3-34, 2010.
- (2010) Adv. Data Anal. Classific. , vol.4 , Issue.1 , pp. 3-34
- Hennig, C.¹

30
- 0141702082
- Structural speaker adaptation using maximum a posteriori approach and a Gaussian distributions merging technique
- O. Bellot, D. Matrouf, P. Nocera, G. Linares, and J.-F. Bonastre, "Structural speaker adaptation using maximum a posteriori approach and a Gaussian distributions merging technique," in Proc. Int. Conf. Acoust., Speech, Signal Process., 2003, pp. 121-124.
- (2003) Proc. Int. Conf. Acoust., Speech, Signal Process , pp. 121-124
- Bellot, O.¹ Matrouf, D.² Nocera, P.³ Linares, G.⁴ Bonastre, J.-F.⁵

31
- 66249107761
- A new approach to merging Gaussian densities in large vocabulary continuous speech recognition
- W. Xu, J. Duchateau, K. Demuynck, and I. Dologlou, "A new approach to merging Gaussian densities in large vocabulary continuous speech recognition," in Proc. IEEE Benelux Signal Process. Symp., 1998, pp. 231-234.
- (1998) Proc. IEEE Benelux Signal Process. Symp. , pp. 231-234
- Xu, W.¹ Duchateau, J.² Demuynck, K.³ Dologlou, I.⁴

32
- 0004257992
- New York: Wiley
- S. Kullback, Information Theory and Statistics. New York: Wiley, 1959.
- (1959) Information Theory and Statistics
- Kullback, S.¹

33
- 0003922190
- 2nd ed. New York: Wiley
- R. O. Duda, P. E. Hart, and D. G. Stork, Pattern Classification, 2nd ed. New York: Wiley, 2001.
- (2001) Pattern Classification
- Duda, R.O.¹ Hart, P.E.² Stork, D.G.³

34
- 80051632257
- Full covariance bootstrapped acoustic model clustering
- X. Chen, X. Cui, J. Xue, P. A. Olsen, J. R. Hershey, and B. Zhou, "Full covariance bootstrapped acoustic model clustering," in Proc. Int. Conf. Acoust., Speech, Signal Process., 2011, pp. 4496-4499.
- (2011) Proc. Int. Conf. Acoust., Speech, Signal Process , pp. 4496-4499
- Chen, X.¹ Cui, X.² Xue, J.³ Olsen, P.A.⁴ Hershey, J.R.⁵ Zhou, B.⁶

35
- 70349225968
- Refactoring acoustic models using variational density approximation
- P. L. Dognin, J. R. Hershey, V. Goel, and P. A. Olsen, "Refactoring acoustic models using variational density approximation," in Proc. Int. Conf. Acoust., Speech, Signal Process., 2009, pp. 4473-4476.
- (2009) Proc. Int. Conf. Acoust., Speech, Signal Process , pp. 4473-4476
- Dognin, P.L.¹ Hershey, J.R.² Goel, V.³ Olsen, P.A.⁴

36
- 70450191334
- Refactoring acoustic models using variational expectation-maximization
- P. L. Dognin, J. R. Hershey, V. Goel, and P. A. Olsen, "Refactoring acoustic models using variational Expectation-Maximization," in Proc. Interspeech, 2009, pp. 212-215.
- (2009) Proc. Interspeech , pp. 212-215
- Dognin, P.L.¹ Hershey, J.R.² Goel, V.³ Olsen, P.A.⁴

37
- 0003531332
- London, U.K.: Methuen
- J. M. Hammersley and D. C. Handscomb, Monte Carlo Methods. London, U.K.: Methuen, 1975.
- (1975) Monte Carlo Methods
- Hammersley, J.M.¹ Handscomb, D.C.²

38
- 0003489634
- New York: Springer
- G. S. Fishman, Monte Carlo: Concepts, Algorithms, and Applications. New York: Springer, 1995.
- (1995) Monte Carlo: Concepts, Algorithms, and Applications
- Fishman, G.S.¹

39
- 0004080531
- 2nd ed. New York: Wiley
- R. Y. Rubinstein and D. P. Kroese, Simulation and the Monte Carlo Method, 2nd ed. New York: Wiley, 2007.
- (2007) Simulation and the Monte Carlo Method
- Rubinstein, R.Y.¹ Kroese, D.P.²

40
- 0002629270
- Maximum likelihood from incomplete data via the EM algorithm
- A. P. Dempster, N. M. Laird, and D. B. Rubin, "Maximum likelihood from incomplete data via the EM algorithm," J. R. Statist. Soc., Ser. B, vol. 39, no. 1, pp. 1-38, 1977.
- (1977) J. R. Statist. Soc., Ser. B , vol.39 , Issue.1 , pp. 1-38
- Dempster, A.P.¹ Laird, N.M.² Rubin, D.B.³

41
- 84865800558
- Acoustic modeling with bootstrap and restructuring based on full covariance
- X. Cui, X. Chen, J. Xue, P. A. Olsen, J. R. Hershey, and B. Zhou, "Acoustic modeling with bootstrap and restructuring based on full covariance," in Proc. Interspeech, 2011, pp. 1697-1700.
- (2011) Proc. Interspeech , pp. 1697-1700
- Cui, X.¹ Chen, X.² Xue, J.³ Olsen, P.A.⁴ Hershey, J.R.⁵ Zhou, B.⁶

42
- 51449111964
- Variational Bhattacharyya divergence for hidden Markov models
- J. R. Hershey and P. A. Olsen, "Variational Bhattacharyya divergence for hidden Markov models," in Proc. Int. Conf. Acoust., Speech, Signal Process., 2008, pp. 4557-4560.
- (2008) Proc. Int. Conf. Acoust., Speech, Signal Process , pp. 4557-4560
- Hershey, J.R.¹ Olsen, P.A.²

43
- 33745186926
- Anatomy of an extremely fast LVCSR decoder
- 9th European Conference on Speech Communication and Technology, Eurospeech Interspeech
- G. Saon, D. Povey, and G. Zweig, "Anatomy of an extremely fast LVCSR decoder," in Proc. Interspeech, 2005, pp. 549-552. (Pubitemid 43908121)
- (2005) 9th European Conference on Speech Communication and Technology , pp. 549-552
- Saon, G.¹ Povey, D.² Zweig, G.³

44
- 0023829165
- Decoder selection based on cross-entropies
- P. S. Gopalakrishnan, D. Kanevsky, A. Nadas, D. Nahamoo, and M. A. Picheny, "Decoder selection based on cross-entropies," in Int. Conf. Acoust., Speech, Signal Process., 1988, pp. 20-23.
- (1988) Int. Conf. Acoust., Speech, Signal Process , pp. 20-23
- Gopalakrishnan, P.S.¹ Kanevsky, D.² Nadas, A.³ Nahamoo, D.⁴ Picheny, M.A.⁵

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.