SCOPUS Information Search Platform

IEEE Journal on Selected Topics in Signal Processing

Volume 4, Issue 6, 2010, Pages 1007-1015

Unsupervised acoustic model adaptation based on ensemble methods

(3) Shinozaki, Takahiro (a); Kubota, Yu (a); Furui, Sadaoki (a)

(a) TOKYO INSTITUTE OF TECHNOLOGY (Japan)

Author keywords

Acoustic model; cross validation; ensemble methods; speech recognition; unsupervised adaptation

Indexed keywords

ACOUSTIC MODEL; ACOUSTIC MODEL ADAPTATION; ADAPTATION ALGORITHMS; ADAPTATION FRAMEWORK; COMPUTATIONAL COSTS; CROSS VALIDATION; ENSEMBLE METHODS; LINEAR REGRESSION METHODS; NOISY SPEECH RECOGNITION; RELATIVE REDUCTION; SPEECH RECOGNITION PERFORMANCE; UNSUPERVISED ADAPTATION; WORD ERROR RATE;

ALGORITHMS; MAXIMUM LIKELIHOOD ESTIMATION; PARAMETER ESTIMATION;

SPEECH RECOGNITION;

EID: 78649249806 PISSN: 19324553 EISSN: None Source Type: Journal
DOI: 10.1109/JSTSP.2010.2076010 Document Type: Article

Times cited : (8)

References (30)

1
- 70349227947
- The application of hidden Markov models in speech recognition
- M. Gales and S. Young, "The application of hidden Markov models in speech recognition," Foundat. Trends Signal Process., vol. 1, no. 3, pp. 195-304, 2008.
- (2008) Foundat. Trends Signal Process. , vol.1 , Issue.3 , pp. 195-304
- Gales, M.¹ Young, S.²

2
- 24144432267
- An unsupervised speaker adaptation method for lecture-style spontaneous speech recognition using multiple recognition systems
- S. Nakagawa, T. Watanabe, H. Nishizaki, and T. Utsuro, "An unsupervised speaker adaptation method for lecture-style spontaneous speech recognition using multiple recognition systems," IEICE Trans. Inf. Syst., vol. E88-D, no. 3, pp. 463-471, 2005.
- (2005) IEICE Trans. Inf. Syst. , vol.E88-D , Issue.3 , pp. 463-471
- Nakagawa, S.¹ Watanabe, T.² Nishizaki, H.³ Utsuro, T.⁴

3
- 84928746885
- London U.K.: Prentice-Hall
- P. A. Devijver and J. Kittler, Pattern Recognition: A Statistical Approach. London, U.K.: Prentice-Hall, 1982.
- (1982) Pattern Recognition: A Statistical Approach
- Devijver, P.A.¹ Kittler, J.²

4
- 0030211964
- Bagging predictors
- L. Breiman, "Bagging predictors," Mach. Learn., vol. 24, no. 2, pp. 123-140, 1996.
- (1996) Mach. Learn. , vol.24 , Issue.2 , pp. 123-140
- Breiman, L.¹

5
- 70349220091
- Unsupervised cross-validation adaptation algorithms for improved adaptation performance
- T. Shinozaki, Y. Kubota, and S. Furui, "Unsupervised cross-validation adaptation algorithms for improved adaptation performance," in Proc. ICASSP, 2009, pp. 4377-4380.
- (2009) Proc. ICASSP , pp. 4377-4380
- Shinozaki, T.¹ Kubota, Y.² Furui, S.³

6
- 0032074539
- Deleted strategy for MMI-based HMM training
- May
- N. S. Kim and C. K. Un, "Deleted strategy for MMI-based HMM training," IEEE Trans. Speech Audio Process., vol. 6, no. 3, pp. 299-303, May 1998.
- (1998) IEEE Trans. Speech Audio Process. , vol.6 , Issue.3 , pp. 299-303
- Kim, N.S.¹ Un, C.K.²

7
- 35549000218
- Cross-validation and aggregated EM training for robust parameter estimation
- T. Shinozaki and M. Ostendorf, "Cross-validation and aggregated EM training for robust parameter estimation," Comput. Speech Lang., vol. 22, no. 2, pp. 185-195, 2008.
- (2008) Comput. Speech Lang. , vol.22 , Issue.2 , pp. 185-195
- Shinozaki, T.¹ Ostendorf, M.²

8
- 51449090592
- GMM and HMM training by aggregated EM algorithm with increased ensemble sizes for robust parameter estimation
- T. Shinozaki and T. Kawahara, "GMM and HMM training by aggregated EM algorithm with increased ensemble sizes for robust parameter estimation," in Proc. ICASSP, 2008, pp. 4405-4408.
- (2008) Proc. ICASSP , pp. 4405-4408
- Shinozaki, T.¹ Kawahara, T.²

9
- 78649234747
- [Online]. Available, Jun.
- "Machine Learning Ensemble," [Online]. Available: http://en.wikipedia.org/wiki/Machine-learning-ensemble, Jun. 2009.
- (2009) Machine Learning Ensemble

10
- 85135194048
- Flexible speaker adaptation using maximum likelihood linear regression
- C. J. Leggetter and P. C. Woodland, "Flexible speaker adaptation using maximum likelihood linear regression," in Proc. Eurospeech, 1995, pp. 1155-1158.
- (1995) Proc. Eurospeech , pp. 1155-1158
- Leggetter, C.J.¹ Woodland, P.C.²

11
- 0028419019
- Maximum a posteriori estimation for multivariate Gaussian mixture observations of Markov chains
- Apr.
- J. Gauvain and C. Lee, "Maximum a posteriori estimation for multivariate Gaussian mixture observations of Markov chains," IEEE Trans. Speech Audio Process., vol. 2, no. 2, pp. 291-298, Apr. 1994.
- (1994) IEEE Trans. Speech Audio Process. , vol.2 , Issue.2 , pp. 291-298
- Gauvain, J.¹ Lee, C.²

12
- 0002629270
- Maximum likelihood from incomplete data via the EM algorithm
- Series B 39
- A. P. Dempster, N. M. Laird, and D. B. Rubin, "Maximum likelihood from incomplete data via the EM algorithm," J. R. Statist. Soc., Series B, vol. 39, no. 1, pp. 1-38, 1977.
- (1977) J. R. Statist. Soc. , Issue.1 , pp. 1-38
- Dempster, A.P.¹ Laird, N.M.² Rubin, D.B.³

13
- 33646798740
- The IBM 2004 conversational telephony system for rich transcription
- H. Soltau, B. Kingsbury, L. Mangu, D. Povey, G. Saon, and G. Zweig, "The IBM 2004 conversational telephony system for rich transcription," in Proc. ICASSP, 2005, vol. I, pp. 205-208.
- (2005) Proc. ICASSP , vol.1 , pp. 205-208
- Soltau, H.¹ Kingsbury, B.² Mangu, L.³ Povey, D.⁴ Saon, G.⁵ Zweig, G.⁶

14
- 0030638031
- A post-processing system to yield reduced word error rates: Recognizer output voting error reduction (ROVER)
- J. G. Fiscus, "A post-processing system to yield reduced word error rates: Recognizer output voting error reduction (ROVER)," in Proc. IEEE ASRU, 1997, pp. 347-352.
- (1997) Proc. IEEE ASRU , pp. 347-352
- Fiscus, J.G.¹

15
- 4544253834
- Posterior probability decoding, confidence estimation and system combination
- G. Evermann and P. C. Woodland, "Posterior probability decoding, confidence estimation and system combination," in Proc. Speech Transcript. Workshop, 2000.
- (2000) Proc. Speech Transcript. Workshop
- Evermann, G.¹ Woodland, P.C.²

16
- 3042854734
- Benchmark test for speech recognition using the Corpus of spontaneous Japanese
- T. Kawahara, H. Nanjo, T. Shinozaki, and S. Furui, "Benchmark test for speech recognition using the Corpus of spontaneous Japanese," in Proc. SSPR2003, 2003, pp. 135-138.
- (2003) Proc. SSPR2003 , pp. 135-138
- Kawahara, T.¹ Nanjo, H.² Shinozaki, T.³ Furui, S.⁴

17
- 0032644224
- JNAS: Japanese speech corpus for large vocabulary continuous speech recognition research
- K. Itou, M. Yamamoto, K. Takeda, T. Takezawa, T. Matsuoka, T. Kobayashi, K. Shikano, and S. Itahashi, "JNAS: Japanese speech corpus for large vocabulary continuous speech recognition research," J. Acoust. Soc. Jpn. (E), vol. 20, no. 3, pp. 199-206, 1999.
- (1999) J. Acoust. Soc. Jpn. (E) , vol.20 , Issue.3 , pp. 199-206
- Itou, K.¹ Yamamoto, M.² Takeda, K.³ Takezawa, T.⁴ Matsuoka, T.⁵ Kobayashi, T.⁶ Shikano, K.⁷ Itahashi, S.⁸

18
- 0036296863
- Minimum phone error and I-smoothing for improved discriminative training
- D. Povey and P. Woodland, "Minimum phone error and I-smoothing for improved discriminative training," in Proc. ICASSP, 2002, vol. I, pp. 105-108.
- (2002) Proc. ICASSP , vol.1 , pp. 105-108
- Povey, D.¹ Woodland, P.²

19
- 0019053271
- Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences
- Aug.
- S. Davis and P. Mermelstein, "Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences," IEEE Trans. Acoust., Speech, Signal Process., vol. 28, no. 4, pp. 357-366, Aug. 1980.
- (1980) IEEE Trans. Acoust., Speech, Signal Process. , vol.28 , Issue.4 , pp. 357-366
- Davis, S.¹ Mermelstein, P.²

20
- 0025041264
- Perceptual linear predictive (PLP) analysis of speech
- H. Hermansky, "Perceptual linear predictive (PLP) analysis of speech," J. Acoust. Soc. Amer., vol. 87, no. 4, pp. 1738-1752, 1990.
- (1990) J. Acoust. Soc. Amer. , vol.87 , Issue.4 , pp. 1738-1752
- Hermansky, H.¹

21
- 0022667694
- Speaker-independent isolated word recognition using dynamic features of speech spectrum
- Feb.
- S. Furui, "Speaker-independent isolated word recognition using dynamic features of speech spectrum," IEEE Trans. Acoust., Speech, Signal Process., vol. ASSP-34, no. 1, pp. 52-59, Feb. 1986.
- (1986) IEEE Trans. Acoust., Speech, Signal Process. , vol.ASSP-34 , Issue.1 , pp. 52-59
- Furui, S.¹

22
- 78049411030
- Physical parameter analysis of Japanese speech at automobile driving act, and analysis of correlation between the parameter and accuracy
- Japanese
- T. Kato, J. Okamoto, and M. Shozakai, "Physical parameter analysis of Japanese speech at automobile driving act, and analysis of correlation between the parameter and accuracy," in Autumn Meeting Acoust. Soc. Jpn. (in Japanese), 2007, vol. 3-Q-25, pp. 267-268.
- (2007) Autumn Meeting Acoust. Soc. Jpn. , vol.3 Q-25 , pp. 267-268
- Kato, T.¹ Okamoto, J.² Shozakai, M.³

23
- 0032668522
- On recent speech corpora activities in Japan
- S. Itahashi, "On recent speech corpora activities in Japan," J. Acoust. Soc. Jpn. (E), vol. 20, no. 3, pp. 163-169, 1999.
- (1999) J. Acoust. Soc. Jpn. (E) , vol.20 , Issue.3 , pp. 163-169
- Itahashi, S.¹

24
- 0018455310
- Suppression of acoustic noise in speech using spectral subtraction
- Apr.
- S. F. Boll, "Suppression of acoustic noise in speech using spectral subtraction," IEEE Trans. Acoust., Speech, Signal Process., vol. ASSP-27, no. 2, pp. 113-120, Apr. 1979.
- (1979) IEEE Trans. Acoust., Speech, Signal Process. , vol.ASSP-27 , Issue.2 , pp. 113-120
- Boll, S.F.¹

25
- 44849131087
- The TITECH large vocabulary WFST speech recognition system
- P. R. Dixon, D. A. Caseiro, T. Oonishi, and S. Furui, "The TITECH large vocabulary WFST speech recognition system," in Proc. IEEE ASRU, 2007, pp. 443-448.
- (2007) Proc. IEEE ASRU , pp. 443-448
- Dixon, P.R.¹ Caseiro, D.A.² Oonishi, T.³ Furui, S.⁴

26
- 85128340946
- An efficient two-pass search algorithm using word trellis index
- A. Lee, T. Kawahara, and S. Doshita, "An efficient two-pass search algorithm using word trellis index," in Proc. ICSLP, 1998, pp. 1831-1834.
- (1998) Proc. ICSLP , pp. 1831-1834
- Lee, A.¹ Kawahara, T.² Doshita, S.³

27
- 0003822743
- Cambridge, U.K.: Cambridge Univ. Eng. Dept.
- S. Young et al., The HTK Book. Cambridge, U.K.: Cambridge Univ. Eng. Dept., 2005.
- (2005) The HTK Book
- Young, S.¹

28
- 0024909979
- Some statistical issues in the comparison of speech recognition algorithms
- L. Gillick and S. Cox, "Some statistical issues in the comparison of speech recognition algorithms," in Proc. ICASSP, 1989, pp. 532-535.
- (1989) Proc. ICASSP , pp. 532-535
- Gillick, L.¹ Cox, S.²

29
- 15844411850
- Confidence measures for speech recognition: A survey
- H. Jiang, "Confidence measures for speech recognition: A survey," Speech Commun., vol. 45, no. 4, pp. 455-470, 2005.
- (2005) Speech Commun. , vol.45 , Issue.4 , pp. 455-470
- Jiang, H.¹

30
- 0034841730
- Investigating lightly supervised acoustic model training
- L. Lamel, J. Gauvain, and G. Adda, "Investigating lightly supervised acoustic model training," in Proc. ICASSP, 2001, vol. 1, pp. 477-480.
- (2001) Proc. ICASSP , vol.1 , pp. 477-480
- Lamel, L.¹ Gauvain, J.² Adda, G.³

* This information was extracted from Elsevier's SCOPUS database through analysis by KISTI.