Volume 4, Issue 6, 2010, Pages 1007-1015

Unsupervised acoustic model adaptation based on ensemble methods

Author keywords

Acoustic model; cross validation; ensemble methods; speech recognition; unsupervised adaptation

Indexed keywords

ACOUSTIC MODEL; ACOUSTIC MODEL ADAPTATION; ADAPTATION ALGORITHMS; ADAPTATION FRAMEWORK; COMPUTATIONAL COSTS; CROSS VALIDATION; ENSEMBLE METHODS; LINEAR REGRESSION METHODS; NOISY SPEECH RECOGNITION; RELATIVE REDUCTION; SPEECH RECOGNITION PERFORMANCE; UNSUPERVISED ADAPTATION; WORD ERROR RATE;

EID: 78649249806     PISSN: 19324553     EISSN: None     Source Type: Journal    
DOI: 10.1109/JSTSP.2010.2076010     Document Type: Article
Times cited : (8)

References (30)
  • 1
    • 70349227947
    • The application of hidden Markov models in speech recognition
    • M. Gales and S. Young, "The application of hidden Markov models in speech recognition," Foundat. Trends Signal Process., vol. 1, no. 3, pp. 195-304, 2008.
    • (2008) Foundat. Trends Signal Process. , vol.1 , Issue.3 , pp. 195-304
    • Gales, M.1    Young, S.2
  • 2
    • 24144432267
    • An unsupervised speaker adaptation method for lecture-style spontaneous speech recognition using multiple recognition systems
    • S. Nakagawa, T. Watanabe, H. Nishizaki, and T. Utsuro, "An unsupervised speaker adaptation method for lecture-style spontaneous speech recognition using multiple recognition systems," IEICE Trans. Inf. Syst., vol. E88-D, no. 3, pp. 463-471, 2005.
    • (2005) IEICE Trans. Inf. Syst. , vol.E88-D , Issue.3 , pp. 463-471
    • Nakagawa, S.1    Watanabe, T.2    Nishizaki, H.3    Utsuro, T.4
  • 4
    • 0030211964
    • Bagging predictors
    • L. Breiman, "Bagging predictors," Mach. Learn., vol. 24, no. 2, pp. 123-140, 1996.
    • (1996) Mach. Learn. , vol.24 , Issue.2 , pp. 123-140
    • Breiman, L.1
  • 5
    • 70349220091
    • Unsupervised cross-validation adaptation algorithms for improved adaptation performance
    • T. Shinozaki, Y. Kubota, and S. Furui, "Unsupervised cross-validation adaptation algorithms for improved adaptation performance," in Proc. ICASSP, 2009, pp. 4377-4380.
    • (2009) Proc. ICASSP , pp. 4377-4380
    • Shinozaki, T.1    Kubota, Y.2    Furui, S.3
  • 6
    • 0032074539
    • Deleted strategy for MMI-based HMM training
    • N. S. Kim and C. K. Un, "Deleted strategy for MMI-based HMM training," IEEE Trans. Speech Audio Process., vol. 6, no. 3, pp. 299-303, May 1998.
    • (1998) IEEE Trans. Speech Audio Process. , vol.6 , Issue.3 , pp. 299-303
    • Kim, N.S.1    Un, C.K.2
  • 7
    • 35549000218
    • Cross-validation and aggregated EM training for robust parameter estimation
    • T. Shinozaki and M. Ostendorf, "Cross-validation and aggregated EM training for robust parameter estimation," Comput. Speech Lang., vol. 22, no. 2, pp. 185-195, 2008.
    • (2008) Comput. Speech Lang. , vol.22 , Issue.2 , pp. 185-195
    • Shinozaki, T.1    Ostendorf, M.2
  • 8
    • 51449090592
    • GMM and HMM training by aggregated EM algorithm with increased ensemble sizes for robust parameter estimation
    • T. Shinozaki and T. Kawahara, "GMM and HMM training by aggregated EM algorithm with increased ensemble sizes for robust parameter estimation," in Proc. ICASSP, 2008, pp. 4405-4408.
    • (2008) Proc. ICASSP , pp. 4405-4408
    • Shinozaki, T.1    Kawahara, T.2
  • 9
    • 78649234747
    • "Machine Learning Ensemble," [Online]. Available: http://en.wikipedia.org/wiki/Machine-learning-ensemble, Jun. 2009.
    • (2009) Machine Learning Ensemble
  • 10
    • 85135194048
    • Flexible speaker adaptation using maximum likelihood linear regression
    • C. J. Leggetter and P. C. Woodland, "Flexible speaker adaptation using maximum likelihood linear regression," in Proc. Eurospeech, 1995, pp. 1155-1158.
    • (1995) Proc. Eurospeech , pp. 1155-1158
    • Leggetter, C.J.1    Woodland, P.C.2
  • 11
    • 0028419019
    • Maximum a posteriori estimation for multi-variate Gaussian mixture observations of Markov chains
    • J. Gauvain and C. Lee, "Maximum a posteriori estimation for multi-variate Gaussian mixture observations of Markov chains," IEEE Trans. Speech Audio Process., vol. 2, no. 2, pp. 291-298, Apr. 1994.
    • (1994) IEEE Trans. Speech Audio Process. , vol.2 , Issue.2 , pp. 291-298
    • Gauvain, J.1    Lee, C.2
  • 12
    • 0002629270
    • Maximum likelihood from incomplete data via the EM algorithm
    • A. P. Dempster, N. M. Laird, and D. B. Rubin, "Maximum likelihood from incomplete data via the EM algorithm," J. R. Statist. Soc., Series B, vol. 39, no. 1, pp. 1-38, 1977.
    • (1977) J. R. Statist. Soc. , Issue.1 , pp. 1-38
    • Dempster, A.P.1    Laird, N.M.2    Rubin, D.B.3
  • 13
    • 33646798740
    • The IBM 2004 conversational telephony system for rich transcription
    • H. Soltau, B. Kingsbury, L. Mangu, D. Povey, G. Saon, and G. Zweig, "The IBM 2004 conversational telephony system for rich transcription," in Proc. ICASSP, 2005, vol. I, pp. 205-208.
    • (2005) Proc. ICASSP , vol.1 , pp. 205-208
    • Soltau, H.1    Kingsbury, B.2    Mangu, L.3    Povey, D.4    Saon, G.5    Zweig, G.6
  • 14
    • 0030638031
    • A post-processing system to yield reduced word error rates: Recognizer output voting error reduction (ROVER)
    • J. G. Fiscus, "A post-processing system to yield reduced word error rates: Recognizer output voting error reduction (ROVER)," in Proc. IEEE ASRU, 1997, pp. 347-352.
    • (1997) Proc. IEEE ASRU , pp. 347-352
    • Fiscus, J.G.1
  • 15
    • 4544253834
    • Posterior probability decoding, confidence estimation and system combination
    • G. Evermann and P. C. Woodland, "Posterior probability decoding, confidence estimation and system combination," in Proc. Speech Transcript. Workshop, 2000.
    • (2000) Proc. Speech Transcript. Workshop
    • Evermann, G.1    Woodland, P.C.2
  • 16
    • 3042854734
    • Benchmark test for speech recognition using the Corpus of Spontaneous Japanese
    • T. Kawahara, H. Nanjo, T. Shinozaki, and S. Furui, "Benchmark test for speech recognition using the Corpus of Spontaneous Japanese," in Proc. SSPR2003, 2003, pp. 135-138.
    • (2003) Proc. SSPR2003 , pp. 135-138
    • Kawahara, T.1    Nanjo, H.2    Shinozaki, T.3    Furui, S.4
  • 18
    • 0036296863
    • Minimum phone error and I-smoothing for improved discriminative training
    • D. Povey and P. Woodland, "Minimum phone error and I-smoothing for improved discriminative training," in Proc. ICASSP, 2002, vol. I, pp. 105-108.
    • (2002) Proc. ICASSP , vol.1 , pp. 105-108
    • Povey, D.1    Woodland, P.2
  • 19
    • 0019053271
    • Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences
    • S. Davis and P. Mermelstein, "Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences," IEEE Trans. Acoust., Speech, Signal Process., vol. 28, no. 4, pp. 357-366, Aug. 1980.
    • (1980) IEEE Trans. Acoust., Speech, Signal Process. , vol.28 , Issue.4 , pp. 357-366
    • Davis, S.1    Mermelstein, P.2
  • 20
    • 0025041264
    • Perceptual linear predictive (PLP) analysis of speech
    • H. Hermansky, "Perceptual linear predictive (PLP) analysis of speech," J. Acoust. Soc. Amer., vol. 87, no. 4, pp. 1738-1752, 1990.
    • (1990) J. Acoust. Soc. Amer. , vol.87 , Issue.4 , pp. 1738-1752
    • Hermansky, H.1
  • 21
    • 0022667694
    • Speaker-independent isolated word recognition using dynamic features of speech spectrum
    • S. Furui, "Speaker-independent isolated word recognition using dynamic features of speech spectrum," IEEE Trans. Acoust., Speech, Signal Process., vol. ASSP-34, no. 1, pp. 52-59, Feb. 1986.
    • (1986) IEEE Trans. Acoust., Speech, Signal Process. , vol.ASSP-34 , Issue.1 , pp. 52-59
    • Furui, S.1
  • 22
    • 78049411030
    • Physical parameter analysis of Japanese speech at automobile driving act, and analysis of correlation between the parameter and accuracy
    • T. Kato, J. Okamoto, and M. Shozakai, "Physical parameter analysis of Japanese speech at automobile driving act, and analysis of correlation between the parameter and accuracy," in Autumn Meeting Acoust. Soc. Jpn. (in Japanese), 2007, vol. 3-Q-25, pp. 267-268.
    • (2007) Autumn Meeting Acoust. Soc. Jpn. , vol.3 Q-25 , pp. 267-268
    • Kato, T.1    Okamoto, J.2    Shozakai, M.3
  • 23
    • 0032668522
    • On recent speech corpora activities in Japan
    • S. Itahashi, "On recent speech corpora activities in Japan," J. Acoust. Soc. Jpn. (E), vol. 20, no. 3, pp. 163-169, 1999.
    • (1999) J. Acoust. Soc. Jpn. (E) , vol.20 , Issue.3 , pp. 163-169
    • Itahashi, S.1
  • 24
    • 0018455310
    • Suppression of acoustic noise in speech using spectral subtraction
    • S. F. Boll, "Suppression of acoustic noise in speech using spectral subtraction," IEEE Trans. Acoust., Speech, Signal Process., vol. ASSP-27, no. 2, pp. 113-120, Apr. 1979.
    • (1979) IEEE Trans. Acoust., Speech, Signal Process. , vol.ASSP-27 , Issue.2 , pp. 113-120
    • Boll, S.F.1
  • 25
    • 44849131087
    • The TITECH large vocabulary WFST speech recognition system
    • P. R. Dixon, D. A. Caseiro, T. Oonishi, and S. Furui, "The TITECH large vocabulary WFST speech recognition system," in Proc. IEEE ASRU, 2007, pp. 443-448.
    • (2007) Proc. IEEE ASRU , pp. 443-448
    • Dixon, P.R.1    Caseiro, D.A.2    Oonishi, T.3    Furui, S.4
  • 26
    • 85128340946
    • An efficient two-pass search algorithm using word trellis index
    • A. Lee, T. Kawahara, and S. Doshita, "An efficient two-pass search algorithm using word trellis index," in Proc. ICSLP, 1998, pp. 1831-1834.
    • (1998) Proc. ICSLP , pp. 1831-1834
    • Lee, A.1    Kawahara, T.2    Doshita, S.3
  • 27
    • 0003822743
    • Cambridge, U.K.: Cambridge Univ. Eng. Dept.
    • S. Young et al., The HTK Book. Cambridge, U.K.: Cambridge Univ. Eng. Dept., 2005.
    • (2005) The HTK Book
    • Young, S.1
  • 28
    • 0024909979
    • Some statistical issues in the comparison of speech recognition algorithms
    • L. Gillick and S. Cox, "Some statistical issues in the comparison of speech recognition algorithms," in Proc. ICASSP, 1989, pp. 532-535.
    • (1989) Proc. ICASSP , pp. 532-535
    • Gillick, L.1    Cox, S.2
  • 29
    • 15844411850
    • Confidence measures for speech recognition: A survey
    • H. Jiang, "Confidence measures for speech recognition: A survey," Speech Commun., vol. 45, no. 4, pp. 455-470, 2005.
    • (2005) Speech Commun. , vol.45 , Issue.4 , pp. 455-470
    • Jiang, H.1
  • 30
    • 0034841730
    • Investigating lightly supervised acoustic model training
    • L. Lamel, J. Gauvain, and G. Adda, "Investigating lightly supervised acoustic model training," in Proc. ICASSP, 2001, vol. 1, pp. 477-480.
    • (2001) Proc. ICASSP , vol.1 , pp. 477-480
    • Lamel, L.1    Gauvain, J.2    Adda, G.3


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.