Volume 48, Issue 5, 2006, Pages 532-548

Boosting HMM acoustic models in large vocabulary speech recognition

Author keywords

Acoustic model training; AdaBoost; Automatic speech recognition; Boosting; Machine learning; Spontaneous speech

Indexed keywords

ALGORITHMS; CLASSIFICATION (OF INFORMATION); DECODING; LEARNING SYSTEMS; MARKOV PROCESSES; SPEECH RECOGNITION; VOCABULARY CONTROL

EID: 33645989784     PISSN: 01676393     EISSN: None     Source Type: Journal
DOI: 10.1016/j.specom.2005.09.009     Document Type: Article
Times cited: 23

References (34)
  • 1
    • 33646015592
    • Aubert, X., 1999. One pass cross word decoding for large vocabularies based on a lexical tree search organization. In: Proc. EUROSPEECH-99, Budapest, Hungary, pp. 1559-1562.
  • 2
    • 85009112413
    • Aubert, X., Blasig, R., 2000. Combined acoustic and linguistic look-ahead for one-pass time-synchronous decoding. In: Proc. Internat. Conf. on Spoken Language Processing (ICSLP-00), Vol. 3, Beijing, China, pp. 802-805.
  • 3
    • 0022890536
    • Bahl, L.R., Brown, P.F., de Souza, P.V., Mercer, R.L., 1986. Maximum mutual information estimation of hidden Markov model parameters for speech recognition. In: Proc. Internat. Conf. on Acoustics, Speech and Signal Processing (ICASSP-86), Tokyo, pp. 49-52.
  • 4
    • 85009143785
    • Beyerlein, P., Aubert, X., Harris, M., Meyer, C., Schramm, H., 2001. Investigations on conversational speech recognition. In: Proc. EUROSPEECH-01, Aalborg, Denmark, pp. 499-503.
  • 5
    • 0036567736
    • Beyerlein, P., Aubert, X., Haeb-Umbach, R., Harris, M., et al., 2002. Large vocabulary continuous speech recognition of broadcast news-The Philips/RWTH approach. Speech Commun. 37, 109-137.
  • 6
    • 33645996595
    • Collins, M., 2000. Discriminative reranking for natural language parsing. In: Proc. Seventeenth Internat. Conf. on Machine Learning (ICML-00), Stanford, USA.
  • 7
    • 33645964182
    • Collins, M., 2002. Ranking algorithms for named-entity extraction: boosting and the voted perceptron. In: Proc. ACL 2002.
  • 8
    • 0030351194
    • Cook, G.D., Robinson, A.J., 1996. Boosting the performance of connectionist large vocabulary speech recognition. In: Proc. Internat. Conf. on Spoken Language Processing (ICSLP-96), Philadelphia, PA, USA, pp. 1305-1308.
  • 9
    • 0019053271
    • Davis, S.B., Mermelstein, P., 1980. Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences. IEEE Trans. Acoust. Speech Signal Process. 28, 357-366.
  • 10
    • 4544236424
    • Dimitrakakis, D., Bengio, S., 2004. Boosting HMMs with an application to speech recognition. In: Proc. Internat. Conf. on Acoustics, Speech and Signal Processing (ICASSP-04), Montreal, Canada.
  • 11
    • 0031211090
    • Freund, Y., Schapire, R.E., 1997. A decision-theoretic generalization of on-line learning and an application to boosting. J. Comput. Syst. Sci. 55, 119-139.
  • 12
    • 85016587886
    • Godfrey, J., Holliman, E., McDaniel, J., 1992. SWITCHBOARD: Telephone speech corpus for Research and Development. In: Proc. Internat. Conf. on Acoustics, Speech and Signal Processing (ICASSP-92), San Francisco, CA, USA.
  • 13
    • 0026982122
    • Juang, B.H., Katagiri, S., 1992. Discriminative learning for minimum error classification. IEEE Trans. Signal Process. 40, 3043-3054.
  • 14
    • 0036104545
    • Koltchinskii, V., Panchenko, D., 2002. Empirical margin distributions and bounding the generalization error of combined classifiers. Ann. Statist. 30 (1).
  • 15
    • 0036293851
    • Meyer, C., 2002. Utterance-level boosting of HMM speech recognizers. In: Proc. Internat. Conf. on Acoustics, Speech and Signal Processing (ICASSP-02), Orlando, FL, USA, pp. 109-112.
  • 16
    • 33645961198
    • Meyer, C., Beyerlein, P., 2002. Towards "Large Margin" speech recognizers by boosting and discriminative training. In: Proc. Nineteenth Internat. Conf. on Machine Learning (ICML-02), Sydney, Australia, pp. 419-426.
  • 17
    • 85009100592
    • Meyer, C., Rose, G., 2000. Rival training: efficient use of data in discriminative training. In: Proc. Internat. Conf. on Spoken Language Processing (ICSLP-00), Beijing, pp. 632-635.
  • 18
    • 10444240175
    • Meyer, C., Schramm, H., 2004. Boosting acoustic models in large vocabulary speech recognition. In: Proc. Sixth IASTED Internat. Conf. on Signal and Image Processing (SIP-2004), Honolulu, Hawaii, USA.
  • 19
    • 33645967081
    • Peters, J., 2003. LM studies on filled pauses in spontaneous medical dictation. In: Proc. Human Language Technology Conf. (HLT-NAACL 2003), short papers, Edmonton, Alberta, Canada, pp. 82-84.
  • 20
    • 84955368642
    • Raetsch, G., 2003. Robust multi-class boosting. In: Proc. EUROSPEECH-03, Geneva, Switzerland, pp. 997-1000.
  • 21
    • 33645969809
    • Rüber, B., 1997. Obtaining confidence measures from sentence probabilities. In: Proc. EUROSPEECH-97, Rhodes, Greece, pp. 739-742.
  • 22
    • 0025448521
    • Schapire, R.E., 1990. The strength of weak learnability. Mach. Learn. 5, 197-227.
  • 24
    • 0033281701
    • Schapire, R.E., Singer, Y., 1999. Improved Boosting Algorithms Using Confidence-rated Predictions. Mach. Learn. 37 (3), 297-336.
  • 25
    • 0032280519
    • Schapire, R.E., Freund, Y., Bartlett, P., Lee, W.S., 1998. Boosting the margin: a new explanation of the effectiveness of voting methods. Ann. Statist. 26, 1651-1686.
  • 26
    • 33646001009
    • Schlüter, R., Müller, R., Wessel, F., Ney, H., 1999. Interdependence of language models and discriminative training. In: Proc. IEEE Automatic Speech Recognition and Understanding Workshop (ASRU-99), Keystone, Colorado, pp. 119-122.
  • 27
    • 0033693213
    • Schramm, H., Aubert, X., 2000. Efficient integration of multiple pronunciations in a large vocabulary decoder. In: Proc. Internat. Conf. on Acoustics, Speech and Signal Processing (ICASSP-00), Istanbul, Turkey, Vol. 3, pp. 1659-1662.
  • 28
    • 33645958171
    • Schramm, H., Aubert, X., Meyer, C., Peters, J., 2003. Filled-pause modeling for medical transcriptions. In: Proc. ISCA & IEEE Workshop on Spontaneous Speech Processing and Recognition (SSPR 2003), Tokyo, Japan, pp. 143-146.
  • 29
    • 0032639912
    • Schwenk, H., 1999. Using boosting to improve a Hybrid HMM/Neural Network speech recognizer. In: Proc. Internat. Conf. on Acoustics, Speech and Signal Processing (ICASSP-99), Phoenix, AZ, pp. 1009-1012.
  • 30
    • 0021518106
    • Valiant, L.G., 1984. A theory of the learnable. Commun. ACM 27 (11), 1134-1142.
  • 31
    • 33645991416
    • Woodland, P.C., Povey, D., 2000. Large scale discriminative training for speech recognition. In: Proc. ASR 2000 Conference-Automatic Speech Recognition: Challenges for the new Millenium, Paris, France, pp. 7-16.
  • 32
    • 85009198106
    • Zhang, R., Rudnicky, A.I., 2003. Comparative study of boosting and non-boosting training for constructing ensembles of acoustic models. In: Proc. EUROSPEECH-03, Geneva, Switzerland, Vol. III, pp. 1885-1888.
  • 33
    • 0033677215
    • Zheng, J., Franco, H., Weng, F., Sankar, A., Bratt, H., 2000. Word-level rate of speech modeling using rate-specific phones and pronunciations. In: Proc. Internat. Conf. on Acoustics, Speech and Signal Processing (ICASSP-00), Istanbul, Turkey, Vol. III, pp. 1775-1778.
  • 34
    • 0033693373
    • Zweig, G., Padmanabhan, M., 2000. Boosting Gaussian mixtures in an LVCSR system. In: Proc. Internat. Conf. on Acoustics, Speech and Signal Processing (ICASSP-00), Istanbul, Turkey, pp. 1527-1530.


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.