Volume 18, Issue 6, 2010, Pages 1539-1549

Statistical transformation of language and pronunciation models for spontaneous speech recognition

Author keywords

Automatic speech recognition (ASR); Language model (LM); Pronunciation model; Spontaneous speech; Statistical transformation

Indexed keywords

AUTOMATIC SPEECH RECOGNITION; LANGUAGE MODEL; PRONUNCIATION MODEL; SPONTANEOUS SPEECH; STATISTICAL TRANSFORMATION

EID: 77955729683     PISSN: 15587916     EISSN: None     Source Type: Journal    
DOI: 10.1109/TASL.2009.2037400     Document Type: Article
Times cited: 22

References (32)
  • 2
    • 0034847002
    • The 1998 HTK system for transcription of conversational telephone speech
    • T. Hain, P. Woodland, T. Niesler, and E. Whittaker, "The 1998 HTK system for transcription of conversational telephone speech," in Proc. ICASSP, 1999, pp. 57-60.
    • (1999) Proc. ICASSP , pp. 57-60
    • Hain, T.1    Woodland, P.2    Niesler, T.3    Whittaker, E.4
  • 4
    • 44849090969
    • Recognition and understanding of meetings: The AMI and AMIDA projects
    • S. Renals, T. Hain, and H. Bourlard, "Recognition and understanding of meetings: The AMI and AMIDA projects," in Proc. ASRU, 2007, pp. 238-247.
    • (2007) Proc. ASRU , pp. 238-247
    • Renals, S.1    Hain, T.2    Bourlard, H.3
  • 5
    • 85009067726
    • Toward the realization of spontaneous speech recognition-Introduction of a Japanese priority program and preliminary results
    • S. Furui, K. Maekawa, and H. Isahara, "Toward the realization of spontaneous speech recognition-Introduction of a Japanese priority program and preliminary results," in Proc. ICSLP, 2000, pp. 518-521.
    • (2000) Proc. ICSLP , pp. 518-521
    • Furui, S.1    Maekawa, K.2    Isahara, H.3
  • 6
    • 0141591531
    • Language modeling and transcription of the TED corpus lectures
    • E. Leeuwis, M. Federico, and M. Cettolo, "Language modeling and transcription of the TED corpus lectures," in Proc. ICASSP, 2003, pp. 232-235.
    • (2003) Proc. ICASSP , pp. 232-235
    • Leeuwis, E.1    Federico, M.2    Cettolo, M.3
  • 9
    • 51449113481
    • Automatic lecture transcription by exploiting presentation slide information for language model adaptation
    • T. Kawahara, Y. Nemoto, and Y. Akita, "Automatic lecture transcription by exploiting presentation slide information for language model adaptation," in Proc. ICASSP, 2008, pp. 4929-4932.
    • (2008) Proc. ICASSP , pp. 4929-4932
    • Kawahara, T.1    Nemoto, Y.2    Akita, Y.3
  • 10
  • 14
    • 4544316882
    • Advances in the automatic transcription of lectures
    • M. Cettolo, F. Brugnara, and M. Federico, "Advances in the automatic transcription of lectures," in Proc. ICASSP, 2004, pp. 769-772.
    • (2004) Proc. ICASSP , pp. 769-772
    • Cettolo, M.1    Brugnara, F.2    Federico, M.3
  • 15
    • 85044611587
    • The mathematics of statistical machine translation: Parameter estimation
    • P. Brown, S. Pietra, V. Pietra, and R. Mercer, "The mathematics of statistical machine translation: Parameter estimation," Comput. Linguist., vol.19, no.2, pp. 263-311, 1993.
    • (1993) Comput. Linguist. , vol.19 , Issue.2 , pp. 263-311
    • Brown, P.1    Pietra, S.2    Pietra, V.3    Mercer, R.4
  • 17
    • 34547522348
    • Reconstructing medical dictations from automatically recognized and non-literal transcripts with phonetic similarity matching
    • S. Petrik and G. Kubin, "Reconstructing medical dictations from automatically recognized and non-literal transcripts with phonetic similarity matching," in Proc. ICASSP, 2007, vol.4, pp. 1125-1128.
    • (2007) Proc. ICASSP , vol.4 , pp. 1125-1128
    • Petrik, S.1    Kubin, G.2
  • 18
    • 0141480041
    • Language model adaptation using WFST-based speaking-style translation
    • T. Hori, D. Willett, and Y. Minami, "Language model adaptation using WFST-based speaking-style translation," in Proc. ICASSP, 2003, vol.1, pp. 228-231.
    • (2003) Proc. ICASSP , vol.1 , pp. 228-231
    • Hori, T.1    Willett, D.2    Minami, Y.3
  • 19
    • 4043075534
    • Extended models and tools for high-performance part-of-speech tagger
    • M. Asahara and Y. Matsumoto, "Extended models and tools for high-performance part-of-speech tagger," in Proc. COLING, 2000, pp. 21-27.
    • (2000) Proc. COLING , pp. 21-27
    • Asahara, M.1    Matsumoto, Y.2
  • 20
    • 0002652285
    • A maximum entropy approach to natural language processing
    • A. Berger, V. Della Pietra, and S. Della Pietra, "A maximum entropy approach to natural language processing," Comput. Linguist., vol.22, no.1, pp. 39-71, 1996.
    • (1996) Comput. Linguist. , vol.22 , Issue.1 , pp. 39-71
    • Berger, A.1    Della Pietra, V.2    Della Pietra, S.3
  • 21
    • 0030351374
    • On designing pronunciation lexicons for large vocabulary, continuous speech recognition
    • L. Lamel and G. Adda, "On designing pronunciation lexicons for large vocabulary, continuous speech recognition," in Proc. ICSLP, 1996, pp. 6-9.
    • (1996) Proc. ICSLP , pp. 6-9
    • Lamel, L.1    Adda, G.2
  • 22
    • 0030363039
    • Dictionary learning for spontaneous speech recognition
    • T. Sloboda and A. Waibel, "Dictionary learning for spontaneous speech recognition," in Proc. ICSLP, 1996, pp. 2328-2331.
    • (1996) Proc. ICSLP , pp. 2328-2331
    • Sloboda, T.1    Waibel, A.2
  • 23
    • 3042704466
    • Language model and speaking rate adaptation for spontaneous presentation speech recognition
    • H. Nanjo and T. Kawahara, "Language model and speaking rate adaptation for spontaneous presentation speech recognition," IEEE Trans. Speech Audio Process., vol.12, no.4, pp. 391-400, Jul. 2004.
    • (2004) IEEE Trans. Speech Audio Process. , vol.12 , Issue.4 , pp. 391-400
    • Nanjo, H.1    Kawahara, T.2
  • 25
    • 0033077780
    • Automatic generation of multiple pronunciations based on neural networks
    • T. Fukada, T. Yoshimura, and Y. Sagisaka, "Automatic generation of multiple pronunciations based on neural networks," Speech Communication, vol.27, pp. 63-73, 1999.
    • (1999) Speech Communication , vol.27 , pp. 63-73
    • Fukada, T.1    Yoshimura, T.2    Sagisaka, Y.3
  • 26
    • 0030672090
    • Automatic alternative transcription generation and vocabulary selection for flexible word recognizers
    • D. Torre, L. Villarrubia, J. Elvira, and L. Hernandez-Gomez, "Automatic alternative transcription generation and vocabulary selection for flexible word recognizers," in Proc. ICASSP, 1997, pp. 1463-1466.
    • (1997) Proc. ICASSP , pp. 1463-1466
    • Torre, D.1    Villarrubia, L.2    Elvira, J.3    Hernandez-Gomez, L.4
  • 29
    • 0029725604
    • A parametric approach to vocal tract length normalization
    • E. Eide and H. Gish, "A parametric approach to vocal tract length normalization," in Proc. ICASSP, 1996, vol.1, pp. 346-349.
    • (1996) Proc. ICASSP , vol.1 , pp. 346-349
    • Eide, E.1    Gish, H.2
  • 30
    • 0029747183
    • Speaker normalization using efficient frequency warping procedures
    • L. Lee and R. Rose, "Speaker normalization using efficient frequency warping procedures," in Proc. ICASSP, 1996, vol.1, pp. 353-356.
    • (1996) Proc. ICASSP , vol.1 , pp. 353-356
    • Lee, L.1    Rose, R.2
  • 31
    • 0036296863
    • Minimum phone error and I-smoothing for improved discriminative training
    • D. Povey and P. Woodland, "Minimum phone error and I-smoothing for improved discriminative training," in Proc. ICASSP, 2002, vol.1, pp. 105-108.
    • (2002) Proc. ICASSP , vol.1 , pp. 105-108
    • Povey, D.1    Woodland, P.2
  • 32
    • 33646809034
    • Generalized statistical modeling of pronunciation variations using variable-length phone context
    • Y. Akita and T. Kawahara, "Generalized statistical modeling of pronunciation variations using variable-length phone context," in Proc. ICASSP, 2005, vol.1, pp. 689-692.
    • (2005) Proc. ICASSP , vol.1 , pp. 689-692
    • Akita, Y.1    Kawahara, T.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.