SCOPUS Information Retrieval Platform

IEEE Transactions on Speech and Audio Processing

Volume 12, Issue 4, 2004, Pages 391-400

Language model and speaking rate adaptation for spontaneous presentation speech recognition

(2) Nanjo, Hiroaki ᵃ; Kawahara, Tatsuya ᵃ

ᵃ Kyoto University (Japan)

Author keywords

Acoustic modeling; Language model adaptation; Pronunciation modeling; Speaking rate; Spontaneous speech recognition

Indexed keywords

ACOUSTICS; CONTEXT FREE GRAMMARS; DATABASE SYSTEMS; DECODING; FORMAL LANGUAGES; HUMAN COMPUTER INTERACTION; MATHEMATICAL MODELS; SPEECH SYNTHESIS;

ACOUSTIC MODELING; LANGUAGE MODEL ADAPTATION; PRONUNCIATION MODELING; SPEAKING RATES; SPONTANEOUS SPEECH RECOGNITION;

SPEECH RECOGNITION;

EID: 3042704466 PISSN: 10636676 EISSN: None Source Type: Journal
DOI: 10.1109/TSA.2004.828641 Document Type: Conference Paper

Times cited : (42)

References (37)

1
- 3042730370
- Recent advances in spontaneous speech recognition and understanding
- S. Furui, "Recent advances in spontaneous speech recognition and understanding," in Proc. ISCA & IEEE Workshop on Spontaneous Speech Processing and Recognition (SSPR), 2003, pp. 1-6.
- (2003) Proc. ISCA & IEEE Workshop on Spontaneous Speech Processing and Recognition (SSPR) , pp. 1-6
- Furui, S.¹

2
- 3042777118
- Corpus of spontaneous Japanese: Its design and evaluation
- K. Maekawa, "Corpus of spontaneous Japanese: its design and evaluation," in Proc. ISCA & IEEE Workshop on Spontaneous Speech Processing and Recognition (SSPR), 2003, pp. 7-12.
- (2003) Proc. ISCA & IEEE Workshop on Spontaneous Speech Processing and Recognition (SSPR) , pp. 7-12
- Maekawa, K.¹

3
- 0036298775
- Analysis on individual differences in automatic transcription of spontaneous presentations
- T. Shinozaki and S. Furui, "Analysis on individual differences in automatic transcription of spontaneous presentations," in Proc. IEEE Int. Conf. Acoustics, Speech, and Signal Processing (ICASSP), vol. 1, 2002, pp. 729-732.
- (2002) Proc. IEEE Int. Conf. Acoustics, Speech, and Signal Processing (ICASSP) , vol.1 , pp. 729-732
- Shinozaki, T.¹ Furui, S.²

4
- 3042854734
- Benchmark test for speech recognition using the Corpus of spontaneous Japanese
- T. Kawahara, H. Nanjo, T. Shinozaki, and S. Furui, "Benchmark test for speech recognition using the Corpus of spontaneous Japanese," in Proc. ISCA and IEEE Workshop on Spontaneous Speech Processing and Recognition (SSPR), 2003, pp. 135-138.
- (2003) Proc. ISCA and IEEE Workshop on Spontaneous Speech Processing and Recognition (SSPR) , pp. 135-138
- Kawahara, T.¹ Nanjo, H.² Shinozaki, T.³ Furui, S.⁴

5
- 84892146708
- Topic adaptation for language modeling using unnormalized exponential models
- S. F. Chen, K. Seymore, and R. Rosenfeld, "Topic adaptation for language modeling using unnormalized exponential models," in Proc. IEEE Int. Conf. Acoustics, Speech, and Signal Processing (ICASSP), vol. 2, 1998, pp. 681-684.
- (1998) Proc. IEEE Int. Conf. Acoustics, Speech, and Signal Processing (ICASSP) , vol.2 , pp. 681-684
- Chen, S.F.¹ Seymore, K.² Rosenfeld, R.³

6
- 0032654492
- Improved topic-dependent language modeling using information retrieval techniques
- M. Mahajan, D. Beeferman, and X. D. Huang, "Improved topic-dependent language modeling using information retrieval techniques," in Proc. IEEE Int. Conf. Acoustics, Speech, and Signal Process. (ICASSP), vol. 1, 1999, pp. 541-544.
- (1999) Proc. IEEE Int. Conf. Acoustics, Speech, and Signal Process. (ICASSP) , vol.1 , pp. 541-544
- Mahajan, M.¹ Beeferman, D.² Huang, X.D.³

7
- 0033677215
- Word-level rate of speech modeling using rate-specific phones and pronunciations
- J. Zheng, H. Franco, and F. Weng, "Word-level rate of speech modeling using rate-specific phones and pronunciations," in Proc. IEEE Int. Conf. Acoustics, Speech, and Signal Processing (ICASSP), 2000, pp. 1775-1778.
- (2000) Proc. IEEE Int. Conf. Acoustics, Speech, and Signal Processing (ICASSP) , pp. 1775-1778
- Zheng, J.¹ Franco, H.² Weng, F.³

8
- 85135173867
- Speech recognition using on-line estimation of speaking rate
- N. Morgan, E. Fosler, and N. Mirghafori, "Speech recognition using on-line estimation of speaking rate," in Proc. European Conf. Speech Communication and Technology (EUROSPEECH), 1997, pp. 2079-2082.
- (1997) Proc. European Conf. Speech Communication and Technology (EUROSPEECH) , pp. 2079-2082
- Morgan, N.¹ Fosler, E.² Mirghafori, N.³

9
- 0034848039
- Duration normalization for improved recognition of spontaneous and read speech via missing feature methods
- J. Nedel and R. Stern, "Duration normalization for improved recognition of spontaneous and read speech via missing feature methods," in Proc. IEEE Int. Conf. Acoustics, Speech, and Signal Processing (ICASSP), vol. 1, 2001, pp. 313-316.
- (2001) Proc. IEEE Int. Conf. Acoustics, Speech, and Signal Processing (ICASSP) , vol.1 , pp. 313-316
- Nedel, J.¹ Stern, R.²

10
- 33646820149
- Improvements on speech recognition for fast talkers
- M. Richardson, M. Hwang, A. Acero, and X. D. Huang, "Improvements on speech recognition for fast talkers," in Proc. European Conf. Speech Communication and Technology (EUROSPEECH), 1999, pp. 411-414.
- (1999) Proc. European Conf. Speech Communication and Technology (EUROSPEECH) , pp. 411-414
- Richardson, M.¹ Hwang, M.² Acero, A.³ Huang, X.D.⁴

11
- 85027454087
- Speaking mode dependent pronunciation modeling in large vocabulary conversational speech recognition
- M. Finke and A. Waibel, "Speaking mode dependent pronunciation modeling in large vocabulary conversational speech recognition," in Proc. European Conf. Speech Communication and Technology (EUROSPEECH), 1997, pp. 2379-2382.
- (1997) Proc. European Conf. Speech Communication and Technology (EUROSPEECH) , pp. 2379-2382
- Finke, M.¹ Waibel, A.²

12
- 0347605862
- Multi-level decision trees for static and dynamic pronunciation models
- E. Fosler-Lussier, "Multi-level decision trees for static and dynamic pronunciation models," in Proc. European Conf. Speech Commun. & Tech. (EUROSPEECH), 1999, pp. 463-466.
- (1999) Proc. European Conf. Speech Commun. & Tech. (EUROSPEECH) , pp. 463-466
- Fosler-Lussier, E.¹

13
- 85118743743
- Statistical language modeling using the CMU-Cambridge toolkit
- P. R. Clarkson and R. Rosenfeld, "Statistical language modeling using the CMU-Cambridge toolkit," in Proc. European Conf. Speech Communication and Technology (EUROSPEECH), 1997, pp. 2707-2710.
- (1997) Proc. European Conf. Speech Communication and Technology (EUROSPEECH) , pp. 2707-2710
- Clarkson, P.R.¹ Rosenfeld, R.²

14
- 3042736433
- Morphological analysis of Corpus of spontaneous Japanese
- K. Uchimoto, C. Nobata, A. Yamada, S. Sekine, and H. Isahara, "Morphological analysis of Corpus of spontaneous Japanese," in Proc. ISCA and IEEE Workshop on Spontaneous Speech Processing and Recognition (SSPR), 2003, pp. 159-162.
- (2003) Proc. ISCA and IEEE Workshop on Spontaneous Speech Processing and Recognition (SSPR) , pp. 159-162
- Uchimoto, K.¹ Nobata, C.² Yamada, A.³ Sekine, S.⁴ Isahara, H.⁵

15
- 0033721605
- A new phonetic tied-mixture model for efficient decoding
- A. Lee, T. Kawahara, K. Takeda, and K. Shikano, "A new phonetic tied-mixture model for efficient decoding," in Proc. IEEE Int. Conf. Acoustics, Speech, and Signal Processing (ICASSP), 2000, pp. 1269-1272.
- (2000) Proc. IEEE Int. Conf. Acoustics, Speech, and Signal Processing (ICASSP) , pp. 1269-1272
- Lee, A.¹ Kawahara, T.² Takeda, K.³ Shikano, K.⁴

16
- 85009103112
- Continuous speech recognition consortium - An open repository for CSR tools and models
- A. Lee, T. Kawahara, K. Takeda, M. Mimura, A. Yamada, A. Ito, K. Itou, and K. Shikano, "Continuous speech recognition consortium - an open repository for CSR tools and models," in Proc. Int. Conf. Language Resources and Evaluation (LREC2002), 2002, pp. 1438-1441.
- (2002) Proc. Int. Conf. Language Resources and Evaluation (LREC2002) , pp. 1438-1441
- Lee, A.¹ Kawahara, T.² Takeda, K.³ Mimura, M.⁴ Yamada, A.⁵ Ito, A.⁶ Itou, K.⁷ Shikano, K.⁸

17
- 0032639915
- Improvements in recognition of conversational telephone speech
- B. Peskin, M. Newman, D. McAllaster, V. Nagesha, H. Richards, S. Wegmann, M. Hunt, and L. Gillick, "Improvements in recognition of conversational telephone speech," in Proc. IEEE Int. Conf. Acoustics, Speech, and Signal Processing (ICASSP), vol. 1, 1999, pp. 53-56.
- (1999) Proc. IEEE Int. Conf. Acoustics, Speech, and Signal Processing (ICASSP) , vol.1 , pp. 53-56
- Peskin, B.¹ Newman, M.² McAllaster, D.³ Nagesha, V.⁴ Richards, H.⁵ Wegmann, S.⁶ Hunt, M.⁷ Gillick, L.⁸

18
- 0033693213
- Efficient integration of multiple pronunciations in a large vocabulary decoder
- H. Schramm and X. Aubert, "Efficient integration of multiple pronunciations in a large vocabulary decoder," in Proc. IEEE Int. Conf. Acoustics, Speech, and Signal Processing (ICASSP), vol. 3, 2000, pp. 1659-1662.
- (2000) Proc. IEEE Int. Conf. Acoustics, Speech, and Signal Processing (ICASSP) , vol.3 , pp. 1659-1662
- Schramm, H.¹ Aubert, X.²

19
- 0033318198
- Improving the performance of a Dutch CSR by modeling within-word and cross-word pronunciation variation
- J. M. Kessens, M. Wester, and H. Strik, "Improving the performance of a Dutch CSR by modeling within-word and cross-word pronunciation variation," Speech Commun., vol. 29, no. 2-4, pp. 193-207, 1999.
- (1999) Speech Commun. , vol.29 , Issue.2-4 , pp. 193-207
- Kessens, J.M.¹ Wester, M.² Strik, H.³

20
- 0030376346
- Improved spontaneous dialogue recognition using dialogue and utterance triggers by adaptive probability boosting
- Philadelphia, PA
- R. R. Sarukkai and D. H. Ballard, "Improved spontaneous dialogue recognition using dialogue and utterance triggers by adaptive probability boosting," in Proc. Int. Conf. Spoken Language Processing (ICSLP), vol. 1, Philadelphia, PA, 1996, pp. 208-211.
- (1996) Proc. Int. Conf. Spoken Language Processing (ICSLP) , vol.1 , pp. 208-211
- Sarukkai, R.R.¹ Ballard, D.H.²

21
- 0030369272
- Modeling long distance dependence in language: Topic mixtures vs. dynamic cache models
- Philadelphia, PA
- R. Iyer and M. Ostendorf, "Modeling long distance dependence in language: Topic mixtures vs. dynamic cache models," in Proc. Int. Conf. Spoken Language Processing (ICSLP), vol. 1, Philadelphia, PA, 1996, pp. 236-239.
- (1996) Proc. Int. Conf. Spoken Language Processing (ICSLP) , vol.1 , pp. 236-239
- Iyer, R.¹ Ostendorf, M.²

22
- 0030715425
- Language model adaptation using mixtures and an exponentially decaying cache
- P. Clarkson and A. J. Robinson, "Language model adaptation using mixtures and an exponentially decaying cache," in Proc. IEEE Int. Conf. Acoust., Speech & Signal Process. (ICASSP), vol. 2, 1997, pp. 799-802.
- (1997) Proc. IEEE Int. Conf. Acoust., Speech & Signal Process. (ICASSP) , vol.2 , pp. 799-802
- Clarkson, P.¹ Robinson, A.J.²

23
- 84962808249
- Automatic transcription of lecture speech using topic-independent language modeling
- K. Kato, H. Nanjo, and T. Kawahara, "Automatic transcription of lecture speech using topic-independent language modeling," in Proc. Int. Conf. Spoken Language Processing (ICSLP), vol. 1, 2000, pp. 162-165.
- (2000) Proc. Int. Conf. Spoken Language Processing (ICSLP) , vol.1 , pp. 162-165
- Kato, K.¹ Nanjo, H.² Kawahara, T.³

24
- 85009083340
- Using information retrieval methods for language model adaptation
- L. Chen, J. L. Gauvain, L. Lamel, G. Adda, and M. Adda, "Using information retrieval methods for language model adaptation," in Proc. Eur. Conf. Speech Communication and Technology (EUROSPEECH), 2001, pp. 255-258.
- (2001) Proc. Eur. Conf. Speech Communication and Technology (EUROSPEECH) , pp. 255-258
- Chen, L.¹ Gauvain, J.L.² Lamel, L.³ Adda, G.⁴ Adda, M.⁵

25
- 85009274873
- Unsupervised language model adaptation for lecture speech transcription
- Denver, CO
- T. Niesler and D. Willett, "Unsupervised language model adaptation for lecture speech transcription," in Proc. Int. Conf. Spoken Language Processing (ICSLP), Denver, CO, 2002, pp. 1413-1416.
- (2002) Proc. Int. Conf. Spoken Language Processing (ICSLP) , pp. 1413-1416
- Niesler, T.¹ Willett, D.²

26
- 85009062702
- Toward automatic transcription of spontaneous presentations
- T. Shinozaki and S. Furui, "Toward automatic transcription of spontaneous presentations," in Proc. European Conf. Speech Commun. & Tech. (EUROSPEECH), 2001, pp. 491-494.
- (2001) Proc. European Conf. Speech Commun. & Tech. (EUROSPEECH) , pp. 491-494
- Shinozaki, T.¹ Furui, S.²

27
- 85009250844
- Speaking rate compensation based on likelihood criterion in acoustic model training and decoding
- K. Okuda, T. Kawahara, and S. Nakamura, "Speaking rate compensation based on likelihood criterion in acoustic model training and decoding," in Proc. Int. Conf. Spoken Language Processing (ICSLP), 2002, pp. 2589-2592.
- (2002) Proc. Int. Conf. Spoken Language Processing (ICSLP) , pp. 2589-2592
- Okuda, K.¹ Kawahara, T.² Nakamura, S.³

28
- 0030376403
- A fast and reliable rate of speech detector
- J. P. Verhasselt and J. P. Martens, "A fast and reliable rate of speech detector," in Proc. Int. Conf. Spoken Language Processing (ICSLP), vol. 4, 1996, pp. 612-615.
- (1996) Proc. Int. Conf. Spoken Language Processing (ICSLP) , vol.4 , pp. 612-615
- Verhasselt, J.P.¹ Martens, J.P.²

29
- 0033692966
- On-line speaking rate estimation using Gaussian mixture models
- R. Faltlhauser, T. Pfau, and G. Ruske, "On-line speaking rate estimation using Gaussian mixture models," in Proc. IEEE Int. Conf. Acoust., Speech and Signal Processing (ICASSP), vol. III, 2000, pp. 1355-1358.
- (2000) Proc. IEEE Int. Conf. Acoust., Speech and Signal Processing (ICASSP) , vol.3 , pp. 1355-1358
- Faltlhauser, R.¹ Pfau, T.² Ruske, G.³

30
- 84892173311
- Estimating the speaking rate by vowel detection
- T. Pfau and G. Ruske, "Estimating the speaking rate by vowel detection," in Proc. IEEE Int. Conf. Acoustics, Speech and Signal Processing (ICASSP), vol. 2, 1998, pp. 945-948.
- (1998) Proc. IEEE Int. Conf. Acoustics, Speech and Signal Processing (ICASSP) , vol.2 , pp. 945-948
- Pfau, T.¹ Ruske, G.²

31
- 84892163293
- Combining multiple estimators of speaking rate
- N. Morgan and E. Fosler-Lussier, "Combining multiple estimators of speaking rate," in Proc. IEEE Int. Conf. Acoustics, Speech and Signal Processing (ICASSP), vol. 2, 1998, pp. 729-732.
- (1998) Proc. IEEE Int. Conf. Acoustics, Speech and Signal Processing (ICASSP) , vol.2 , pp. 729-732
- Morgan, N.¹ Fosler-Lussier, E.²

32
- 0033709101
- Detection of prosodic word boundaries by statistical modeling of mora transitions of fundamental frequency contours and its use for continuous speech recognition
- K. Hirose and K. Iwano, "Detection of prosodic word boundaries by statistical modeling of mora transitions of fundamental frequency contours and its use for continuous speech recognition," in Proc. IEEE Int. Conf. Acoustics, Speech and Signal Processing (ICASSP), vol. 3, 2000, pp. 1763-1766.
- (2000) Proc. IEEE Int. Conf. Acoustics, Speech and Signal Processing (ICASSP) , vol.3 , pp. 1763-1766
- Hirose, K.¹ Iwano, K.²

33
- 3042773553
- Syllable recognition using syllable-segment statistics and syllable-based HMM
- N. Takahashi and S. Nakagawa, "Syllable recognition using syllable-segment statistics and syllable-based HMM," in Proc. Int. Conf. Spoken Language Processing (ICSLP), 2002, pp. 2633-2636.
- (2002) Proc. Int. Conf. Spoken Language Processing (ICSLP) , pp. 2633-2636
- Takahashi, N.¹ Nakagawa, S.²

34
- 3042730377
- Keyword and phrase spotting with heuristic language model
- T. Kawahara, T. Munetsugu, N. Kitaoka, and S. Doshita, "Keyword and phrase spotting with heuristic language model," in Proc. Int. Conf. Spoken Language Processing (ICSLP), vol. 2, 1994, pp. 815-818.
- (1994) Proc. Int. Conf. Spoken Language Processing (ICSLP) , vol.2 , pp. 815-818
- Kawahara, T.¹ Munetsugu, T.² Kitaoka, N.³ Doshita, S.⁴

35
- 85009148152
- Dealing with out-of-vocabulary words and speech disfluencies in an n-gram based speech understanding system
- A. Kai, Y. Hirose, and S. Nakagawa, "Dealing with out-of-vocabulary words and speech disfluencies in an n-gram based speech understanding system," in Proc. Int. Conf. Spoken Language Processing (ICSLP), vol. 6, 1998, pp. 2427-2430.
- (1998) Proc. Int. Conf. Spoken Language Processing (ICSLP) , vol.6 , pp. 2427-2430
- Kai, A.¹ Hirose, Y.² Nakagawa, S.³

36
- 85009070544
- Speaking rate dependent acoustic modeling for spontaneous lecture speech recognition
- H. Nanjo, K. Kato, and T. Kawahara, "Speaking rate dependent acoustic modeling for spontaneous lecture speech recognition," in Proc. European Conf. Speech Communication and Technology (EUROSPEECH), 2001, pp. 2531-2534.
- (2001) Proc. European Conf. Speech Communication and Technology (EUROSPEECH) , pp. 2531-2534
- Nanjo, H.¹ Kato, K.² Kawahara, T.³

37
- 0029288633
- Maximum likelihood linear regression for speaker adaptation of continuous density hidden Markov models
- C. J. Leggetter and P. C. Woodland, "Maximum likelihood linear regression for speaker adaptation of continuous density hidden Markov models," Comput. Speech and Lang., vol. 9, no. 2, pp. 171-185, 1995.
- (1995) Comput. Speech and Lang. , vol.9 , Issue.2 , pp. 171-185
- Leggetter, C.J.¹ Woodland, P.C.²

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.