SCOPUS Information Retrieval Platform

Volume 46, Issue 2, 2005, Pages 171-188

Implicit modelling of pronunciation variation in automatic speech recognition

a UNIVERSITY OF CAMBRIDGE (United Kingdom)

Author keywords

Acoustic modelling; Automatic speech recognition; Conversational speech recognition; Hidden Markov models; Parameter tying; Phonetic decision trees; Pronunciation dictionaries; Pronunciation modelling; Single pronunciations; State clustering

Indexed keywords

DECISION THEORY; MARKOV PROCESSES; MATHEMATICAL MODELS; OPTIMIZATION; SPEECH ANALYSIS; TREES (MATHEMATICS);

ACOUSTIC MODELLING; AUTOMATIC SPEECH RECOGNITION (ASR); CONVERSATIONAL SPEECH RECOGNITION; HIDDEN MARKOV MODELS; PARAMETER TYING; PHONETIC DECISION TREES; PRONUNCIATION DICTIONARIES; PRONUNCIATION MODELLING; SINGLE PRONUNCIATIONS; STATE CLUSTERING;

SPEECH RECOGNITION;

EID: 19944415893 PISSN: 01676393 EISSN: None Source Type: Journal
DOI: 10.1016/j.specom.2005.03.008 Document Type: Conference Paper

Times cited : (33)

References (34)

1
- 0026400221
- A new class of fenonic Markov models for large vocabulary continuous speech recognition
- Bahl, L.R., Bellegarda, J.R., de Souza, P.V., Gopalakrishnan, P.S., Nahamoo, D., Picheny, M.A., 1991. A new class of fenonic Markov models for large vocabulary continuous speech recognition. In: Proceedings of ICASSP'91, Vol. 1. pp. 177-200.
- (1991) Proceedings of ICASSP'91 , vol.1 , pp. 177-200
- Bahl, L.R.¹ Bellegarda, J.R.² De Souza, P.V.³ Gopalakrishnan, P.S.⁴ Nahamoo, D.⁵ Picheny, M.A.⁶

2
- 33646779368
- Modelling pronunciation variation in conversational speech using prosody
- Bates, R., Ostendorf, M., 2002. Modelling pronunciation variation in conversational speech using prosody. In: Proceedings of ITRW Pronunciation Modelling and Lexicon Adaptation Workshop. pp. 42-47.
- (2002) Proceedings of ITRW Pronunciation Modelling and Lexicon Adaptation Workshop , pp. 42-47
- Bates, R.¹ Ostendorf, M.²

3
- 0025629882
- Tied mixture continuous parameter modeling for speech recognition
- Bellegarda, J.R., Nahamoo, D., 1990. Tied mixture continuous parameter modeling for speech recognition. IEEE Trans. ASSP 38 (12), 2033-2045.
- (1990) IEEE Trans. ASSP , vol.38 , Issue.12 , pp. 2033-2045
- Bellegarda, J.R.¹ Nahamoo, D.²

4
- 0031624621
- Pronunciation modelling using a hand-labelled corpus for conversational speech recognition
- Byrne, W., Finke, M., Khudanpur, S., McDonough, J., Nock, H.J., Riley, M., Saraçlar, M., Wooters, C., Zavaliagkos, G., 1998. Pronunciation modelling using a hand-labelled corpus for conversational speech recognition. In: Proceedings of ICASSP'98, Vol. 1. pp. 313-316.
- (1998) Proceedings of ICASSP'98 , vol.1 , pp. 313-316
- Byrne, W.¹ Finke, M.² Khudanpur, S.³ McDonough, J.⁴ Nock, H.J.⁵ Riley, M.⁶ Saraçlar, M.⁷ Wooters, C.⁸ Zavaliagkos, G.⁹

5
- 0343367219
- Automatic rule-based generation of word pronunciation networks
- Cremelie, N., Martens, J.-P., 1997. Automatic rule-based generation of word pronunciation networks. In: Proceedings of EUROSPEECH'97. pp. 2459-2462.
- (1997) Proceedings of EUROSPEECH'97 , pp. 2459-2462
- Cremelie, N.¹ Martens, J.-P.²

6
- 85027454087
- Speaking mode dependent pronunciation modelling in large vocabulary continuous speech recognition
- Rhodes
- Finke, M., Waibel, A., 1997. Speaking mode dependent pronunciation modelling in large vocabulary continuous speech recognition. In: Proceedings of EUROSPEECH'97, Vol. 5. Rhodes, pp. 2379-2382.
- (1997) Proceedings of EUROSPEECH'97 , vol.5 , pp. 2379-2382
- Finke, M.¹ Waibel, A.²

7
- 0043272135
- Automatic learning of word pronunciation from data
- Fosler, E., Weintraub, M., Wegmann, S., Kao, Y.-H., Khudanpur, S., Galles, C., Saraçlar, M., 1996. Automatic learning of word pronunciation from data. In: Proceedings of ICSLP'96.
- (1996) Proceedings of ICSLP'96
- Fosler, E.¹ Weintraub, M.² Wegmann, S.³ Kao, Y.-H.⁴ Khudanpur, S.⁵ Galles, C.⁶ Saraçlar, M.⁷

8
- 0003245997
- The LIMSI Nov93 WSJ system
- March Plainsboro, NJ
- Gauvain, J.-L., Lamel, L.F., Adda, G., Adda-Decker, M., March 1994. The LIMSI Nov93 WSJ system. In: Proceedings of 1994 ARPA Spoken Language Technology Workshop. Plainsboro, NJ, pp. 125-128.
- (1994) Proceedings of 1994 ARPA Spoken Language Technology Workshop , pp. 125-128
- Gauvain, J.-L.¹ Lamel, L.F.² Adda, G.³ Adda-Decker, M.⁴

9
- 0342931765
- The switchboard transcription project
- Center for Language and Speech Processing, Johns Hopkins University
- Greenberg, S., 1996. The Switchboard transcription project. 1996 LVCSR summer workshop technical reports, Center for Language and Speech Processing, Johns Hopkins University. Available from: < http://www.icsi.berkeley.edu/ real/stp>.
- (1996) 1996 LVCSR Summer Workshop Technical Reports
- Greenberg, S.¹

10
- 0012588925
- Speaking in shorthand - a syllable-centric perspective for understanding pronunciation variation
- Kerkrade, Netherlands
- Greenberg, S., 1998. Speaking in shorthand - a syllable-centric perspective for understanding pronunciation variation. In: Proceedings of ESCA Workshop on modelling pronunciation variation for automatic speech recognition. Kerkrade, Netherlands, pp. 47-56.
- (1998) Proceedings of ESCA Workshop on Modelling Pronunciation Variation for Automatic Speech Recognition , pp. 47-56
- Greenberg, S.¹

11
- 19944397218
- Ph.D. thesis, Cambridge University
- Hain, T., 2001. Hidden model sequence models for automatic speech recognition. Ph.D. thesis, Cambridge University.
- (2001) Hidden Model Sequence Models for Automatic Speech Recognition
- Hain, T.¹

12
- 85135269907
- Dynamic HMM selection for continuous speech recognition
- September
- Hain, T., Woodland, P.C., September 1999. Dynamic HMM selection for continuous speech recognition. In: Proceedings of EUROSPEECH'99, Vol. 3. pp. 1327-1330.
- (1999) Proceedings of EUROSPEECH'99 , vol.3 , pp. 1327-1330
- Hain, T.¹ Woodland, P.C.²

13
- 0034847002
- The 1998 HTK system for transcription of conversational telephone speech
- April
- Hain, T., Woodland, P.C., Niesler, T.R., Whittaker, E.W.D., April 1999. The 1998 HTK system for transcription of conversational telephone speech. In: Proceedings of ICASSP'99. pp. 57-60.
- (1999) Proceedings of ICASSP'99 , pp. 57-60
- Hain, T.¹ Woodland, P.C.² Niesler, T.R.³ Whittaker, E.W.D.⁴

14
- 0012236195
- The CU HTK March 2000 Hub5 transcription system
- May College Park, Maryland
- Hain, T., Woodland, P.C., Evermann, G., Povey, D., May 2000. The CU HTK March 2000 Hub5 transcription system. In: Proceedings of 2000 NIST Speech Transcription Workshop. College Park, Maryland.
- (2000) Proceedings of 2000 NIST Speech Transcription Workshop
- Hain, T.¹ Woodland, P.C.² Evermann, G.³ Povey, D.⁴

15
- 0034847002
- New features in the CU-HTK system for transcription of conversational telephone speech
- Hain, T., Woodland, P.C., Evermann, G., Povey, D., 2001. New features in the CU-HTK system for transcription of conversational telephone speech. In: Proceedings of ICASSP'01. pp. 57-60.
- (2001) Proceedings of ICASSP'01 , pp. 57-60
- Hain, T.¹ Woodland, P.C.² Evermann, G.³ Povey, D.⁴

16
- 0000250399
- Semi-continuous hidden Markov models for speech signals
- Huang, X.D., Jack, M.A., 1989. Semi-continuous hidden Markov models for speech signals. Computer Speech and Language 3, 239-251.
- (1989) Computer Speech and Language , vol.3 , pp. 239-251
- Huang, X.D.¹ Jack, M.A.²

17
- 0009589469
- October Ph.D. thesis, Cambridge University
- Humphries, J.J., October 1997. Accent modelling and adaptation in automatic speech recognition. Ph.D. thesis, Cambridge University.
- (1997) Accent Modelling and Adaptation in Automatic Speech Recognition
- Humphries, J.J.¹

18
- 0027153655
- Predicting unseen triphones with senones
- Hwang, M.-Y., Huang, X., Alleva, F., 1993. Predicting unseen triphones with senones. In: Proceedings of ICASSP'93, Vol. 2. pp. 311-314.
- (1993) Proceedings of ICASSP'93 , vol.2 , pp. 311-314
- Hwang, M.-Y.¹ Huang, X.² Alleva, F.³

19
- 0002237531
- Probabilistic classification of HMM states for large vocabulary continuous speech recognition
- April
- Luo, X., Jelinek, F., April 1999. Probabilistic classification of HMM states for large vocabulary continuous speech recognition. In: Proceedings of ICASSP'99. pp. 2044-2047.
- (1999) Proceedings of ICASSP'99 , pp. 2044-2047
- Luo, X.¹ Jelinek, F.²

20
- 19944392349
- BBN pronunciation modelling
- MITAGS, Linthicum Heights, Maryland
- Ma, K., Zavaliagkos, G., Iyer, R., 1998. BBN pronunciation modelling. Presented at the 9th Conversational Speech Recognition Workshop, MITAGS, Linthicum Heights, Maryland.
- (1998) 9th Conversational Speech Recognition Workshop
- Ma, K.¹ Zavaliagkos, G.² Iyer, R.³

21
- 0342497541
- Detecting and correcting poor pronunciations for multiword units
- Kerkrade, Netherlands
- Nock, H.J., Young, S.J., 1998. Detecting and correcting poor pronunciations for multiword units. In: Proceedings of ESCA Workshop on modelling pronunciation variation for automatic speech recognition. Kerkrade, Netherlands, pp. 85-90.
- (1998) Proceedings of ESCA Workshop on Modelling Pronunciation Variation for Automatic Speech Recognition , pp. 85-90
- Nock, H.J.¹ Young, S.J.²

22
- 4544293504
- Moving beyond the "beads-on-a-string" model of speech
- Ostendorf, M., 1999. Moving beyond the "beads-on-a-string" model of speech. In: Proceedings of 1999 IEEE Workshop on Automatic Speech Recognition and Understanding, Vol. 1. pp. 79-83.
- (1999) Proceedings of 1999 IEEE Workshop on Automatic Speech Recognition and Understanding , vol.1 , pp. 79-83
- Ostendorf, M.¹

23
- 0012316245
- 1994 benchmark tests for the ARPA spoken language program
- Pallett, D., Fiscus, J.G., Fisher, W., Garofolo, J.S., Lund, B.A., Martin, A., 1995. 1994 benchmark tests for the ARPA spoken language program. In: Proceedings of ARPA Workshop on Spoken Language Systems Technology. pp. 3-5.
- (1995) Proceedings of ARPA Workshop on Spoken Language Systems Technology , pp. 3-5
- Pallett, D.¹ Fiscus, J.G.² Fisher, W.³ Garofolo, J.S.⁴ Lund, B.A.⁵ Martin, A.⁶

24
- 0009953947
- Theory and practice of acoustic confusability
- Printz, H., Olsen, P., 2000. Theory and practice of acoustic confusability. In: Proceedings of ISCA ITRW ASR2000 Workshop.
- (2000) Proceedings of ISCA ITRW ASR2000 Workshop
- Printz, H.¹ Olsen, P.²

25
- 0033353288
- Stochastic pronunciation modelling from handlabelled phonetic corpora
- Riley, M., Byrne, W., Finke, M., Khudanpur, S., Ljolje, A., McDonough, J., Nock, H.J., Saraçlar, M., Wooters, C., Zavaliagkos, G., 1999. Stochastic pronunciation modelling from handlabelled phonetic corpora. Speech Communication 29, 209-224.
- (1999) Speech Communication , vol.29 , pp. 209-224
- Riley, M.¹ Byrne, W.² Finke, M.³ Khudanpur, S.⁴ Ljolje, A.⁵ McDonough, J.⁶ Nock, H.J.⁷ Saraçlar, M.⁸ Wooters, C.⁹ Zavaliagkos, G.¹⁰

26
- 0000114416
- Pronunciation modelling by sharing Gaussian densities across phonetic models
- Saraçlar, M., Nock, H.J., Khudanpur, S., 2000. Pronunciation modelling by sharing Gaussian densities across phonetic models. Computer Speech and Language 14, 137-160.
- (2000) Computer Speech and Language , vol.14 , pp. 137-160
- Saraçlar, M.¹ Nock, H.J.² Khudanpur, S.³

27
- 0037519295
- The SRI March 2000 Hub-5 conversational speech transcription system
- May College Park, Maryland
- Stolcke, A., Bratt, H., Butzberger, J., Franco, H., Gadde, V.R.R., Plauche, M., Richey, C., Shriberg, E., Sonmez, K., Weng, F.-L., Zhen, J., May 2000. The SRI March 2000 Hub-5 conversational speech transcription system. In: Proceedings of 2000 NIST Speech Transcription Workshop. College Park, Maryland. Available from: <http://www.nist.gov/speech/publications/tw00/pdf/cts80.pdf>.
- (2000) Proceedings of 2000 NIST Speech Transcription Workshop
- Stolcke, A.¹ Bratt, H.² Butzberger, J.³ Franco, H.⁴ Gadde, V.R.R.⁵ Plauche, M.⁶ Richey, C.⁷ Shriberg, E.⁸ Sonmez, K.⁹ Weng, F.-L.¹⁰ Zhen, J.¹¹

28
- 0033335618
- Modelling pronunciation variation for ASR: A survey of the literature
- Strik, H., Cucchiarini, C., 1999. Modelling pronunciation variation for ASR: a survey of the literature. Speech Communication 29, 225-246.
- (1999) Speech Communication , vol.29 , pp. 225-246
- Strik, H.¹ Cucchiarini, C.²

29
- 0043086491
- Effect of speaking style on LVCSR performance
- Weintraub, M., Taussig, K., Hunicke-Smith, K., Snodgrass, A., 1996. Effect of speaking style on LVCSR performance. In: Proceedings of ICSLP'96. pp. S16-S19.
- (1996) Proceedings of ICSLP'96
- Weintraub, M.¹ Taussig, K.² Hunicke-Smith, K.³ Snodgrass, A.⁴

30
- 0001393274
- The development of the 1994 HTK large vocabulary speech recognition system
- Woodland, P.C., Odell, J.J., Valtchev, V., Young, S.J., 1995. The development of the 1994 HTK large vocabulary speech recognition system. In: Proceedings of ARPA Workshop on Spoken Language Systems Technology. pp. 104-105.
- (1995) Proceedings of ARPA Workshop on Spoken Language Systems Technology , pp. 104-105
- Woodland, P.C.¹ Odell, J.J.² Valtchev, V.³ Young, S.J.⁴

31
- 19944398853
- Vienna, VA
- Woodland, P.C., Evermann, G., Gales, M.J., Hain, T., Liu, A., Moore, G., Povey, D., Wang, L., 2002. CU-HTK APRIL 2002 Switchboard System, Rich Transcription Workshop, Vienna, VA.
- (2002) CU-HTK APRIL 2002 Switchboard System, Rich Transcription Workshop
- Woodland, P.C.¹ Evermann, G.² Gales, M.J.³ Hain, T.⁴ Liu, A.⁵ Moore, G.⁶ Povey, D.⁷ Wang, L.⁸

32
- 0343367122
- Multiple-pronunciation lexical modeling in a speaker-independent speech understanding system
- Wooters, C., Stolcke, A., 1994. Multiple-pronunciation lexical modeling in a speaker-independent speech understanding system. In: Proceedings of ICSLP'94, Vol. 3. pp. 1363-1367.
- (1994) Proceedings of ICSLP'94 , vol.3 , pp. 1363-1367
- Wooters, C.¹ Stolcke, A.²

33
- 0028530231
- State clustering in hidden Markov model-based continuous speech recognition
- Young, S.J., Woodland, P.C., 1994. State clustering in hidden Markov model-based continuous speech recognition. Computer Speech and Language 8, 369-383.
- (1994) Computer Speech and Language , vol.8 , pp. 369-383
- Young, S.J.¹ Woodland, P.C.²

34
- 0002144369
- Tree-based state tying for high accuracy acoustic modelling
- Morgan Kaufmann
- Young, S.J., Odell, J.J., Woodland, P.C., 1994. Tree-based state tying for high accuracy acoustic modelling. In: Proceedings of 1994 ARPA Human Language Technology Workshop. Morgan Kaufmann, pp. 307-312.
- (1994) Proceedings of 1994 ARPA Human Language Technology Workshop
- Young, S.J.¹ Odell, J.J.² Woodland, P.C.³

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.