SCOPUS Information Retrieval Platform

Computer Speech and Language

Volume 14, Issue 2, 2000, Pages 137-160

Pronunciation modeling by sharing Gaussian densities across phonetic models

(3) Saraçlar, Murat a; Nock, Harriet b; Khudanpur, Sanjeev a

a JOHNS HOPKINS UNIVERSITY (United States)

b UNIVERSITY OF CAMBRIDGE (United Kingdom)

Author keywords

[No Author keywords available]

Indexed keywords

DECISION TREES; HIDDEN MARKOV MODELS; LINGUISTICS; TRELLIS CODES;

ACOUSTIC MODEL TRAININGS; AUTOMATIC SPEECH RECOGNITION; CONVERSATIONAL SPEECH; PHONETIC TRANSCRIPTIONS; PRONUNCIATION MODELING; PRONUNCIATION VARIATION; RECOGNITION ACCURACY; SPONTANEOUS SPEECH;

SPEECH RECOGNITION;

EID: 0000114416 PISSN: 08852308 EISSN: None Source Type: Journal
DOI: 10.1006/csla.2000.0140 Document Type: Article

Times cited : (78)

References (38)

1
- 0025629882
- Tied mixture continuous parameter modeling for speech recognition
- Bellegarda, J. R. & Nahamoo, D. (1990). Tied mixture continuous parameter modeling for speech recognition. IEEE Transactions on Acoustics, Speech and Signal Processing, 38, 2033-2045.
- (1990) IEEE Transactions on Acoustics, Speech and Signal Processing , vol.38 , pp. 2033-2045
- Bellegarda, J.R.¹ Nahamoo, D.²

2
- 0343367210
- Phonological studies for speech recognition
- Bernstein, J., Baldwin, G., Cohen, M., Murveit, H. & Weintraub, M. (1986). Phonological studies for speech recognition. In DARPA Speech Recognition Workshop, pp. 41-48.
- (1986) DARPA Speech Recognition Workshop , pp. 41-48
- Bernstein, J.¹ Baldwin, G.² Cohen, M.³ Murveit, H.⁴ Weintraub, M.⁵

3
- 0030637976
- Pronunciation modelling for conversational speech recognition: A status report from WS97
- Santa Barbara, CA, USA
- Byrne, W., Finke, M., Khudanpur, S., McDonough, J., Nock, H., Riley, M., Saraclar, M., Wooters, C. & Zavaliagkos, G. (1997). Pronunciation modelling for conversational speech recognition: A status report from WS97. In IEEE Workshop on Automatic Speech Recognition and Understanding Proceedings (ASRU), Santa Barbara, CA, USA, pp. 26-33.
- (1997) IEEE Workshop on Automatic Speech Recognition and Understanding Proceedings (ASRU) , pp. 26-33
- Byrne, W.¹ Finke, M.² Khudanpur, S.³ McDonough, J.⁴ Nock, H.⁵ Riley, M.⁶ Saraclar, M.⁷ Wooters, C.⁸ Zavaliagkos, G.⁹

4
- 0031624621
- Pronunciation modelling using a hand-labelled corpus for conversational speech recognition
- Seattle, USA
- Byrne, W., Finke, M., Khudanpur, S., McDonough, J., Nock, H., Riley, M., Saraclar, M., Wooters, C. & Zavaliagkos, G. (1998). Pronunciation modelling using a hand-labelled corpus for conversational speech recognition. In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Seattle, USA, pp. 313-316.
- (1998) Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) , pp. 313-316
- Byrne, W.¹ Finke, M.² Khudanpur, S.³ McDonough, J.⁴ Nock, H.⁵ Riley, M.⁶ Saraclar, M.⁷ Wooters, C.⁸ Zavaliagkos, G.⁹

5
- 0025692329
- Identification of contextual factors for pronunciation networks
- Chen, F. (1990). Identification of contextual factors for pronunciation networks. Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 753-756.
- (1990) Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) , pp. 753-756
- Chen, F.¹

6
- 0003721728
- Ph.D. Thesis. University of California, Berkeley
- Cohen, M. (1989). Phonological structures for speech recognition. Ph.D. Thesis. University of California, Berkeley.
- (1989) Phonological Structures for Speech Recognition
- Cohen, M.¹

7
- 0029375590
- Speaker adaptation using constrained estimation of Gaussian mixtures
- Digalakis, V. V., Rtischev, D. & Neumeyer, L. G. (1995). Speaker adaptation using constrained estimation of Gaussian mixtures. IEEE Transactions on Speech and Audio Processing, 3, 357-366.
- (1995) IEEE Transactions on Speech and Audio Processing , vol.3 , pp. 357-366
- Digalakis, V.V.¹ Rtischev, D.² Neumeyer, L.G.³

8
- 85135281657
- Automatic modeling of pronunciation variations
- Budapest, Hungary
- Eide, E. (1999). Automatic modeling of pronunciation variations. Proceedings of the European Conference on Speech Communication and Technology (Eurospeech), Budapest, Hungary, pp. 451-454.
- (1999) Proceedings of the European Conference on Speech Communication and Technology (Eurospeech) , pp. 451-454
- Eide, E.¹

9
- 0028996886
- Understanding and improving speech recognition performance through the use of diagnostic tools
- Detroit, MI
- Eide, E., Gish, H., Jeanrenaud, P. & Mielke, A. (1995). Understanding and improving speech recognition performance through the use of diagnostic tools. Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Detroit, MI, pp. 221-224.
- (1995) Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) , pp. 221-224
- Eide, E.¹ Gish, H.² Jeanrenaud, P.³ Mielke, A.⁴

10
- 80053229524
- Modeling and efficient decoding of large vocabulary conversational speech
- Budapest, Hungary
- Finke, M., Fritsch, J., Koll, D. & Waibel, A. (1999). Modeling and efficient decoding of large vocabulary conversational speech. Proceedings of the European Conference on Speech Communication and Technology (Eurospeech), Budapest, Hungary, pp. 467-470.
- (1999) Proceedings of the European Conference on Speech Communication and Technology (Eurospeech) , pp. 467-470
- Finke, M.¹ Fritsch, J.² Koll, D.³ Waibel, A.⁴

11
- 85027454087
- Speaking mode dependent pronunciation modeling in large vocabulary conversational speech recognition
- Finke, M. & Waibel, A. (1997). Speaking mode dependent pronunciation modeling in large vocabulary conversational speech recognition. Proceedings of the European Conference on Speech Communication and Technology (Eurospeech), pp. 2379-2382.
- (1997) Proceedings of the European Conference on Speech Communication and Technology (Eurospeech) , pp. 2379-2382
- Finke, M.¹ Waibel, A.²

12
- 0043272135
- Automatic learning of word pronunciation from data
- addendum
- Fosler, E., Weintraub, M., Wegmann, S., Kao, Y-H., Khudanpur, S., Galles, C. & Saraclar, M. (1996). Automatic learning of word pronunciation from data. Proceedings of the International Conference on Spoken Language Processing (ICSLP), pp. S28-S29 (addendum).
- (1996) Proceedings of the International Conference on Spoken Language Processing (ICSLP)
- Fosler, E.¹ Weintraub, M.² Wegmann, S.³ Kao, Y.-H.⁴ Khudanpur, S.⁵ Galles, C.⁶ Saraclar, M.⁷

13
- 0347605862
- Multi-level decision trees for static and dynamic pronunciation models
- Budapest, Hungary
- Fosler-Lussier, E. (1999). Multi-level decision trees for static and dynamic pronunciation models. Proceedings of the European Conference on Speech Communication and Technology (Eurospeech), Budapest, Hungary, pp. 463-466.
- (1999) Proceedings of the European Conference on Speech Communication and Technology (Eurospeech) , pp. 463-466
- Fosler-Lussier, E.¹

14
- 0025642104
- Word juncture modeling using phonological rules for HMM-based continuous speech recognition
- Giachin, E., Rosenberg, A. & Lee, C. (1990). Word juncture modeling using phonological rules for HMM-based continuous speech recognition. Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 737-740.
- (1990) Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) , pp. 737-740
- Giachin, E.¹ Rosenberg, A.² Lee, C.³

15
- 85016587886
- SWITCHBOARD: Telephone speech corpus for research and development
- Godfrey, J., Holliman, E. & McDaniel, J. (1992). SWITCHBOARD: Telephone speech corpus for research and development. Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 517-520, Available at http://www.ldc.upenn.edu/.
- (1992) Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) , pp. 517-520
- Godfrey, J.¹ Holliman, E.² McDaniel, J.³

16
- 85031554304
- Technical Report, LVCSR Summer Workshop
- Greenberg, S. The switchboard transcription project. Technical Report, 1996 LVCSR Summer Workshop, http://www.icsi.berkeley.edu/real/stp/.
- (1996) The Switchboard Transcription Project
- Greenberg, S.¹

17
- 0348235816
- Recent experiments with the CU-HTK Hub5 system
- Hain, T. & Woodland, P. C. (1999a). Recent experiments with the CU-HTK Hub5 system. 10th Hub-5 Conversational Speech Understanding Workshop.
- (1999) 10th Hub-5 Conversational Speech Understanding Workshop
- Hain, T.¹ Woodland, P.C.²

18
- 85135269907
- Dynamic HMM selection for continuous speech recognition
- Budapest, Hungary
- Hain, T. & Woodland, P. C. (1999b). Dynamic HMM selection for continuous speech recognition. Proceedings of the European Conference on Speech Communication and Technology (Eurospeech), Budapest, Hungary, pp. 1327-1330.
- (1999) Proceedings of the European Conference on Speech Communication and Technology (Eurospeech) , pp. 1327-1330
- Hain, T.¹ Woodland, P.C.²

19
- 0000250399
- Semicontinuous hidden Markov models for speech signals
- Huang, X. D. & Jack, M. A. (1989). Semicontinuous hidden Markov models for speech signals. Computer Speech and Language, 3, 239-251.
- (1989) Computer Speech and Language , vol.3 , pp. 239-251
- Huang, X.D.¹ Jack, M.A.²

20
- 0029747183
- Speaker normalization using efficient frequency warping procedures
- Atlanta, GA
- Lee, L. & Rose, R. (1996). Speaker normalization using efficient frequency warping procedures. Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Atlanta, GA, pp. 353-356.
- (1996) Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) , pp. 353-356
- Lee, L.¹ Rose, R.²

21
- 0029288633
- Speaker adaptation of continuous density HMMs using multivariate linear regression
- Leggetter, C. J. & Woodland, P. C. (1995). Speaker adaptation of continuous density HMMs using multivariate linear regression. Computer Speech and Language, 9, 171-185.
- (1995) Computer Speech and Language , vol.9 , pp. 171-185
- Leggetter, C.J.¹ Woodland, P.C.²

22
- 0021191078
- An information theoretic approach to the automatic determination of phonemic baseforms
- San Diego, CA
- Lucassen, J. M. & Mercer, R. L. (1984). An information theoretic approach to the automatic determination of phonemic baseforms. Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), San Diego, CA, pp. 42.5.1-42.5.4.
- (1984) Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) , pp. 42.5.1-42.5.4
- Lucassen, J.M.¹ Mercer, R.L.²

23
- 85009080147
- PhD Thesis. The Johns Hopkins University, Baltimore, MD
- Luo, X. (1999). Balancing Model Resolution and Generalizability in Large Vocabulary Continuous Speech Recognition. PhD Thesis. The Johns Hopkins University, Baltimore, MD.
- (1999) Balancing Model Resolution and Generalizability in Large Vocabulary Continuous Speech Recognition
- Luo, X.¹

24
- 84943154470
- Fabricating conversational speech data with acoustic models: A program to examine model-data mismatch
- Sydney, Australia
- McAllaster, D., Gillick, L., Scattone, F. & Newman, M. (1998). Fabricating conversational speech data with acoustic models: A program to examine model-data mismatch. Proceedings of the International Conference on Spoken Language Processing (ICSLP), Sydney, Australia, pp. 1847-1850.
- (1998) Proceedings of the International Conference on Spoken Language Processing (ICSLP) , pp. 1847-1850
- McAllaster, D.¹ Gillick, L.² Scattone, F.³ Newman, M.⁴

25
- 0012306376
- The design principles of a weighted finite state transducer library
- Mohri, M., Pereira, F. C. N. & Riley, M. (2000). The design principles of a weighted finite state transducer library. Theoretical Computer Science, 231, 17-32, Available from http://www.research.att.com/sw/tools/fsm/.
- (2000) Theoretical Computer Science , vol.231 , pp. 17-32
- Mohri, M.¹ Pereira, F.C.N.² Riley, M.³

26
- 0003805597
- PhD Thesis. Cambridge University Engineering Department
- Odell, J. (1995). The Use of Context in Large Vocabulary Speech Recognition. PhD Thesis. Cambridge University Engineering Department.
- (1995) The use of Context in Large Vocabulary Speech Recognition
- Odell, J.¹

27
- 0032639915
- Improvements in recognition of conversational telephone speech
- Peskin, B., Newman, M., McAllaster, D., Nagesha, V., Richards, H., Wegmann, S., Hunt, M. & Gillick, L. (1999). Improvements in recognition of conversational telephone speech. Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 53-56.
- (1999) Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) , pp. 53-56
- Peskin, B.¹ Newman, M.² McAllaster, D.³ Nagesha, V.⁴ Richards, H.⁵ Wegmann, S.⁶ Hunt, M.⁷ Gillick, L.⁸

28
- 85031538332
- PronLex, COMLEX English Pronunciation, Available from http://www.ldc.upenn.edu/.
- COMLEX English Pronunciation

29
- 0026405248
- A statistical model for generating pronunciation networks
- Riley, M. (1991). A statistical model for generating pronunciation networks. Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 737-740.
- (1991) Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) , pp. 737-740
- Riley, M.¹

30
- 0033353288
- Stochastic pronunciation modelling from hand-labelled phonetic corpora
- Riley, M., Byrne, W., Finke, M., Khudanpur, S., Ljolje, A., McDonough, J., Nock, H., Saraclar, M., Wooters, C. & Zavaliagkos, G. (1999). Stochastic pronunciation modelling from hand-labelled phonetic corpora. Speech Communication, 29, 209-224.
- (1999) Speech Communication , vol.29 , pp. 209-224
- Riley, M.¹ Byrne, W.² Finke, M.³ Khudanpur, S.⁴ Ljolje, A.⁵ McDonough, J.⁶ Nock, H.⁷ Saraclar, M.⁸ Wooters, C.⁹ Zavaliagkos, G.¹⁰

31
- 0003921935
- Automatic generation of detailed pronunciation lexicons
- chapter 12, Kluwer Academic Press
- Riley, M. & Ljolje, A. (1995). Automatic generation of detailed pronunciation lexicons. Automatic Speech and Speaker Recognition : Advanced Topics, chapter 12, pp. 285-302. Kluwer Academic Press.
- (1995) Automatic Speech and Speaker Recognition : Advanced Topics , pp. 285-302
- Riley, M.¹ Ljolje, A.²

32
- 0030363039
- Dictionary learning for spontaneous speech recognition
- Philadelphia, USA
- Sloboda, T. & Waibel, A. (1996). Dictionary learning for spontaneous speech recognition. Proceedings of the International Conference on Spoken Language Processing (ICSLP), Philadelphia, USA, pp. 2328-2331.
- (1996) Proceedings of the International Conference on Spoken Language Processing (ICSLP) , pp. 2328-2331
- Sloboda, T.¹ Waibel, A.²

33
- 0033335618
- Modeling pronunciation variation for ASR: A survey of the literature
- Strik, H. & Cucchiarini, C. (1999). Modeling pronunciation variation for ASR: A survey of the literature. Speech Communication, 29, 225-246.
- (1999) Speech Communication , vol.29 , pp. 225-246
- Strik, H.¹ Cucchiarini, C.²

34
- 85135194422
- Building multiple pronunciation models for novel words using exploratory computational phonology
- Madrid, Spain
- Tajchman, G., Fosler, E. & Jurafsky, D. (1995). Building multiple pronunciation models for novel words using exploratory computational phonology. Proceedings of the European Conference on Speech Communication and Technology (Eurospeech), Madrid, Spain, pp. 2247-2250.
- (1995) Proceedings of the European Conference on Speech Communication and Technology (Eurospeech) , pp. 2247-2250
- Tajchman, G.¹ Fosler, E.² Jurafsky, D.³

35
- 0033106613
- Multiple pronunciation dictionary using HMM-state confusion characteristics
- Wakita, Y., Singer, H. & Sagisaka, Y. (1999). Multiple pronunciation dictionary using HMM-state confusion characteristics. Computer Speech and Language, 13, 143-153.
- (1999) Computer Speech and Language , vol.13 , pp. 143-153
- Wakita, Y.¹ Singer, H.² Sagisaka, Y.³

36
- 0043086491
- Effect of speaking style on LVCSR performance
- addendum
- Weintraub, M., Taussig, K., Hunicke-Smith, K. & Snodgrass, A. (1996). Effect of speaking style on LVCSR performance. Proceedings of the International Conference on Spoken Language Processing (ICSLP), pp. S16-S19 (addendum).
- (1996) Proceedings of the International Conference on Spoken Language Processing (ICSLP)
- Weintraub, M.¹ Taussig, K.² Hunicke-Smith, K.³ Snodgrass, A.⁴

37
- 0343367122
- Multiple pronunciation lexical modeling in a speaker independent speech understanding system
- Wooters, C. & Stolcke, A. (1994). Multiple pronunciation lexical modeling in a speaker independent speech understanding system. Proceedings of the International Conference on Spoken Language Processing (ICSLP), pp. 1363-1366.
- (1994) Proceedings of the International Conference on Spoken Language Processing (ICSLP) , pp. 1363-1366
- Wooters, C.¹ Stolcke, A.²

38
- 0003571977
- Entropic Cambridge Research Laboratory
- Young, S., Jansen, J., Odell, J., Ollason, D. & Woodland, P. The HTK Book (Version 2.0). Entropic Cambridge Research Laboratory.
- The HTK Book (Version 2.0)
- Young, S.¹ Jansen, J.² Odell, J.³ Ollason, D.⁴ Woodland, P.⁵

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.