SCOPUS Information Retrieval Platform

IEEE Transactions on Audio, Speech and Language Processing

Volume 21, Issue 2, 2013, Pages 357-366

Learning lexicons from speech using a pronunciation mixture model

(3) McGraw, Ian (a); Badr, Ibrahim (a); Glass, James R. (a)

(a) Department of Electrical Engineering and Computer Science (United States)

Author keywords

Baseform generation; dictionary training with acoustics via EM; pronunciation learning; stochastic lexicon

Indexed keywords

ACHILLES HEEL; AUTOMATIC SPEECH RECOGNIZERS; BASEFORM GENERATION; BASEFORM PRONUNCIATION; CONTINUOUS SPEECH; HIGH QUALITY; LANGUAGE MODEL; MANUAL INTERVENTION; MIXTURE MODEL; PARAMETER SETTING; PRONUNCIATION LEARNING; SPEECH DATA; STOCHASTIC LEXICON; TRAINING DATA; WEATHER INFORMATION;

COMPUTATIONAL LINGUISTICS;

STOCHASTIC SYSTEMS;

EID: 84871369973 PISSN: 15587916 EISSN: None Source Type: Journal
DOI: 10.1109/TASL.2012.2226158 Document Type: Article

Times cited: 40

References (34)

1
- 79959854710
- Learning new word pronunciations from spoken examples
- I. Badr, I. McGraw, and J. R. Glass, "Learning new word pronunciations from spoken examples, " in Proc. INTERSPEECH, 2010, pp. 2294-2297.
- (2010) Proc. INTERSPEECH , pp. 2294-2297
- Badr, I.¹ McGraw, I.² Glass, J.R.³

2
- 84865763465
- Pronunciation learning from continuous speech
- I. Badr, I. McGraw, and J. R. Glass, "Pronunciation learning from continuous speech, " in Proc. INTERSPEECH, 2011, pp. 549-552.
- (2011) Proc. INTERSPEECH , pp. 549-552
- Badr, I.¹ McGraw, I.² Glass, J.R.³

3
- 41049105254
- Joint-sequence models for grapheme-to-phoneme conversion
- M. Bisani and H. Ney, "Joint-sequence models for grapheme-to-phoneme conversion, " Speech Commun., vol. 50, no. 5, pp. 434-451, May 2008.
- (2008) Speech Commun , vol.50 , Issue.5 , pp. 434-451
- Bisani, M.¹ Ney, H.²

4
- 0002629270
- Maximum likelihood from incomplete data via the EM algorithm
- A. P. Dempster, N. M. Laird, and D. B. Rubin, "Maximum likelihood from incomplete data via the EM algorithm, " J. R. Statist. Soc., Ser. B, vol. 39, no. 1, pp. 1-38, 1977.
- (1977) J. R. Statist. Soc., Ser. B , vol.39 , Issue.1 , pp. 1-38
- Dempster, A.P.¹ Laird, N.M.² Rubin, D.B.³

5
- 0020719320
- A maximum likelihood approach to continuous speech recognition
- L. R. Bahl, F. Jelinek, and R. L. Mercer, "A maximum likelihood approach to continuous speech recognition, " IEEE Trans. Pattern Anal. Mach. Intell., vol. PAMI-5, no. 2, pp. 179-190, Mar. 1983. (Pubitemid 13555897)
- (1983) IEEE Transactions on Pattern Analysis and Machine Intelligence , vol.PAMI-5 , Issue.2 , pp. 179-190
- Bahl, L.R.¹ Jelinek, F.² Mercer, R.L.³

6
- 0038359548
- A probabilistic framework for segment-based speech recognition
- J. R. Glass, "A probabilistic framework for segment-based speech recognition, " Comput. Speech Lang., vol. 17, no. 2-3, pp. 137-152, 2003.
- (2003) Comput. Speech Lang , vol.17 , Issue.2-3 , pp. 137-152
- Glass, J.R.¹

7
- 85009152019
- The MIT finite-state transducer toolkit for speech and language processing
- I. L. Hetherington, "The MIT finite-state transducer toolkit for speech and language processing, " in Proc. INTERSPEECH, 2004, pp. 2609-2612.
- (2004) Proc. INTERSPEECH , pp. 2609-2612
- Hetherington, I.L.¹

8
- 85009074656
- An efficient implementation of phonological rules using finite-state transducers
- I. L. Hetherington, "An efficient implementation of phonological rules using finite-state transducers, " in Proc. EuroSpeech, 2001, pp. 1599-1602.
- (2001) Proc. EuroSpeech , pp. 1599-1602
- Hetherington, I.L.¹

9
- 19944423811
- Pronunciation modeling using a finite-state transducer representation
- DOI 10.1016/j.specom.2005.03.004, PII S0167639305000361, Pronunciation Modeling and Lexicon Adaptation
- T. J. Hazen, I. L. Hetherington, H. Shu, and K. Livescu, "Pronunciation modeling using a finite-state transducer representation, " Speech Commun., vol. 46, no. 2, pp. 189-203, 2005. (Pubitemid 40753202)
- (2005) Speech Communication , vol.46 , Issue.2 , pp. 189-203
- Hazen, T.J.¹ Hetherington, I.L.² Shu, H.³ Livescu, K.⁴

10
- 0033335618
- Modeling pronunciation variation for ASR: A survey of the literature
- DOI 10.1016/S0167-6393(99)00038-2
- H. Strik and C. Cucchiarini, "Modeling pronunciation variation for ASR: A survey of the literature, " Speech Commun., vol. 29, no. 2-4, pp. 225-246, 1999. (Pubitemid 30514833)
- (1999) Speech Communication , vol.29 , Issue.2-4 , pp. 225-246
- Strik, H.¹ Cucchiarini, C.²

11
- 0012262424
- Ph. D. dissertation, Massachusetts Inst. of Technol., Cambridge, MA
- I. L. Hetherington, "A characterization of the problem of new, out-of-vocabulary words in continuous-speech recognition and understanding, " Ph. D. dissertation, Massachusetts Inst. of Technol., Cambridge, MA, 1995.
- (1995) A Characterization of the Problem of New, Out-of-vocabulary Words in Continuous-speech Recognition and Understanding
- Hetherington, I.L.¹

12
- 0035278951
- Confidence measures for large vocabulary continuous speech recognition
- DOI 10.1109/89.906002, PII S1063667601013281
- F. Wessel, R. Schlüter, K. Macherey, and H. Ney, "Confidence measures for large vocabulary continuous speech recognition, " IEEE Trans. Speech Audio Process., vol. 9, no. 3, pp. 288-298, Mar. 2001. (Pubitemid 32286598)
- (2001) IEEE Transactions on Speech and Audio Processing , vol.9 , Issue.3 , pp. 288-298
- Wessel, F.¹ Schlüter, R.² Macherey, K.³ Ney, H.⁴

13
- 33745202406
- Open vocabulary speech recognition with flat hybrid models
- 9th European Conference on Speech Communication and Technology, Eurospeech Interspeech
- M. Bisani and H. Ney, "Open vocabulary speech recognition with flat hybrid models, " in Proc. INTERSPEECH, 2005, pp. 725-728. (Pubitemid 43908165)
- (2005) 9th European Conference on Speech Communication and Technology , pp. 725-728
- Bisani, M.¹ Ney, H.²

14
- 85009227369
- Conditional and joint models for grapheme-to-phoneme conversion
- S. F. Chen, "Conditional and joint models for grapheme-to-phoneme conversion, " in Proc. INTERSPEECH, 2003, pp. 2033-2036.
- (2003) Proc. INTERSPEECH , pp. 2033-2036
- Chen, S.F.¹

15
- 84878203695
- Regular models of phonological rule systems
- R. M. Kaplan and M. Kay, "Regular models of phonological rule systems, " Comput. Linguist., vol. 20, pp. 331-378, 1994.
- (1994) Comput. Linguist , vol.20 , pp. 331-378
- Kaplan, R.M.¹ Kay, M.²

16
- 79959852715
- Reversible sound-to-letter/letter-to-sound modeling based on syllable structure
- S. Seneff, "Reversible sound-to-letter/letter-to-sound modeling based on syllable structure, " in Proc. Human Lang. Technol. : Conf. North Amer. Chap. Assoc. for Comput. Linguist. (HLT-NAACL), 2007, pp. 153-156.
- (2007) Proc. Human Lang. Technol. : Conf. North Amer. Chap. Assoc. for Comput. Linguist. (HLT-NAACL) , pp. 153-156
- Seneff, S.¹

17
- 0039255896
- A multi-strategy approach to improving pronunciation by analogy
- Y. Marchand and R. I. Damper, "A multi-strategy approach to improving pronunciation by analogy, " Comput. Linguist., vol. 26, pp. 195-219, 2000.
- (2000) Comput. Linguist , vol.26 , pp. 195-219
- Marchand, Y.¹ Damper, R.I.²

18
- 19944409831
- Unsupervised, language-independent grapheme-to-phoneme conversion by latent analogy
- DOI 10.1016/j.specom.2005.03.002, PII S0167639305000336, Pronunciation Modeling and Lexicon Adaptation
- J. Bellegarda, "Unsupervised, language-independent grapheme-to-phoneme conversion by latent analogy, " Speech Commun., vol. 46, no. 2, pp. 140-152, 2005. (Pubitemid 40753199)
- (2005) Speech Communication , vol.46 , Issue.2 , pp. 140-152
- Bellegarda, J.R.¹

19
- 70450194704
- Grapheme to phoneme conversion using an SMT system
- A. Laurent, P. Deléglise, and S. Meignier, "Grapheme to phoneme conversion using an SMT system, " in Proc. INTERSPEECH, 2009, pp. 708-711.
- (2009) Proc. INTERSPEECH , pp. 708-711
- Laurent, A.¹ Deléglise, P.² Meignier, S.³

20
- 70450186703
- Online discriminative training for grapheme-to-phoneme conversion
- S. Jiampojamarn and G. Kondrak, "Online discriminative training for grapheme-to-phoneme conversion, " in Proc. INTERSPEECH, 2009, pp. 1303-1306.
- (2009) Proc. INTERSPEECH , pp. 1303-1306
- Jiampojamarn, S.¹ Kondrak, G.²

21
- 85044582416
- Letter to sound rules for accented lexicon compression
- V. Pagel, K. Lenzo, and A. W. Black, "Letter to sound rules for accented lexicon compression, " in Proc. Int. Conf. Spoken Lang. Process. (ICSLP), 1998.
- (1998) Proc. Int. Conf. Spoken Lang. Process. (ICSLP)
- Pagel, V.¹ Lenzo, K.² Black, A.W.³

22
- 18244423993
- Assessing text-to-phoneme mapping strategies in speaker independent isolated word recognition
- J. Häkkinen, J. Suontausta, S. Riis, and K. J. Jensen, "Assessing text-to-phoneme mapping strategies in speaker independent isolated word recognition, " Speech Commun., vol. 41, no. 2-3, pp. 455-467, 2003.
- (2003) Speech Commun , vol.41 , Issue.2-3 , pp. 455-467
- Häkkinen, J.¹ Suontausta, J.² Riis, S.³ Jensen, K.J.⁴

23
- 79959846061
- M. S. thesis, Massachusetts Inst. of Technol., Cambridge, MA
- S. Wang, "Using graphone models in automatic speech recognition, " M. S. thesis, Massachusetts Inst. of Technol., Cambridge, MA, 2009.
- (2009) Using Graphone Models in Automatic Speech Recognition
- Wang, S.¹

24
- 84943154470
- Fabricating conversational speech data with acoustic models: A program to examine model-data mismatch
- D. McAllaster, L. Gillick, F. Scattone, and M. Newman, "Fabricating conversational speech data with acoustic models: A program to examine model-data mismatch, " in Proc. Int. Conf. Spoken Lang. Process. (ICSLP), 1998.
- (1998) Proc. Int. Conf. Spoken Lang. Process. (ICSLP)
- McAllaster, D.¹ Gillick, L.² Scattone, F.³ Newman, M.⁴

25
- 84956975318
- Automatic baseform generation from acoustic data
- B. Maison, "Automatic baseform generation from acoustic data, " in Proc. INTERSPEECH, 2003, pp. 2545-2548.
- (2003) Proc. INTERSPEECH , pp. 2545-2548
- Maison, B.¹

26
- 51449103917
- A turbo-style algorithm for lexical baseforms estimation
- G. F. Choueiter, M. I. Ohannessian, S. Seneff, and J. R. Glass, "A turbo-style algorithm for lexical baseforms estimation, " in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP), 2008, pp. 4313-4316.
- (2008) Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP) , pp. 4313-4316
- Choueiter, G.F.¹ Ohannessian, M.I.² Seneff, S.³ Glass, J.R.⁴

27
- 0026372223
- Automatic phonetic baseform determination
- L. R. Bahl, S. Das, P. V. Desouza, M. Epstein, R. L. Mercer, B. Merialdo, D. Nahamoo, M. A. Picheny, and J. Powell, "Automatic phonetic baseform determination, " in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP), 1991, pp. 173-176.
- (1991) Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP) , pp. 173-176
- Bahl, L.R.¹ Das, S.² Desouza, P.V.³ Epstein, M.⁴ Mercer, R.L.⁵ Merialdo, B.⁶ Nahamoo, D.⁷ Picheny, M.A.⁸ Powell, J.⁹

28
- 44849099982
- Adapting grapheme-to-phoneme conversion for name recognition
- X. Li, A. Gunawardana, and A. Acero, "Adapting grapheme-to-phoneme conversion for name recognition, " in Proc. Autom. Speech Recognit. Understand. Workshop (ASRU), 2007, pp. 130-135.
- (2007) Proc. Autom. Speech Recognit. Understand. Workshop (ASRU) , pp. 130-135
- Li, X.¹ Gunawardana, A.² Acero, A.³

29
- 70349209414
- Discriminative pronounciation learning using phonetic decoder and minimum-classification-error criterion
- O. Vinyals, L. Deng, D. Yu, and A. Acero, "Discriminative pronounciation learning using phonetic decoder and minimum-classification-error criterion, " in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP), 2009, pp. 4445-4448.
- (2009) Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP) , pp. 4445-4448
- Vinyals, O.¹ Deng, L.² Yu, D.³ Acero, A.⁴

30
- 0033878021
- JUPITER: A telephone-based conversational interface for weather information
- DOI 10.1109/89.817460
- V. Zue, S. Seneff, J. Glass, J. Polifroni, C. Pao, T. J. Hazen, and L. Hetherington, "Jupiter: A telephone-based conversational interface for weather information, " IEEE Trans. Speech Audio Process., vol. 8, no. 1, pp. 85-96, Jan. 2000. (Pubitemid 30540738)
- (2000) IEEE Transactions on Speech and Audio Processing , vol.8 , Issue.1 , pp. 85-96
- Zue, V.¹ Seneff, S.² Glass, J.R.³ Polifroni, J.⁴ Pao, C.⁵ Hazen, T.J.⁶ Hetherington, L.⁷

31
- 19544365323
- LDC Catalog LDC97L20
- P. Kingsbury, S. Strassel, and R. MacIntyre, CALLHOME American English lexicon (PRONLEX), 1997, LDC Catalog No. LDC97L20.
- (1997) CALLHOME American English Lexicon (PRONLEX)
- Kingsbury, P.¹ Strassel, S.² MacIntyre, R.³

32
- 0024909979
- Some statistical issues in the comparison of speech recognition algorithms
- L. Gillick and S. Cox, "Some statistical issues in the comparison of speech recognition algorithms, " in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP), 1989, pp. 532-535. (Pubitemid 20604171)
- (1989) ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings , vol.1 , pp. 532-535
- Gillick, L.¹ Cox, S.J.²

33
- 84878563022
- Automating crowd-supervised learning for spoken language systems
- I. McGraw, S. Cyphers, P. Pasupat, J. Liu, and J. Glass, "Automating crowd-supervised learning for spoken language systems, " in Proc. INTERSPEECH, 2012.
- (2012) Proc. INTERSPEECH
- McGraw, I.¹ Cyphers, S.² Pasupat, P.³ Liu, J.⁴ Glass, J.⁵

34
- 84867809023
- A nonparametric Bayesian approach to acoustic model discovery
- C. Lee and J. R. Glass, "A nonparametric Bayesian approach to acoustic model discovery, " in Proc. 50th Annu. Meeting Assoc. Comput. Linguist. (ACL), 2012, pp. 40-49.
- (2012) Proc. 50th Annu. Meeting Assoc. Comput. Linguist. (ACL) , pp. 40-49
- Lee, C.¹ Glass, J.R.²

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.