SCOPUS Information Search Platform

Journal of the Acoustical Society of America

Volume 133, Issue 1, 2013, Pages 519-528

Syllable language models for Mandarin speech recognition: Exploiting character language models

(4) Liu, Xunying a; Hieronymus, James L b; Gales, Mark J F a; Woodland, Philip C a

a UNIVERSITY OF CAMBRIDGE (United Kingdom)

b INTERNATIONAL COMPUTER SCIENCE INSTITUTE (United States)

Author keywords

[No Author keywords available]

Indexed keywords

AUDIO-RECOGNITION; CHARACTER ERROR RATES; CHARACTER LEVEL; CHINESE LANGUAGE; LANGUAGE MODEL; LINGUISTIC INFORMATION; MANDARIN CHINESE; MANDARIN SPEECH RECOGNITION; MODEL-BASED OPC; SPEECH RECOGNITION PERFORMANCE; SPOKEN LANGUAGES; TEXT SOURCES; WORD LEVEL;

SPEECH RECOGNITION;

COMPUTATIONAL LINGUISTICS;

ARTICLE; AUTOMATED PATTERN RECOGNITION; AUTOMATIC SPEECH RECOGNITION; DECISION TREE; HUMAN; PHONETICS; SIGNAL PROCESSING; SPEECH; STATISTICAL MODEL;

DECISION TREES; HUMANS; LINEAR MODELS; PATTERN RECOGNITION, AUTOMATED; PHONETICS; SIGNAL PROCESSING, COMPUTER-ASSISTED; SPEECH ACOUSTICS; SPEECH RECOGNITION SOFTWARE;

EID: 84872073683 PISSN: 00014966 EISSN: None Source Type: Journal
DOI: 10.1121/1.4768800 Document Type: Article

Times cited : (22)

References (44)

1
- 84889584217
- (University of Hawaii Press, Honolulu)
- J. de Francis, The Chinese Language: Fact and Fantasy (University of Hawaii Press, Honolulu, 1984), pp. 1-344.
- (1984) The Chinese Language: Fact and Fantasy , pp. 1-344
- De Francis, J.¹

2
- 0001076101
- A stochastic finite-state word-segmentation algorithm for Chinese
- R. Sproat, C. Shih, N. Chang, and W. Gale, "A stochastic finite-state word-segmentation algorithm for Chinese," Comput. Linguist. 22 (3), 377-404 (1996).
- (1996) Comput. Linguist. , vol.22 , Issue.3 , pp. 377-404
- Sproat, R.¹ Shih, C.² Chang, N.³ Gale, W.⁴

3
- 78049384511
- The 2009 IBM GALE Mandarin broadcast transcription system
- in, Dallas.
- S. M. Chu, D. Povey, Hong-Kwang Kuo, L. Mangu, S. Zhang, Q. Shi, and Y. Qin, "The 2009 IBM GALE Mandarin broadcast transcription system," in Proceedings of IEEE ICASSP2010, Dallas (2010).
- (2010) Proceedings of IEEE ICASSP2010
- Chu, S.M.¹ Povey, D.² Kuo, H.-K.³ Mangu, L.⁴ Zhang, S.⁵ Shi, Q.⁶ Qin, Y.⁷

4
- 34547548228
- Speech system combination for machine translation
- in, Hawaii.
- M. J. F. Gales, X. Liu, R. Sinha, P. C. Woodland, K. Yu, S. Matsoukas, T. Ng, K. Nguyen, L. Nguyen, J.-L. Gauvain, L. Lamel, and A. Messaoudi, "Speech system combination for machine translation," in Proceedings of IEEE ICASSP2007, Hawaii (2007).
- (2007) Proceedings of IEEE ICASSP2007
- Gales, M.J.F.¹ Liu, X.² Sinha, R.³ Woodland, P.C.⁴ Yu, K.⁵ Matsoukas, S.⁶ Ng, T.⁷ Nguyen, K.⁸ Nguyen, L.⁹ Gauvain, J.-L.¹⁰ Lamel, L.¹¹ Messaoudi, A.¹²

5
- 51449120443
- Progress in the BBN 2007 Mandarin speech to text system
- in, Las Vegas.
- T. Ng, B. Zhang, K. Nguyen, and L. Nguyen, "Progress in the BBN 2007 Mandarin speech to text system," in Proceedings of IEEE ICASSP2008, Las Vegas (2008).
- (2008) Proceedings of IEEE ICASSP2008
- Ng, T.¹ Zhang, B.² Nguyen, K.³ Nguyen, L.⁴

6
- 85178580924
- The syllable
- in, edited by H. van der Hulst and Norval Smith (Foris, Dordrecht), Vol.
- E. O. Selkirk, "The syllable," in The Structure of Phonological Representations, edited by H. van der Hulst and Norval Smith (Foris, Dordrecht, 1982), Vol. 2, pp. 337-385.
- (1982) The Structure of Phonological Representations , vol.2 , pp. 337-385
- Selkirk, E.O.¹

7
- 0026240875
- Markov modeling of Mandarin Chinese for decoding the phonetic sequence into Chinese characters
- 10.1016/0885-2308(91)90004-A
- H. Gu, C. Tseng, and L. Lee, "Markov modeling of Mandarin Chinese for decoding the phonetic sequence into Chinese characters," Comput. Speech Lang. 5, 363-371 (1991). 10.1016/0885-2308(91)90004-A
- (1991) Comput. Speech Lang. , vol.5 , pp. 363-371
- Gu, H.¹ Tseng, C.² Lee, L.³

8
- 70450189170
- Exploiting Chinese character models to improve speech recognition performance
- in, Brighton.
- J. L. Hieronymus, X. Liu, M. J. F. Gales, and P. C. Woodland, "Exploiting Chinese character models to improve speech recognition performance," in Proceedings of Interspeech'09, Brighton (2009).
- (2009) Proceedings of Interspeech'09
- Hieronymus, J.L.¹ Liu, X.² Gales, M.J.F.³ Woodland, P.C.⁴

9
- 0027578488
- Golden Mandarin (I) - A real-time Mandarin speech dictation machine for Chinese language with very large vocabulary
- 10.1109/89.222876
- L. S. Lee, C. Y. Tseng, H. Y. Gu, F. H. Liu, C. H. Chang, Y. H. Lin, Y. Lee, S. L. Tu, S. H. Hsieh, and C. H. Chen, "Golden Mandarin (I) - A real-time Mandarin speech dictation machine for Chinese language with very large vocabulary," IEEE Trans. Speech Audio Process. 1 (2), 158-179 (1993). 10.1109/89.222876
- (1993) IEEE Trans. Speech Audio Process. , vol.1 , Issue.2 , pp. 158-179
- Lee, L.S.¹ Tseng, C.Y.² Gu, H.Y.³ Liu, F.H.⁴ Chang, C.H.⁵ Lin, Y.H.⁶ Lee, Y.⁷ Tu, S.L.⁸ Hsieh, S.H.⁹ Chen, C.H.¹⁰

10
- 78049379945
- Language model combination and adaptation using weighted finite state transducers
- in, Dallas.
- X. Liu, M. J. F. Gales, J. L. Hieronymus, and P. C. Woodland, "Language model combination and adaptation using weighted finite state transducers," in Proceedings of IEEE ICASSP2010, Dallas (2010).
- (2010) Proceedings of IEEE ICASSP2010
- Liu, X.¹ Gales, M.J.F.² Hieronymus, J.L.³ Woodland, P.C.⁴

11
- 70349210890
- Modeling characters versus words for Mandarin speech recognition
- in.
- J. Luo, L. Lamel, and J.-L. Gauvain, "Modeling characters versus words for Mandarin speech recognition," in Proceedings of ICASSP2009 (2009).
- (2009) Proceedings of ICASSP2009
- Luo, J.¹ Lamel, L.² Gauvain, J.-L.³

12
- 0031103274
- Complete recognition of continuous Mandarin speech for Chinese language with very large vocabulary using limited training data
- 10.1109/89.554782
- H. M. Wang, T. H. Ho, R. C. Yang, J. L. Shen, B. R. Bai, J. C. Hong, W. P. Chen, T. L. Yu, and L. S. Lee, "Complete recognition of continuous Mandarin speech for Chinese language with very large vocabulary using limited training data," IEEE Trans. Speech Audio Process. 5 (2), 195-200 (1997). 10.1109/89.554782
- (1997) IEEE Trans. Speech Audio Process. , vol.5 , Issue.2 , pp. 195-200
- Wang, H.M.¹ Ho, T.H.² Yang, R.C.³ Shen, J.L.⁴ Bai, B.R.⁵ Hong, J.C.⁶ Chen, W.P.⁷ Yu, T.L.⁸ Lee, L.S.⁹

13
- 0023312404
- Estimation of probabilities from sparse data for the language model component of a speech recognizer
- 10.1109/TASSP.1987.1165125
- S. M. Katz, "Estimation of probabilities from sparse data for the language model component of a speech recognizer," IEEE Trans. Acoust., Speech, Signal Process. 35 (3), 400-401 (1987). 10.1109/TASSP.1987.1165125
- (1987) IEEE Trans. Acoust., Speech, Signal Process. , vol.35 , Issue.3 , pp. 400-401
- Katz, S.M.¹

14
- 0030715425
- Language model adaptation using mixtures and an exponentially decaying cache
- in, Munich
- P. Clarkson and A. Robinson, "Language model adaptation using mixtures and an exponentially decaying cache," in Proceedings of ICASSP1997, Munich (1997), pp. 799-802.
- (1997) Proceedings of ICASSP1997 , pp. 799-802
- Clarkson, P.¹ Robinson, A.²

15
- 0019114666
- Interpolated estimation of Markov source parameters from sparse data
- in, edited by E. S. Gelsema and L. N. Kanal (North-Holland, Amsterdam)
- F. Jelinek and R. Mercer, "Interpolated estimation of Markov source parameters from sparse data," in Pattern Recognition in Practice, edited by E. S. Gelsema and L. N. Kanal (North-Holland, Amsterdam, 1980), pp. 381-402.
- (1980) Pattern Recognition in Practice , pp. 381-402
- Jelinek, F.¹ Mercer, R.²

16
- 0030181951
- A maximum entropy approach to adaptive statistical language modeling
- 10.1006/csla.1996.0011
- R. Rosenfeld, "A maximum entropy approach to adaptive statistical language modeling," Comput. Speech Lang. 10, 187-228 (1996). 10.1006/csla.1996.0011
- (1996) Comput. Speech Lang. , vol.10 , pp. 187-228
- Rosenfeld, R.¹

17
- 44949090835
- Getting more mileage from web text sources for conversational speech language modeling using class-dependent mixtures
- in, Edmonton.
- I. Bulyko, M. Ostendorf, and A. Stolcke, "Getting more mileage from web text sources for conversational speech language modeling using class-dependent mixtures," in Proceedings of HLT '03, Edmonton (2003).
- (2003) Proceedings of HLT '03
- Bulyko, I.¹ Ostendorf, M.² Stolcke, A.³

18
- 79959816682
- Language model adaptation for broadcast news transcription
- in, Paris.
- L. Chen, J.-L. Gauvain, L. Lamel, G. Adda, and M. Adda, "Language model adaptation for broadcast news transcription," in Proceedings of ISCA ITRW '01, Paris (2001).
- (2001) Proceedings of ISCA ITRW '01
- Chen, L.¹ Gauvain, J.-L.² Lamel, L.³ Adda, G.⁴ Adda, M.⁵

19
- 0009643324
- Efficient language model adaptation through MDI estimation
- in, Budapest.
- M. Federico, "Efficient language model adaptation through MDI estimation," in Proceedings of EuroSpeech '99, Budapest (1999).
- (1999) Proceedings of EuroSpeech '99
- Federico, M.¹

20
- 84867218949
- Context dependent language model adaptation
- in, Brisbane.
- X. Liu, M. J. F. Gales, and P. C. Woodland, "Context dependent language model adaptation," in Proceedings of Interspeech '08, Brisbane (2008).
- (2008) Proceedings of Interspeech '08
- Liu, X.¹ Gales, M.J.F.² Woodland, P.C.³

21
- 70450161305
- Use of contexts in language model interpolation and adaptation
- in, Brighton.
- X. Liu, M. J. F. Gales, and P. C. Woodland, "Use of contexts in language model interpolation and adaptation," in Proceedings of Interspeech '09, Brighton (2009).
- (2009) Proceedings of Interspeech '09
- Liu, X.¹ Gales, M.J.F.² Woodland, P.C.³

22
- 0030638031
- A post-processing system to yield reduced word error rates: Recogniser output voting error reduction (ROVER)
- in.
- J. G. Fiscus, "A post-processing system to yield reduced word error rates: Recogniser output voting error reduction (ROVER)," in Proceedings of IEEE ASRU '97 (1997).
- (1997) Proceedings of IEEE ASRU '97
- Fiscus, J.G.¹

23
- 4544253834
- Posterior probability decoding, confidence estimation and system combination
- in.
- G. Evermann and P. C. Woodland, "Posterior probability decoding, confidence estimation and system combination," in Proceedings of Speech Transcription Workshop 2000 (2000).
- (2000) Proceedings of Speech Transcription Workshop 2000
- Evermann, G.¹ Woodland, P.C.²

24
- 0013344078
- Training products of experts by minimizing contrastive divergence
- 10.1162/089976602760128018
- G. Hinton, "Training products of experts by minimizing contrastive divergence," Neural Comput. 14, 1771-1800 (2002). 10.1162/089976602760128018
- (2002) Neural Comput. , vol.14 , pp. 1771-1800
- Hinton, G.¹

25
- 84891308106
- SRILM - An extensible language modeling toolkit
- in, Denver.
- A. Stolcke, "SRILM - An extensible language modeling toolkit," in Proceedings of ICSLP '02, Denver (2002).
- (2002) Proceedings of ICSLP '02
- Stolcke, A.¹

26
- 84872081958
- The HTK Book Version 3.4.1, (2009) (last viewed 12/16/2011), 1-374.
- S. J. Young, G. Evermann, M. J. F. Gales, T. Hain, D. Kershaw, X. Liu, G. Moore, J. Odell, D. Ollason, D. Povey, V. Valtchev, and P. C. Woodland, The HTK Book Version 3.4.1, http://htk.eng.cam.ac.uk/prot-docs/htk-book.shtml (2009) (last viewed 12/16/2011), pp. 1-374.
- Young, S.J.¹ Evermann, G.² Gales, M.J.F.³ Hain, T.⁴ Kershaw, D.⁵ Liu, X.⁶ Moore, G.⁷ Odell, J.⁸ Ollason, D.⁹ Povey, D.¹⁰ Valtchev, V.¹¹ Woodland, P.C.¹²

27
- 0032661656
- Network optimizations for large vocabulary speech recognition
- 10.1016/S0167-6393(98)00026-0
- M. Mohri and M. Riley, "Network optimizations for large vocabulary speech recognition," Speech Commun. 25 (3), 1-12 (1998). 10.1016/S0167-6393(98)00026-0
- (1998) Speech Commun. , vol.25 , Issue.3 , pp. 1-12
- Mohri, M.¹ Riley, M.²

28
- 0012306376
- The design principles of a weighted finite-state transducer library
- 10.1016/S0304-3975(99)00014-6
- M. Mohri, F. C. N. Pereira, and M. Riley, "The design principles of a weighted finite-state transducer library," Theor. Comput. Sci. 231, 17-32 (2000). 10.1016/S0304-3975(99)00014-6
- (2000) Theor. Comput. Sci. , vol.231 , pp. 17-32
- Mohri, M.¹ Pereira, F.C.N.² Riley, M.³

29
- 0036460907
- Weighted finite-state transducers in speech recognition
- 10.1006/csla.2001.0184
- M. Mohri, F. C. N. Pereira, and M. Riley, "Weighted finite-state transducers in speech recognition," Comput. Speech Lang. 16 (1), 69-88 (2002). 10.1006/csla.2001.0184
- (2002) Comput. Speech Lang. , vol.16 , Issue.1 , pp. 69-88
- Mohri, M.¹ Pereira, F.C.N.² Riley, M.³

30
- 70350376504
- Weighted automata algorithms
- in, edited by Manfred Droste, Werner Kuich, and Heiko Vogler (Springer, Berlin)
- M. Mohri, "Weighted automata algorithms," in Handbook of Weighted Automata. Monographs in Theoretical Computer Science, edited by Manfred Droste, Werner Kuich, and Heiko Vogler (Springer, Berlin, 2009), pp. 213-254.
- (2009) Handbook of Weighted Automata. Monographs in Theoretical Computer Science , pp. 213-254
- Mohri, M.¹

31
- 0001573124
- Generalized iterative scaling for log-linear models
- 10.1214/aoms/1177692379
- J. Darroch and D. Ratcliff, "Generalized iterative scaling for log-linear models," Ann. Math. Stat. 43 (5), 1470-1480 (1972). 10.1214/aoms/1177692379
- (1972) Ann. Math. Stat. , vol.43 , Issue.5 , pp. 1470-1480
- Darroch, J.¹ Ratcliff, D.²

32
- 0035059194
- Whole-sentence exponential language models: A vehicle for linguistic-statistical integration
- 10.1006/csla.2000.0159
- R. Rosenfeld, S. F. Chen, and X. Zhu, "Whole-sentence exponential language models: A vehicle for linguistic-statistical integration," Comput. Speech Lang. 15 (1), 55-73 (2001). 10.1006/csla.2000.0159
- (2001) Comput. Speech Lang. , vol.15 , Issue.1 , pp. 55-73
- Rosenfeld, R.¹ Chen, S.F.² Zhu, X.³

33
- 84875923754
- LM95 project report: Fast training and portability
- in, Research Note No. 1, Center for Language and Speech Processing, Johns Hopkins University, Baltimore , (last viewed 12/16/2011).
- M. Weintraub, Y. Aksu, S. Dharanipragada, S. Khudanpur, H. Ney, J. Prange, A. Stolcke, F. Jelinek, and E. Shriberg, "LM95 project report: Fast training and portability," in 1995 Language Modeling Summer Research Workshop Technical Reports, Research Note No. 1, Center for Language and Speech Processing, Johns Hopkins University, Baltimore (1996), http://www-speech.sri.com/cgi-bin/run-distill?papers/lm95-report.ps.gz (last viewed 12/16/2011).
- (1996) 1995 Language Modeling Summer Research Workshop Technical Reports
- Weintraub, M.¹ Aksu, Y.² Dharanipragada, S.³ Khudanpur, S.⁴ Ney, H.⁵ Prange, J.⁶ Stolcke, A.⁷ Jelinek, F.⁸ Shriberg, E.⁹

34
- 0028996852
- The 1994 HTK large vocabulary speech recognition system
- in, Detroit.
- P. C. Woodland, C. J. Leggetter, J. J. Odell, V. Valtchev, and S. J. Young, "The 1994 HTK large vocabulary speech recognition system," in Proceedings of IEEE ICASSP1995, Detroit (1995).
- (1995) Proceedings of IEEE ICASSP1995
- Woodland, P.C.¹ Leggetter, C.J.² Odell, J.J.³ Valtchev, V.⁴ Young, S.J.⁵

35
- 34047273021
- A specialized on-the-fly algorithm for lexicon and language model composition
- 10.1109/TSA.2005.860838
- D. A. Caseiro and I. Trancoso, "A specialized on-the-fly algorithm for lexicon and language model composition," IEEE Trans. Audio, Speech, Lang. Process. 14 (4), 1281-1291 (2006). 10.1109/TSA.2005.860838
- (2006) IEEE Trans. Audio, Speech, Lang. Process. , vol.14 , Issue.4 , pp. 1281-1291
- Caseiro, D.A.¹ Trancoso, I.²

36
- 33947703664
- The CU-HTK Mandarin broadcast news transcription system
- in, Toulouse.
- R. Sinha, M. J. F. Gales, D. Y. Kim, X. Liu, K. C. Kim, and P. C. Woodland, "The CU-HTK Mandarin broadcast news transcription system," in Proceedings of IEEE ICASSP2006, Toulouse (2006).
- (2006) Proceedings of IEEE ICASSP2006
- Sinha, R.¹ Gales, M.J.F.² Kim, D.Y.³ Liu, X.⁴ Kim, K.C.⁵ Woodland, P.C.⁶

37
- 0036296863
- Minimum phone error and I-smoothing for improved discriminative training
- in, Orlando.
- D. Povey and P. C. Woodland, "Minimum phone error and I-smoothing for improved discriminative training," in Proceedings of IEEE ICASSP2002, Orlando (2002).
- (2002) Proceedings of IEEE ICASSP2002
- Povey, D.¹ Woodland, P.C.²

38
- 0002144369
- Tree-based state tying for high accuracy acoustic modeling
- in (Morgan Kaufmann, Plainsboro, NJ)
- S. J. Young, J. J. Odell, and P. C. Woodland, "Tree-based state tying for high accuracy acoustic modeling," in Proceedings of ARPA Human Language Technology Workshop (Morgan Kaufmann, Plainsboro, NJ, 1994), pp. 307-312.
- (1994) Proceedings of ARPA Human Language Technology Workshop , pp. 307-312
- Young, S.J.¹ Odell, J.J.² Woodland, P.C.³

39
- 0029288633
- Maximum likelihood linear regression for speaker adaptation of continuous density HMMs
- 10.1006/csla.1995.0010
- C. J. Leggetter and P. C. Woodland, "Maximum likelihood linear regression for speaker adaptation of continuous density HMMs," Comput. Speech Lang. 9, 171-186 (1995). 10.1006/csla.1995.0010
- (1995) Comput. Speech Lang. , vol.9 , pp. 171-186
- Leggetter, C.J.¹ Woodland, P.C.²

40
- 0003871508
- Ph.D. thesis, Johns Hopkins University, Baltimore.
- N. Kumar, "Investigation of silicon-auditory models and generalization of linear discriminant analysis for improved speech recognition," Ph.D. thesis, Johns Hopkins University, Baltimore, 1997.
- (1997) Investigation of Silicon-auditory Models and Generalization of Linear Discriminant Analysis for Improved Speech Recognition
- Kumar, N.¹

41
- 0141703325
- Automatic complexity control for HLDA systems
- in, Hong Kong , Vol.
- X. Liu, M. J. F. Gales, and P. C. Woodland, "Automatic complexity control for HLDA systems," in Proceedings of IEEE ICASSP2003, Hong Kong (2003), Vol. 1, pp. 132-135.
- (2003) Proceedings of IEEE ICASSP2003 , vol.1 , pp. 132-135
- Liu, X.¹ Gales, M.J.F.² Woodland, P.C.³

42
- 0042256392
- The development of the 1996 HTK broadcast news transcription system
- in (Arden House, New York)
- P. C. Woodland, M. J. F. Gales, D. Pye, and S. J. Young, "The development of the 1996 HTK broadcast news transcription system," in Proceedings of DARPA Speech Recognition Workshop (Arden House, New York, 1996), pp. 73-78.
- (1996) Proceedings of DARPA Speech Recognition Workshop , pp. 73-78
- Woodland, P.C.¹ Gales, M.J.F.² Pye, D.³ Young, S.J.⁴

43
- 0033329799
- An empirical study of smoothing techniques for language modeling
- 10.1006/csla.1999.0128
- S. F. Chen and J. T. Goodman, "An empirical study of smoothing techniques for language modeling," Comput. Speech Lang. 13 (4), 359-394 (1999). 10.1006/csla.1999.0128
- (1999) Comput. Speech Lang. , vol.13 , Issue.4 , pp. 359-394
- Chen, S.F.¹ Goodman, J.T.²

44
- 33847610331
- Continuous language models
- 10.1016/j.csl.2006.09.003
- H. Schwenk, "Continuous language models," Comput. Speech Lang. 21 (3), 492-518 (2007). 10.1016/j.csl.2006.09.003
- (2007) Comput. Speech Lang. , vol.21 , Issue.3 , pp. 492-518
- Schwenk, H.¹

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.