SCOPUS Information Search Platform

Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH

Volume 2015-January, 2015, Pages 3145-3149

The Cambridge University 2014 BOLT conversational telephone Mandarin Chinese LVCSR system for speech translation

(6) Liu, Xunying (a); Flego, Federico (a); Wang, Linlin (a); Zhang, Chao (a); Gales, Mark (a); Woodland, Philip (a)

(a) University of Cambridge (United Kingdom)

Author keywords

Character LM; Conversational speech transcription; RNNLM; Speech translation; System combination

Indexed keywords

BOLTS; DECODING; RECURRENT NEURAL NETWORKS; SPEECH; SPEECH TRANSMISSION; TELEPHONE SETS; TRANSCRIPTION;

CHARACTER LM; CONVERSATIONAL SPEECH; RNNLM; SPEECH TRANSLATION; SYSTEM COMBINATION;

SPEECH COMMUNICATION;

EID: 84959109976 PISSN: 2308457X EISSN: 19909772 Source Type: Conference Proceeding
DOI: None Document Type: Conference Paper

Times cited : (13)

References (42)

1
- 84857788825
- Springer
- J. Olive, C. Caitlin and J. McCary, eds. "Handbook of natural language processing and machine translation: DARPA global autonomous language exploitation", Springer, 2011.
- (2011) Handbook of Natural Language Processing and Machine Translation: DARPA Global Autonomous Language Exploitation
- Olive, J.¹ Caitlin, C.² McCary, J.³ (eds.)

2
- 0030638031
- A post-processing system to yield reduced word error rates: Recogniser output voting error reduction (ROVER)
- Santa Barbara, CA
- J. G. Fiscus (1997). "A post-processing system to yield reduced word error rates: recogniser output voting error reduction (ROVER)," in Proc. IEEE ASRU, Santa Barbara, CA, pp. 347-354.
- (1997) Proc. IEEE ASRU , pp. 347-354
- Fiscus, J.G.¹

3
- 4544253834
- Posterior probability decoding, confidence estimation and system combination
- College Park, MD
- G. Evermann and P. C. Woodland (2000), "Posterior probability decoding, confidence estimation and system combination," in Proc. Speech Transcription Workshop, College Park, MD, 2000.
- (2000) Proc. Speech Transcription Workshop
- Evermann, G.¹ Woodland, P.C.²

4
- 34547548228
- Speech system combination for machine translation
- Honolulu, HI
- M. J. F. Gales, X. Liu, R. Sinha, P. C. Woodland, K. Yu, S. Matsoukas, T. Ng, K. Nguyen, L. Nguyen, J.-L. Gauvain, L. Lamel, A. Messaoudi (2007). "Speech System Combination for Machine Translation," in Proc. IEEE ICASSP, Honolulu, HI, 2007, vol. 4, pp. 1277-1280.
- (2007) Proc. IEEE ICASSP , vol.4 , pp. 1277-1280
- Gales, M.J.F.¹ Liu, X.² Sinha, R.³ Woodland, P.C.⁴ Yu, K.⁵ Matsoukas, S.⁶ Ng, T.⁷ Nguyen, K.⁸ Nguyen, L.⁹ Gauvain, J.-L.¹⁰ Lamel, L.¹¹ Messaoudi, A.¹²

5
- 0001076101
- A stochastic finite-state word-segmentation algorithm for Chinese
- R. Sproat, C. Shih, N. Chang, and W. Gale (1996). "A stochastic finite-state word-segmentation algorithm for Chinese", Computational Linguistics, Vol. 22, Issue 3, 1996, pp. 377-404.
- (1996) Computational Linguistics , vol.22 , Issue.3 , pp. 377-404
- Sproat, R.¹ Shih, C.² Chang, N.³ Gale, W.⁴

6
- 84872073683
- Syllable language models for Mandarin speech recognition: Exploiting character sequence models
- January
- X. Liu, J. L. Hieronymus, M. J. F. Gales and P. C. Woodland (2013). "Syllable language models for Mandarin speech recognition: exploiting character sequence models", Journal of the Acoustical Society of America, Volume 133, Issue 1, pp. 519-528, January 2013.
- (2013) Journal of the Acoustical Society of America , vol.133 , Issue.1 , pp. 519-528
- Liu, X.¹ Hieronymus, J.L.² Gales, M.J.F.³ Woodland, P.C.⁴

7
- 0029747183
- Speaker normalization using efficient frequency warping procedures
- Atlanta, GA
- L. Lee and R. C. Rose (1996). "Speaker normalization using efficient frequency warping procedures," in Proc. IEEE ICASSP, Atlanta, GA, 1996, vol. 1, pp. 353-356.
- (1996) Proc. IEEE ICASSP , vol.1 , pp. 353-356
- Lee, L.¹ Rose, R.C.²

8
- 84959142742
- A general artificial neural network extension for HTK
- C. Zhang, and P. C. Woodland (2015). "A general artificial neural network extension for HTK", in submission to ISCA Interspeech.
- (2015) ISCA Interspeech
- Zhang, C.¹ Woodland, P.C.²

9
- 84055222005
- Context-dependent pretrained deep neural networks for large vocabulary speech recognition
- January
- G. Dahl, D. Yu, L. Deng, and A. Acero, "Context-dependent pretrained deep neural networks for large vocabulary speech recognition", in IEEE Transactions on Audio, Speech, and Language Processing, vol. 20, no. 1, pp. 30-42, January 2012.
- (2012) IEEE Transactions on Audio, Speech, and Language Processing , vol.20 , Issue.1 , pp. 30-42
- Dahl, G.¹ Yu, D.² Deng, L.³ Acero, A.⁴

10
- 0003573244
- Kluwer Academic Publishers, Norwell, MA, USA
- H. A. Bourlard and N. Morgan (1993). "Connectionist speech recognition: A hybrid approach", Kluwer Academic Publishers, Norwell, MA, USA, 1993.
- (1993) Connectionist Speech Recognition: A Hybrid Approach
- Bourlard, H.A.¹ Morgan, N.²

11
- 0033709098
- Tandem connectionist feature extraction for conventional HMM systems
- Istanbul, Turkey
- H. Hermansky, D. Ellis and S. Sharma (2000). "Tandem connectionist feature extraction for conventional HMM systems", in Proc. IEEE ICASSP, Istanbul, Turkey, vol. 3, pp. 1635-1638.
- (2000) Proc. IEEE ICASSP , vol.3 , pp. 1635-1638
- Hermansky, H.¹ Ellis, D.² Sharma, S.³

12
- 84865785753
- Improved bottleneck features using pretrained deep neural networks
- Florence, Italy
- D. Yu and M. L. Seltzer (2011). "Improved bottleneck features using pretrained deep neural networks", in Proc. ISCA Interspeech, Florence, Italy, 2011, pp. 237-240.
- (2011) Proc. ISCA Interspeech , pp. 237-240
- Yu, D.¹ Seltzer, M.L.²

13
- 84903160476
- Paraphrastic language models
- November
- X. Liu, M. J. F. Gales, and P. C. Woodland (2014). "Paraphrastic language models", Computer Speech and Language, vol. 28, Issue 6, pp. 1298-1316, November 2014.
- (2014) Computer Speech and Language , vol.28 , Issue.6 , pp. 1298-1316
- Liu, X.¹ Gales, M.J.F.² Woodland, P.C.³

14
- 84946066405
- Paraphrastic recurrent neural network language models
- Brisbane, Australia
- X. Liu, M. J. F. Gales, and P. C. Woodland (2015), "Paraphrastic recurrent neural network language models," in Proc. IEEE ICASSP, Brisbane, Australia, 2015.
- (2015) Proc. IEEE ICASSP
- Liu, X.¹ Gales, M.J.F.² Woodland, P.C.³

15
- 33947703664
- The CU-HTK Mandarin broadcast news transcription system
- Toulouse, France
- R. Sinha, M. J. F. Gales, D. Y. Kim, X. Liu, K. C. Sim, and P. C. Woodland (2006). "The CU-HTK Mandarin broadcast news transcription system," in Proc. IEEE ICASSP, Toulouse, France, 2006, vol. 1, pp. 1077-1080.
- (2006) Proc. IEEE ICASSP , vol.1 , pp. 1077-1080
- Sinha, R.¹ Gales, M.J.F.² Kim, D.Y.³ Liu, X.⁴ Sim, K.C.⁵ Woodland, P.C.⁶

16
- 33646821390
- Development of the CUHTK 2004 Mandarin conversational telephone speech transcription system
- Philadelphia, PA
- M. J. F. Gales, B. Jia, X. Liu, K. C. Sim, P. C. Woodland, and K. Yu (2005). "Development of the CUHTK 2004 Mandarin conversational telephone speech transcription system," in Proc. IEEE ICASSP, Philadelphia, PA, 2005, vol. 1, pp. 841-844.
- (2005) Proc. IEEE ICASSP , vol.1 , pp. 841-844
- Gales, M.J.F.¹ Jia, B.² Liu, X.³ Sim, K.C.⁴ Woodland, P.C.⁵ Yu, K.⁶

17
- 0042256392
- The development of the 1996 HTK broadcast news transcription system
- Arden House, NY, US
- P. C. Woodland, M. J. F. Gales, D. Pye, and S. J. Young (1996). "The development of the 1996 HTK broadcast news transcription system", in Proc. DARPA Speech Recognition Workshop, Arden House, NY, US, pp. 73-78.
- (1996) Proc. DARPA Speech Recognition Workshop , pp. 73-78
- Woodland, P.C.¹ Gales, M.J.F.² Pye, D.³ Young, S.J.⁴

18
- 84903171411
- The Kaldi speech recognition toolkit. http://kaldi.sourceforge.net
- The Kaldi Speech Recognition Toolkit

19
- 0003871508
- PhD Thesis, Johns Hopkins University
- N. Kumar (1997). Investigation of Silicon-Auditory Models and Generalization of Linear Discriminant Analysis for Improved Speech Recognition, PhD Thesis, Johns Hopkins University.
- (1997) Investigation of Silicon-Auditory Models and Generalization of Linear Discriminant Analysis for Improved Speech Recognition
- Kumar, N.¹

20
- 0141703325
- Automatic complexity control for HLDA systems
- Hong Kong, China
- X. Liu, M. J. F. Gales, and P. C. Woodland (2003). "Automatic complexity control for HLDA systems", in Proc. IEEE ICASSP, Hong Kong, China, vol. 1, pp. 132-135.
- (2003) Proc. IEEE ICASSP , vol.1 , pp. 132-135
- Liu, X.¹ Gales, M.J.F.² Woodland, P.C.³

21
- 0002144369
- Tree-based state tying for high accuracy acoustic modeling
- Morgan Kaufmann
- S. J. Young, J. J. Odell, and P. C. Woodland (1994). "Tree-based State Tying for High Accuracy Acoustic Modeling", in Proc. ARPA Human Language Technology Workshop, Morgan Kaufmann, 1994, pp. 307-312.
- (1994) Proc. ARPA Human Language Technology Workshop , pp. 307-312
- Young, S.J.¹ Odell, J.J.² Woodland, P.C.³

22
- 80051623316
- Investigation of acoustic units for LVCSR systems
- Prague, Czech Republic
- X. Liu, M. J. F. Gales, J. L. Hieronymus and P. C. Woodland (2011). "Investigation of acoustic units for LVCSR systems", in Proc. IEEE ICASSP, Prague, Czech Republic, pp. 4872-4875.
- (2011) Proc. IEEE ICASSP , pp. 4872-4875
- Liu, X.¹ Gales, M.J.F.² Hieronymus, J.L.³ Woodland, P.C.⁴

23
- 0036296863
- Minimum phone error and I-smoothing for improved discriminative training
- Orlando, FL 2002
- D. Povey and P. C. Woodland (2002). "Minimum phone error and I-smoothing for improved discriminative training", in Proc. IEEE ICASSP, Orlando, FL, 2002, vol. 1, pp. 105-108.
- (2002) Proc. IEEE ICASSP , vol.1 , pp. 105-108
- Povey, D.¹ Woodland, P.C.²

24
- 0032050110
- Maximum likelihood linear transformations for HMM-based speech recognition
- M. J. F. Gales (1998). "Maximum likelihood linear transformations for HMM-based speech recognition," Computer Speech and Language, 12 (2): 75-98, 1998.
- (1998) Computer Speech and Language , vol.12 , Issue.2 , pp. 75-98
- Gales, M.J.F.¹

25
- 0029288633
- Maximum likelihood linear regression for speaker adaptation of continuous density HMMs
- C. J. Leggetter and P. C. Woodland (1995). "Maximum likelihood linear regression for speaker adaptation of continuous density HMMs", Computer Speech and Language, 9 (2): 171-185, 1995.
- (1995) Computer Speech and Language , vol.9 , Issue.2 , pp. 171-185
- Leggetter, C.J.¹ Woodland, P.C.²

26
- 60749097551
- S. Young, G. Evermann, M. J. F. Gales, T. Hain, D. Kershaw, X. Liu, G. Moore, J. Odell, D. Ollason, D. Povey, V. Valtchev and P. C. Woodland. "The HTK Book Version 3.4.1", 2009.
- (2009) The HTK Book Version 3.4.1
- Young, S.¹ Evermann, G.² Gales, M.J.F.³ Hain, T.⁴ Kershaw, D.⁵ Liu, X.⁶ Moore, G.⁷ Odell, J.⁸ Ollason, D.⁹ Povey, D.¹⁰ Valtchev, V.¹¹ Woodland, P.C.¹²

27
- 34547548235
- Probabilistic and bottle-neck features for LVCSR of meetings
- Honolulu, HI
- F. Grezl, M. Karafiat, S. Kontar and J. Cernocky, "Probabilistic and bottle-neck features for LVCSR of meetings", in Proc. IEEE ICASSP, Honolulu, HI, 2007, vol. 4, pp. 757-760.
- (2007) Proc. IEEE ICASSP , vol.4 , pp. 757-760
- Grezl, F.¹ Karafiat, M.² Kontar, S.³ Cernocky, J.⁴

28
- 84890492591
- Revisiting hybrid and GMM-HMM system combination techniques
- Vancouver, Canada
- P. Swietojanski, A. Ghoshal, and S. Renals (2013). "Revisiting hybrid and GMM-HMM system combination techniques," in IEEE ICASSP, Vancouver, Canada, 2013, pp. 6744-6748.
- (2013) IEEE ICASSP , pp. 6744-6748
- Swietojanski, P.¹ Ghoshal, A.² Renals, S.³

29
- 84905265980
- Joint training of convolutional and non-convolutional neural networks
- Florence, Italy
- H. Soltau, G. Saon, and T. N. Sainath (2014). "Joint training of convolutional and non-convolutional neural networks," in IEEE ICASSP, Florence, Italy, 2014, pp. 5572-5576.
- (2014) IEEE ICASSP , pp. 5572-5576
- Soltau, H.¹ Saon, G.² Sainath, T.N.³

30
- 0032638856
- Semi-tied covariance matrices for hidden Markov models
- M. J. F. Gales (1999). "Semi-tied Covariance Matrices for Hidden Markov Models", IEEE Transactions on Speech and Audio Processing, pp. 272-281, vol. 7, 1999.
- (1999) IEEE Transactions on Speech and Audio Processing , vol.7 , pp. 272-281
- Gales, M.J.F.¹

31
- 79959829092
- Recurrent neural network based language model
- Makuhari, Japan
- T. Mikolov, M. Karafiat, L. Burget, J. Cernocky, and S. Khudanpur (2010), "Recurrent neural network based language model," in Proc. ISCA Interspeech, Makuhari, Japan, 2010, pp. 1045-1048.
- (2010) Proc. ISCA Interspeech , pp. 1045-1048
- Mikolov, T.¹ Karafiat, M.² Burget, L.³ Cernocky, J.⁴ Khudanpur, S.⁵

32
- 84910067710
- Efficient GPU-based training of recurrent neural network language models using spliced sentence bunch
- Singapore
- X. Chen, Y. Wang, X. Liu, M. J. F. Gales and P. C. Woodland (2014). "Efficient GPU-based training of recurrent neural network language models using spliced sentence bunch", in Proc. ISCA Interspeech, Singapore, 2014, pp. 641-645.
- (2014) Proc. ISCA Interspeech , pp. 641-645
- Chen, X.¹ Wang, Y.² Liu, X.³ Gales, M.J.F.⁴ Woodland, P.C.⁵

33
- 84906240855
- Prefix tree based n-best list re-scoring for recurrent neural network language model used in speech recognition system
- Lyon, France
- Y. Si, Q. Zhang, T. Li, J. Pan, and Y. Yan (2013), "Prefix tree based n-best list re-scoring for recurrent neural network language model used in speech recognition system," in Proc. ISCA Interspeech, Lyon, France, 2013, pp. 3419-3423.
- (2013) Proc. ISCA Interspeech , pp. 3419-3423
- Si, Y.¹ Zhang, Q.² Li, T.³ Pan, J.⁴ Yan, Y.⁵

34
- 0034296009
- Finding consensus in speech recognition: Word error minimization and other applications of confusion networks
- L. Mangu, E. Brill, and A. Stolcke (2000). "Finding consensus in speech recognition: word error minimization and other applications of confusion networks," Computer Speech and Language, 14 (4): 373-400, 2000.
- (2000) Computer Speech and Language , vol.14 , Issue.4 , pp. 373-400
- Mangu, L.¹ Brill, E.² Stolcke, A.³

35
- 84905240726
- Efficient lattice rescoring using recurrent neural network language models
- Florence, Italy
- X. Liu, Y. Wang, X. Chen, M. J. F. Gales, and P. C. Woodland (2014), "Efficient lattice rescoring using recurrent neural network language models," in Proc. IEEE ICASSP, Florence, Italy, 2014, pp. 4941-4945.
- (2014) Proc. IEEE ICASSP , pp. 4941-4945
- Liu, X.¹ Wang, Y.² Chen, X.³ Gales, M.J.F.⁴ Woodland, P.C.⁵

36
- 84867332205
- Use of contexts in language model interpolation and adaptation
- January 2013
- X. Liu, M. J. F. Gales, and P. C. Woodland (2013), "Use of contexts in language model interpolation and adaptation," Computer Speech and Language, vol. 27, no. 1, pp. 301-321, January 2013.
- (2013) Computer Speech and Language , vol.27 , Issue.1 , pp. 301-321
- Liu, X.¹ Gales, M.J.F.² Woodland, P.C.³

37
- 84875943582
- Language model cross adaptation for LVCSR system combination
- June 2013
- X. Liu, M. J. F. Gales & P. C. Woodland (2013). "Language model cross adaptation for LVCSR system combination", Computer Speech and Language, vol. 27, no. 4, pp. 928-942, June 2013.
- (2013) Computer Speech and Language , vol.27 , Issue.4 , pp. 928-942
- Liu, X.¹ Gales, M.J.F.² Woodland, P.C.³

38
- 84959136701
- Palisades, NY, Rich Transcription Workshop 2004
- P. C. Woodland et al. (2004). SuperEARS: Multi-site Broadcast News System, Rich Transcription Workshop 2004, Palisades, NY.
- (2004) SuperEARS: Multi-site Broadcast News System
- Woodland, P.C.¹

39
- 4544354321
- Speech recognition in multiple languages and domains: The 2003 BBN/LIMSI EARS system
- Montreal, Canada
- R. Schwartz et al. (2004). Speech Recognition in Multiple Languages and Domains: The 2003 BBN/LIMSI EARS System, in Proc. IEEE ICASSP, Montreal, Canada, 2004, vol. 3, pp. 753-756.
- (2004) Proc. IEEE ICASSP , vol.3 , pp. 753-756
- Schwartz, R.¹

40
- 78049384511
- The 2009 IBM GALE Mandarin broadcast transcription system
- Dallas, TX 2010
- S. M. Chu et al. (2010). "The 2009 IBM GALE Mandarin Broadcast Transcription System," in Proc. IEEE ICASSP, Dallas, TX, 2010, pp. 4374-4377.
- (2010) Proc. IEEE ICASSP , pp. 4374-4377
- Chu, S.M.¹

41
- 0028996852
- The 1994 HTK Large vocabulary speech recognition system
- Detroit, MI
- P. C. Woodland, C. J. Leggetter, J. J. Odell, V. Valtchev, and S. J. Young (1995). "The 1994 HTK Large Vocabulary Speech Recognition System," in Proc. IEEE ICASSP, Detroit, MI, pp. 73-76.
- (1995) Proc. IEEE ICASSP , pp. 73-76
- Woodland, P.C.¹ Leggetter, C.J.² Odell, J.J.³ Valtchev, V.⁴ Young, S.J.⁵

42
- 33745208455
- The 2004 BBN/LIMSI 20xRT English conversational telephone speech recognition system
- Lisboa, Portugal 2005
- R. Prasad et al. (2005). "The 2004 BBN/LIMSI 20xRT English conversational telephone speech recognition system," in Proc. ISCA Interspeech, Lisboa, Portugal, 2005, pp. 1645-1648.
- (2005) Proc. ISCA Interspeech , pp. 1645-1648
- Prasad, R.¹

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.