SCOPUS Information Retrieval Platform

Computer Speech and Language

Volume 25, Issue 3, 2011, Pages 519-534

The efficient incorporation of MLP features into automatic speech recognition systems

Authors (5): Park, J. (a); Diehl, F. (a); Gales, M.J.F. (a); Tomalin, M. (a); Woodland, P.C. (a)

(a) University of Cambridge (United Kingdom)

Author keywords

Acoustic modelling; Automatic speech recognition; MLP feature; Speaker adaptation

Indexed keywords

ACOUSTIC ENVIRONMENT; ACOUSTIC FEATURES; ACOUSTIC MODEL; ACOUSTIC MODELLING; ADAPTATION SCHEME; AUTOMATIC SPEECH RECOGNITION; AUTOMATIC SPEECH RECOGNITION SYSTEM; BROADCAST CONVERSATION; BROADCAST NEWS; CONSISTENT PERFORMANCE; DESIGN DECISIONS; DISCRIMINATIVE TRAINING; LARGE AMOUNTS OF DATA; LARGE VOCABULARY SPEECH RECOGNITION; MLP FEATURE; MULTI LAYER PERCEPTRON; MULTI-PASS; NETWORK ADAPTATION; PERFORMANCE GAIN; SPEAKER ADAPTATION; SPEECH RECOGNITION SYSTEMS; SPEED-UPS; SUB-SYSTEMS; TEST DATA; TRAINING CORPUS;

FEATURE EXTRACTION; STANDARDS;

SPEECH RECOGNITION;

EID: 79251574977 PISSN: 08852308 EISSN: 10958363 Source Type: Journal
DOI: 10.1016/j.csl.2010.07.005 Document Type: Article

Times cited : (24)

References (43)

1
- 53049096459
- LP-TRAP: Linear predictive temporal patterns
- M. Athineos, H. Hermansky, and D.P.W. Ellis LP-TRAP: linear predictive temporal patterns Proc. of ICSLP 2004
- (2004) Proc. of ICSLP
- Athineos, M.¹ Hermansky, H.² Ellis, D.P.W.³

2
- 0003573244
- Kluwer Academic Publishers
- H. Bourlard, and N. Morgan Connectionist Speech Recognition - A Hybrid Approach 1994 Kluwer Academic Publishers
- (1994) Connectionist Speech Recognition - A Hybrid Approach
- Bourlard, H.¹ Morgan, N.²

3
- 0842269614
- Version 1.0. Linguistic Data Consortium, University of Pennsylvania
- Buckwalter, T.; 2002. Buckwalter Arabic Morphological Analyzer Version 1.0. Linguistic Data Consortium, University of Pennsylvania.
- (2002) Buckwalter Arabic Morphological Analyzer
- Buckwalter, T.¹

4
- 0030351194
- Boosting the performance of connectionist large vocabulary speech recognition
- G. Cook, and A. Robinson Boosting the performance of connectionist large vocabulary speech recognition Proc. ICSLP 1996
- (1996) Proc. ICSLP
- Cook, G.¹ Robinson, A.²

5
- 79952617990
- David, J.; 2004. ICSI QuickNet Software Package. http://www.icsi.berkeley.edu/Speech/qn.html.
- (2004)
- David, J.¹

6
- 51449111857
- Phonetic pronunciations for Arabic speech-to-text systems
- F. Diehl, M.J.F. Gales, M. Tomalin, and P.C. Woodland Phonetic pronunciations for Arabic speech-to-text systems Proc. of ICASSP 2008
- (2008) Proc. of ICASSP
- Diehl, F.¹ Gales, M.J.F.² Tomalin, M.³ Woodland, P.C.⁴

7
- 33645760470
- Training LVCSR systems on thousands of hours of data
- G. Evermann, H.Y. Chan, M.J.F. Gales, B. Jia, D. Mrva, P.C. Woodland, and K. Yu Training LVCSR systems on thousands of hours of data. Proc. ICASSP 2005 209 212
- (2005) Proc. ICASSP , pp. 209-212
- Evermann, G.¹ Chan, H.Y.² Gales, M.J.F.³ Jia, B.⁴ Mrva, D.⁵ Woodland, P.C.⁶ Yu, K.⁷

8
- 0033676943
- Large vocabulary decoding and confidence estimation using word posterior probabilities
- G. Evermann, and P.C. Woodland Large vocabulary decoding and confidence estimation using word posterior probabilities Proc. ICASSP 2000 2366 2369
- (2000) Proc. ICASSP , pp. 2366-2369
- Evermann, G.¹ Woodland, P.C.²

9
- 4544253834
- Posterior probability decoding, confidence estimation and system combination
- Evermann, G.; Woodland, P. C.; 2000b. Posterior probability decoding, confidence estimation and system combination. In: Proc. of Speech Transcription Workshop.
- (2000) Proc. of Speech Transcription Workshop
- Evermann, G.¹ Woodland, P.C.²

10
- 84946728861
- Design of fast LVCSR systems
- G. Evermann, and P.C. Woodland Design of fast LVCSR systems Proc. ASRU 2003 7 12
- (2003) Proc. ASRU , pp. 7-12
- Evermann, G.¹ Woodland, P.C.²

11
- 0030638031
- A post-processing system to yield reduced word error rates: Recognizer output voting error reduction (ROVER)
- J.G. Fiscus A post-processing system to yield reduced word error rates: recognizer output voting error reduction (ROVER) Proc. IEEE Workshop: Automatic Speech Recognition and Understanding 1997 347 354
- (1997) Proc. IEEE Workshop: Automatic Speech Recognition and Understanding , pp. 347-354
- Fiscus, J.G.¹

12
- 85065669950
- Nonlinear discriminant analysis for improved speech recognition
- V. Fontaine, C. Ris, and J.M. Boite Nonlinear discriminant analysis for improved speech recognition Proc. EUROSPEECH 1997
- (1997) Proc. EUROSPEECH
- Fontaine, V.¹ Ris, C.² Boite, J.M.³

13
- 53049104569
- On the use of MLP features for broadcast news transcription
- Springer Verlag
- Fousek, P.; Lamel, L.; Gauvain, J.-L.; 2008a. On the use of MLP features for broadcast news transcription. In: Lecture Notes in Computer Science. Springer Verlag, pp. 303-310.
- (2008) Lecture Notes in Computer Science , pp. 303-310
- Fousek, P.¹ Lamel, L.² Gauvain, J.-L.³

14
- 84867209138
- Transcribing broadcast data using MLP features
- P. Fousek, L. Lamel, and J.-L. Gauvain Transcribing broadcast data using MLP features Proc. of Interspeech 2008 1433 1436
- (2008) Proc. of Interspeech , pp. 1433-1436
- Fousek, P.¹ Lamel, L.² Gauvain, J.-L.³

15
- 51449119901
- Ph.D. Thesis. Czech Technical University in Prague, Faculty of Electrical Engineering, Prague
- Fousek, P.; March 2007. Extraction of Features for Automatic Recognition of Speech Based on Spectral Dynamics. Ph.D. Thesis. Czech Technical University in Prague, Faculty of Electrical Engineering, Prague.
- (2007) Extraction of Features for Automatic Recognition of Speech Based on Spectral Dynamics
- Fousek, P.¹

16
- 40249114845
- Transfer learning for tandem ASR feature extraction
- DOI 10.1007/978-3-540-78155-4-20, Machine Learning for Multimodal Interaction - 4th International Workshop, MLMI 2007, Revised Selected Papers
- J. Frankel, Ö. Çetin, and N. Morgan Transfer learning for tandem ASR feature extraction Lecture Notes in Computer Science 4892 2008 227 236 (Pubitemid 351333603)
- (2008) Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) , vol.4892 , pp. 227-236
- Frankel, J.¹ Cetin, O.² Morgan, N.³

17
- 44849119873
- Development of a phonetic system for large vocabulary Arabic speech recognition
- M.J.F. Gales, F. Diehl, C.K. Raut, M. Tomalin, P.C. Woodland, and K. Yu Development of a phonetic system for large vocabulary Arabic speech recognition Proc. of ASRU 2007 24 29
- (2007) Proc. of ASRU , pp. 24-29
- Gales, M.J.F.¹ Diehl, F.² Raut, C.K.³ Tomalin, M.⁴ Woodland, P.C.⁵ Yu, K.⁶

18
- 34047266379
- Progress in the CU-HTK broadcast news transcription system
- DOI 10.1109/TASL.2006.878264
- M.J.F. Gales, D.Y. Kim, P.C. Woodland, H.Y. Chan, D. Mrva, R. Sinha, and S.E. Tranter Progress in the CU-HTK broadcast news transcription system IEEE Transactions on Audio Speech and Language Processing 14 2006 1513 1525 (Pubitemid 46547578)
- (2006) IEEE Transactions on Audio, Speech and Language Processing , vol.14 , Issue.5 , pp. 1513-1525
- Gales, M.J.F.¹ Kim, D.Y.² Woodland, P.C.³ Chan, H.Y.⁴ Mrva, D.⁵ Sinha, R.⁶ Tranter, S.E.⁷

19
- 0030263447
- Mean and variance adaptation within the MLLR framework
- DOI 10.1006/csla.1996.0013
- M.J.F. Gales, and P.C. Woodland Mean and variance adaptation within the MLLR framework Computer Speech & Language 10 1996 249 264 (Pubitemid 126374488)
- (1996) Computer Speech and Language , vol.10 , Issue.4 , pp. 249-264
- Gales, M.J.F.¹ Woodland, P.C.²

20
- 0003671941
- Ph.D. Thesis. University of Cambridge
- Gales, M.J.F.; 1995. Model-based Techniques for Noise Robust Speech Recognition. Ph.D. Thesis. University of Cambridge.
- (1995) Model-based Techniques for Noise Robust Speech Recognition
- Gales, M.J.F.¹

21
- 0032050110
- Maximum likelihood linear transformations for HMM-based speech recognition
- M.J.F. Gales Maximum likelihood linear transformations for HMM-based speech recognition Computer Speech & Language 12 1998 75 98 (Pubitemid 128383747)
- (1998) Computer Speech and Language , vol.12 , Issue.2 , pp. 75-98
- Gales, M.J.F.¹

22
- 0032638856
- Semi-tied covariance matrices for hidden Markov models
- M.J.F. Gales Semi-tied covariance matrices for hidden Markov models IEEE Transactions on Speech and Audio Processing 7 1999 272 281
- (1999) IEEE Transactions on Speech and Audio Processing , vol.7 , pp. 272-281
- Gales, M.J.F.¹

23
- 0031640333
- Linear Input Network based speaker adaptation in the dialogos system
- R. Gemello, F. Mana, and D. Albesano Linear Input Network based speaker adaptation in the dialogos system Proc. of IJCNN 1998 2190 2195
- (1998) Proc. of IJCNN , pp. 2190-2195
- Gemello, R.¹ Mana, F.² Albesano, D.³

24
- 0024909979
- Some statistical issues in the comparison of speech recognition algorithms
- Gillick, L.; Cox, S.J.; 1989. Some statistical issues in the comparison of speech recognition algorithms. In: Proc. ICASSP, pp. 532-535. (Pubitemid 20604171)
- (1989) ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings , vol.1 , pp. 532-535
- Gillick, L.¹ Cox, S.J.²

25
- 34547548235
- Probabilistic and bottle-neck features for LVCSR of meetings
- F. Grézl, M. Karafiát, S. Kontár, and J. Černocký Probabilistic and bottle-neck features for LVCSR of meetings Proc. of ICASSP 2007 757 760
- (2007) Proc. of ICASSP , pp. 757-760
- Grézl, F.¹ Karafiát, M.² Kontár, S.³ Černocký, J.⁴

26
- 0033709098
- Tandem connectionist feature extraction for conventional HMM systems
- H. Hermansky, D.P.W. Ellis, and S. Sharma Tandem connectionist feature extraction for conventional HMM systems Proc. of ICASSP 2000
- (2000) Proc. of ICASSP
- Hermansky, H.¹ Ellis, D.P.W.² Sharma, S.³

27
- 27144439262
- Data-derived nonlinear mapping for feature extraction in HMM
- H. Hermansky, S. Sharma, and P. Jain Data-derived nonlinear mapping for feature extraction in HMM Proc. ASRU 1999
- (1999) Proc. ASRU
- Hermansky, H.¹ Sharma, S.² Jain, P.³

28
- 0025041264
- Perceptual linear predictive (PLP) analysis of speech
- DOI 10.1121/1.399423
- H. Hermansky Perceptual linear predictive (PLP) analysis of speech The Journal of the Acoustical Society of America 87 April 1990 1738 1752 (Pubitemid 20256470)
- (1990) Journal of the Acoustical Society of America , vol.87 , Issue.4 , pp. 1738-1752
- Hermansky, H.¹

29
- 77249135915
- Robust heteroscedastic linear discriminant analysis and LCRC posterior features in meeting data recognition
- Springer Verlag
- Karafiát, M.; Grézl, F.; Schwarz, P.; Burget, L.; Černocký, J.; 2006. Robust heteroscedastic linear discriminant analysis and LCRC posterior features in meeting data recognition. In: Lecture Notes in Computer Science. Springer Verlag, pp. 275-284.
- (2006) Lecture Notes in Computer Science , pp. 275-284
- Karafiát, M.¹ Grézl, F.² Schwarz, P.³ Burget, L.⁴ Černocký, J.⁵

30
- 0003871508
- Ph.D. Thesis. Johns Hopkins University
- Kumar, N.; 1997. Investigation of Silicon-auditory Models and Generalization of Linear Discriminant Analysis for Improved Speech Recognition. Ph.D. Thesis. Johns Hopkins University.
- (1997) Investigation of Silicon-auditory Models and Generalization of Linear Discriminant Analysis for Improved Speech Recognition
- Kumar, N.¹

31
- 0029288633
- Maximum likelihood linear regression for speaker adaptation of continuous density hidden Markov models
- C.J. Leggetter, and P.C. Woodland Maximum likelihood linear regression for speaker adaptation of continuous density hidden Markov models Computer Speech & Language 9 1995 171 185
- (1995) Computer Speech & Language , vol.9 , pp. 171-185
- Leggetter, C.J.¹ Woodland, P.C.²

32
- 79952620572
- Linguistic Data Consortium
- Linguistic Data Consortium, 2010. Global Autonomous Language Exploitation (GALE). http://projects.ldc.upenn.edu/gale/.
- (2010) Global Autonomous Language Exploitation (GALE)

33
- 84923190448
- Combination of acoustic models in continuous speech recognition hybrid systems
- H. Meinedo, and J.P. Neto Combination of acoustic models in continuous speech recognition hybrid systems Proc. ICSLP 2000 931 934
- (2000) Proc. ICSLP , pp. 931-934
- Meinedo, H.¹ Neto, J.P.²

34
- 0025680226
- Tools for the analysis of benchmark speech recognition tests
- D.S. Pallet, W.M. Fisher, and J.G. Fiscus Tools for the analysis of benchmark speech recognition tests Proc. ICASSP 1990 97 100
- (1990) Proc. ICASSP , pp. 97-100
- Pallet, D.S.¹ Fisher, W.M.² Fiscus, J.G.³

35
- 70349224630
- Training and adapting MLP features for Arabic speech recognition
- J. Park, F. Diehl, M.J.F. Gales, M. Tomalin, and P.C. Woodland Training and adapting MLP features for Arabic speech recognition Proc. of ICASSP 2009 4461 4464
- (2009) Proc. of ICASSP , pp. 4461-4464
- Park, J.¹ Diehl, F.² Gales, M.J.F.³ Tomalin, M.⁴ Woodland, P.C.⁵

36
- 0141480019
- Discriminative MAP for acoustic model adaptation
- D. Povey, P.C. Woodland, and M.J.F. Gales Discriminative MAP for acoustic model adaptation Proc. ICASSP 2003 312 315
- (2003) Proc. ICASSP , pp. 312-315
- Povey, D.¹ Woodland, P.C.² Gales, M.J.F.³

37
- 0036296863
- Minimum Phone Error and I-smoothing for improved discriminative training
- D. Povey, and P.C. Woodland Minimum Phone Error and I-smoothing for improved discriminative training Proc. ICASSP 2002
- (2002) Proc. ICASSP
- Povey, D.¹ Woodland, P.C.²

38
- 4544265717
- Ph.D. Thesis. University of Cambridge
- Povey, D.; 2004. Discriminative Training for Large Vocabulary Speech Recognition. Ph.D. Thesis. University of Cambridge.
- (2004) Discriminative Training for Large Vocabulary Speech Recognition
- Povey, D.¹

39
- 4544250610
- Speaker adaptation using lattice-based MLLR
- L.F. Uebel, and P.C. Woodland Speaker adaptation using lattice-based MLLR Proc. ITRW on Adaptation Methods for Speech Recognition 2001
- (2001) Proc. ITRW on Adaptation Methods for Speech Recognition
- Uebel, L.F.¹ Woodland, P.C.²

40
- 0036461035
- Large scale discriminative training of hidden Markov models for speech recognition
- P.C. Woodland, and D. Povey Large scale discriminative training of hidden Markov models for speech recognition Computer Speech & Language 16 2002 25 47
- (2002) Computer Speech & Language , vol.16 , pp. 25-47
- Woodland, P.C.¹ Povey, D.²

41
- 0003822743
- Version 3.4.1. Cambridge University Engineering Department, Cambridge, UK
- Young, S. J.; Evermann, G.; Gales, M. J. F.; Hain, T.; Kershaw, D.; Liu, X.; Moore, G.; Odell, J.; Ollason, D.; Povey, D.; Valtchev, V.; Woodland, P. C.; 2009. The HTK Book, Version 3.4.1. Cambridge University Engineering Department, Cambridge, UK. http://htk.eng.cam.ac.uk/.
- (2009) The HTK Book
- Young, S.J.¹ Evermann, G.² Gales, M.J.F.³ Hain, T.⁴ Kershaw, D.⁵ Liu, X.⁶ Moore, G.⁷ Odell, J.⁸ Ollason, D.⁹ Povey, D.¹⁰ Valtchev, V.¹¹ Woodland, P.C.¹²

42
- 85009097225
- On using MLP features in LVCSR
- Q. Zhu, B. Chen, N. Morgan, and A. Stolcke On using MLP features in LVCSR Proc. ICSLP 2004 921 924
- (2004) Proc. ICSLP , pp. 921-924
- Zhu, Q.¹ Chen, B.² Morgan, N.³ Stolcke, A.⁴

43
- 33745185321
- Using MLP features in SRI's conversational speech recognition system
- 9th European Conference on Speech Communication and Technology, Eurospeech Interspeech
- Q. Zhu, A. Stolcke, B.Y. Chen, and N. Morgan Using MLP features in SRI's conversational speech recognition system Proc. of Interspeech 2005 2141 2144 (Pubitemid 43908517)
- (2005) 9th European Conference on Speech Communication and Technology , pp. 2141-2144
- Zhu, Q.¹ Stolcke, A.² Chen, B.Y.³ Morgan, N.⁴

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.