SCOPUS Information Search Platform

IEEE Transactions on Speech and Audio Processing

Volume 13, Issue 6, 2005, Pages 1173-1185

Automatic transcription of conversational telephone speech

Authors (8): Hain, Thomas a,b,c; Woodland, Philip C a,c; Evermann, Gunnar a,c,d; Gales, Mark J F a,c; Liu, Xunying a,c,d,e; Moore, Gareth L a,c,f; Povey, Dan a,c; Wang, Lan a,c,g,h

a UNIVERSITY OF SHEFFIELD (United Kingdom)

b VIENNA UNIVERSITY OF TECHNOLOGY (Austria)

c UNIVERSITY OF CAMBRIDGE (United Kingdom)

d UNIVERSITY OF HAMBURG (Germany)

e SHANGHAI JIAO TONG UNIVERSITY (China)

f UNIVERSITY OF WARWICK (United Kingdom)

g BEIJING INSTITUTE OF TECHNOLOGY (China)

h PEKING UNIVERSITY (China)

Author keywords

Large vocabulary conversational speech recognition; Telephone speech recognition

Indexed keywords

ACOUSTIC NOISE; AUTOMATION; COMPUTER SIMULATION; DATA ACQUISITION; INTERPOLATION; SPEECH ANALYSIS; TELEPHONE SYSTEMS;

LARGE VOCABULARY CONVERSATIONAL SPEECH RECOGNITION; LINEAR ANALYSIS; TELEPHONE SPEECH; TELEPHONE SPEECH RECOGNITION;

SPEECH RECOGNITION;

EID: 27744599401
PISSN: 10636676
EISSN: None
Source Type: Journal
DOI: 10.1109/TSA.2005.852999
Document Type: Article

Times cited (17)

References (37)

1
- 0030362995
- A compact model for speaker-adaptive training
- T. Anastasakos, J. McDonough, R. Schwartz, and J. Makhoul, "A compact model for speaker-adaptive training," in Proc. ICSLP, 1996, pp. 1137-1140.
- (1996) Proc. ICSLP , pp. 1137-1140
- Anastasakos, T.¹ McDonough, J.² Schwartz, R.³ Makhoul, J.⁴

2
- 0003396042
- An empirical study of smoothing techniques for language modeling
- Computer Science Group, Harvard Univ., Cambridge, MA
- S. F. Chen and J. Goodman, "An Empirical Study of Smoothing Techniques for Language Modeling," Computer Science Group, Harvard Univ., Cambridge, MA, Tech. Rep. TR-10-98, 1998.
- (1998) Tech. Rep. , vol.TR-10-98
- Chen, S.F.¹ Goodman, J.²

3
- 4544253834
- Posterior probability decoding, confidence estimation and system combination
- College Park, MD
- G. Evermann and P. C. Woodland, "Posterior probability decoding, confidence estimation and system combination," in Proc. Speech Transcription Workshop, College Park, MD, 2000.
- (2000) Proc. Speech Transcription Workshop
- Evermann, G.¹ Woodland, P.C.²

4
- 0030638031
- A post-processing system to yield reduced word error rates: Recognizer Output Voting Error Reduction (ROVER)
- Santa Barbara, CA
- J. G. Fiscus, "A post-processing system to yield reduced word error rates: Recognizer Output Voting Error Reduction (ROVER)," in Proc. IEEE ASRU Workshop, Santa Barbara, CA, 1997, pp. 347-354.
- (1997) Proc. IEEE ASRU Workshop , pp. 347-354
- Fiscus, J.G.¹

5
- 0030263447
- Mean and variance adaptation within the MLLR framework
- M. J. F. Gales and P. C. Woodland, "Mean and variance adaptation within the MLLR framework," Comput. Speech Lang., vol. 10, pp. 249-264, 1996.
- (1996) Comput. Speech Lang. , vol.10 , pp. 249-264
- Gales, M.J.F.¹ Woodland, P.C.²

6
- 0032050110
- Maximum likelihood linear transformations for HMM-based speech recognition
- M. J. F. Gales, "Maximum likelihood linear transformations for HMM-based speech recognition," Comput. Speech Lang., vol. 12, pp. 75-98, 1998.
- (1998) Comput. Speech Lang. , vol.12 , pp. 75-98
- Gales, M.J.F.¹

7
- 0032638856
- Semi-tied covariance matrices for hidden Markov models
- M. J. F. Gales, "Semi-tied covariance matrices for hidden Markov models," IEEE Trans. Speech Audio Processing, vol. 7, pp. 272-281, 1999.
- (1999) IEEE Trans. Speech Audio Processing , vol.7 , pp. 272-281

8
- 0003245997
- The LIMSI Nov93 WSJ System
- Plainsboro, NJ
- J.-L. Gauvain, L. F. Lamel, G. Adda, and M. Adda-Decker, "The LIMSI Nov93 WSJ System," in Proc. SLT'94, Plainsboro, NJ, 1994, pp. 125-128.
- (1994) Proc. SLT'94 , pp. 125-128
- Gauvain, J.-L.¹ Lamel, L.F.² Adda, G.³ Adda-Decker, M.⁴

9
- 0027311604
- Application of large vocabulary continuous speech recognition to topic and speaker identification using telephone speech
- L. Gillick, J. Baker, J. Baker, J. Bridle, M. Hunt, Y. Ito, S. Lowe, J. Orloff, B. Peskin, R. Roth, and F. Scattone, "Application of large vocabulary continuous speech recognition to topic and speaker identification using telephone speech," in Proc. ICASSP'93, 1993, pp. 471-474.
- (1993) Proc. ICASSP'93 , pp. 471-474
- Gillick, L.¹ Baker, J.² Baker, J.³ Bridle, J.⁴ Hunt, M.⁵ Ito, Y.⁶ Lowe, S.⁷ Orloff, J.⁸ Peskin, B.⁹ Roth, R.¹⁰ Scattone, F.¹¹

10
- 85016587886
- SWITCHBOARD: Telephone speech corpus for research and development
- J. J. Godfrey, E. C. Holliman, and J. McDaniel, "SWITCHBOARD: Telephone speech corpus for research and development," in Proc. ICASSP'92, 1992, pp. 517-520.
- (1992) Proc. ICASSP'92 , pp. 517-520
- Godfrey, J.J.¹ Holliman, E.C.² McDaniel, J.³

11
- 0025952278
- An inequality for rational functions with applications to some statistical estimation problems
- P. S. Gopalakrishnan, D. Kanevsky, A. Nadas, and D. Nahamoo, "An inequality for rational functions with applications to some statistical estimation problems," IEEE Trans. Inform. Theory, vol. 37, pp. 107-113, 1991.
- (1991) IEEE Trans. Inform. Theory , vol.37 , pp. 107-113
- Gopalakrishnan, P.S.¹ Kanevsky, D.² Nadas, A.³ Nahamoo, D.⁴

12
- 85153381142
- CU-HTK acoustic modeling experiments
- Linthicum Heights, MD
- T. Hain and P. C. Woodland, "CU-HTK acoustic modeling experiments," in Proc. NIST Hub5 Workshop, Linthicum Heights, MD, 1998.
- (1998) Proc. NIST Hub5 Workshop
- Hain, T.¹ Woodland, P.C.²

13
- 0034847002
- The 1998 HTK system for transcription of conversational telephone speech
- T. Hain, P. C. Woodland, T. R. Niesler, and E. W. D. Whittaker, "The 1998 HTK system for transcription of conversational telephone speech," in Proc. ICASSP'99, 1999, pp. 57-60.
- (1999) Proc. ICASSP'99 , pp. 57-60
- Hain, T.¹ Woodland, P.C.² Niesler, T.R.³ Whittaker, E.W.D.⁴

14
- 0012236195
- The CU-HTK March 2000 Hub5E transcription system
- College Park, MD
- T. Hain, P. C. Woodland, G. Evermann, and D. Povey, "The CU-HTK March 2000 Hub5E transcription system," in Proc. Speech Transcription Workshop, College Park, MD, 2000.
- (2000) Proc. Speech Transcription Workshop
- Hain, T.¹ Woodland, P.C.² Evermann, G.³ Povey, D.⁴

15
- 85153334377
- New features in the CU-HTK system for transcription of conversational telephone speech
- Salt Lake City, UT
- T. Hain, P. C. Woodland, G. Evermann, and D. Povey, "New features in the CU-HTK system for transcription of conversational telephone speech," in Proc. ICASSP'01, Salt Lake City, UT, 2001.
- (2001) Proc. ICASSP'01

16
- 85153373801
- Implicit modeling of pronunciation variation in automatic speech recognition
- to be published
- T. Hain, "Implicit modeling of pronunciation variation in automatic speech recognition," Speech Commun., 2003, to be published.
- (2003) Speech Commun.
- Hain, T.¹

17
- 85123963268
- Improved clustering techniques for class-based statistical language modeling
- Berlin, Germany
- R. Kneser and H. Ney, "Improved clustering techniques for class-based statistical language modeling," in Proc. Eurospeech'93, Berlin, Germany, 1993, pp. 973-976.
- (1993) Proc. Eurospeech'93 , pp. 973-976
- Kneser, R.¹ Ney, H.²

18
- 0003871508
- Ph.D. Thesis, Johns Hopkins Univ., Baltimore, MD
- N. Kumar, "Investigation of Silicon-Auditory Models and Generalization of Linear Discriminant Analysis for Improved Speech Recognition," Ph.D. Thesis, Johns Hopkins Univ., Baltimore, MD, 1997.
- (1997) Investigation of Silicon-auditory Models and Generalization of Linear Discriminant Analysis for Improved Speech Recognition
- Kumar, N.¹

19
- 27744552180
- The 2002 NIST RT evaluation speech-to-text results
- A. Le and A. Martin, "The 2002 NIST RT evaluation speech-to-text results," in Proc. Rich Transcription Workshop 2002, 2002.
- (2002) Proc. Rich Transcription Workshop 2002
- Le, A.¹ Martin, A.²

20
- 0029747183
- Speaker normalization using efficient frequency warping procedures
- L. Lee and R. C. Rose, "Speaker normalization using efficient frequency warping procedures," in Proc. ICASSP'96, 1996, pp. 353-356.
- (1996) Proc. ICASSP'96 , pp. 353-356
- Lee, L.¹ Rose, R.C.²

21
- 0029288633
- Maximum likelihood linear regression for speaker adaptation of continuous density HMMs
- C. J. Leggetter and P. C. Woodland, "Maximum likelihood linear regression for speaker adaptation of continuous density HMMs," Comput. Speech Lang., vol. 9, pp. 171-186, 1995.
- (1995) Comput. Speech Lang. , vol.9 , pp. 171-186
- Leggetter, C.J.¹ Woodland, P.C.²

22
- 0001135471
- Flexible speaker adaptation using maximum likelihood linear regression
- Madrid, Spain
- C. J. Leggetter and P. C. Woodland, "Flexible speaker adaptation using maximum likelihood linear regression," in Proc. Eurospeech'95, Madrid, Spain, 1995, pp. 1155-1158.
- (1995) Proc. Eurospeech'95 , pp. 1155-1158

23
- 85135271674
- Finding consensus among words: Lattice-based word error minimization
- Budapest, Hungary
- L. Mangu, E. Brill, and A. Stolcke, "Finding consensus among words: lattice-based word error minimization," in Proc. Eurospeech'99, Budapest, Hungary, 1999, pp. 495-498.
- (1999) Proc. Eurospeech'99 , pp. 495-498
- Mangu, L.¹ Brill, E.² Stolcke, A.³

24
- 85135152717
- Algorithms for bigram and trigram clustering
- Madrid, Spain
- S. Martin, J. Liermann, and H. Ney, "Algorithms for bigram and trigram clustering," in Proc. Eurospeech'95, Madrid, Spain, 1995, pp. 1253-1256.
- (1995) Proc. Eurospeech'95 , pp. 1253-1256
- Martin, S.¹ Liermann, J.² Ney, H.³

25
- 84902047630
- Single-pass adapted training with all-pass transforms
- Budapest, Hungary
- J. McDonough and W. Byrne, "Single-pass adapted training with all-pass transforms," in Proc. Eurospeech'99, Budapest, Hungary, 1999, pp. 2737-2740.
- (1999) Proc. Eurospeech'99 , pp. 2737-2740
- McDonough, J.¹ Byrne, W.²

26
- 0024076692
- On a model-robust training algorithm for speech recognition
- A. Nadas, D. Nahamoo, and M. A. Picheny, "On a model-robust training algorithm for speech recognition," IEEE Trans. Acoust., Speech, Signal Processing, vol. 36, pp. 1432-1435, 1988.
- (1988) IEEE Trans. Acoust., Speech, Signal Processing , vol.36 , pp. 1432-1435
- Nadas, A.¹ Nahamoo, D.² Picheny, M.A.³

27
- 0031628780
- Comparison of part-of-speech and automatically derived category-based language models for speech recognition
- Seattle, WA
- T. R. Niesler, E. W. D. Whittaker, and P. C. Woodland, "Comparison of part-of-speech and automatically derived category-based language models for speech recognition," in Proc. ICASSP'98, Seattle, WA, 1998, pp. 177-180.
- (1998) Proc. ICASSP'98 , pp. 177-180
- Niesler, T.R.¹ Whittaker, E.W.D.² Woodland, P.C.³

28
- 27744569337
- [Online]
- The NIST Speech Group. (2000) The 2001 NIST Evaluation Plan for Recognition of Conversational Speech Over the Telephone. [Online] Available: www.nist.gov/speech/tests/ctr/h5_2001/h5-01v1.1.pdf
- (2000) The 2001 NIST Evaluation Plan for Recognition of Conversational Speech over the Telephone

29
- 0026372945
- An Improved MMIE training algorithm for speaker independent, small vocabulary, continuous speech recognition
- Toronto, ON, Canada
- Y. Normandin, "An Improved MMIE training algorithm for speaker independent, small vocabulary, continuous speech recognition," in Proc. ICASSP'91, Toronto, ON, Canada, 1991, pp. 537-540.
- (1991) Proc. ICASSP'91 , pp. 537-540
- Normandin, Y.¹

30
- 0001889147
- A one pass decoder design for large vocabulary recognition
- J. J. Odell, V. Valtchev, P. C. Woodland, and S. J. Young, "A one pass decoder design for large vocabulary recognition," in Proc. 1994 ARPA Human Language Technology Workshop, 1994, pp. 405-410.
- (1994) Proc. 1994 ARPA Human Language Technology Workshop , pp. 405-410
- Odell, J.J.¹ Valtchev, V.² Woodland, P.C.³ Young, S.J.⁴

31
- 0036296863
- Minimum phone error and I-smoothing for improved discriminative training
- Orlando, FL
- D. Povey and P. C. Woodland, "Minimum phone error and I-smoothing for improved discriminative training," in Proc. ICASSP'02, Orlando, FL, 2002.
- (2002) Proc. ICASSP'02
- Povey, D.¹ Woodland, P.C.²

32
- 4544250610
- Speaker adaptation using lattice-based MLLR
- Sophia Antipolis, France
- L. F. Uebel and P. C. Woodland, "Speaker adaptation using lattice-based MLLR," in Proc. ISCA ITRW Adaptation Methods for Speech Recognition, Sophia Antipolis, France, 2001.
- (2001) Proc. ISCA ITRW Adaptation Methods for Speech Recognition
- Uebel, L.F.¹ Woodland, P.C.²

33
- 0030643667
- Broadcast news transcription using HTK
- Munich, Germany
- P. C. Woodland, M. J. F. Gales, D. Pye, and S. J. Young, "Broadcast news transcription using HTK," in Proc. ICASSP'97, Munich, Germany, 1997, pp. 719-722.
- (1997) Proc. ICASSP'97 , pp. 719-722
- Woodland, P.C.¹ Gales, M.J.F.² Pye, D.³ Young, S.J.⁴

34
- 0002867698
- Large scale discriminative training for speech recognition
- Paris, France
- P. C. Woodland and D. Povey, "Large scale discriminative training for speech recognition," in Proc. ISCA ITRW ASR2000, Paris, France, 2000, pp. 7-16.
- (2000) Proc. ISCA ITRW ASR2000 , pp. 7-16
- Woodland, P.C.¹ Povey, D.²

35
- 0036567794
- The development of the HTK broadcast news transcription system: An overview
- P. C. Woodland, "The development of the HTK broadcast news transcription system: an overview," Speech Commun., vol. 37, pp. 47-67, 2002.
- (2002) Speech Commun. , vol.37 , pp. 47-67
- Woodland, P.C.¹

36
- 27744573215
- CU-HTK April 2002 switchboard system
- Vienna, VA
- P. C. Woodland, G. Evermann, M. J. F. Gales, T. Hain, X. L. Liu, G. L. Moore, D. Povey, and L. Wang, "CU-HTK April 2002 switchboard system," in Proc. Rich Transcription Workshop, Vienna, VA, 2002.
- (2002) Proc. Rich Transcription Workshop
- Woodland, P.C.¹ Evermann, G.² Gales, M.J.F.³ Hain, T.⁴ Liu, X.L.⁵ Moore, G.L.⁶ Povey, D.⁷ Wang, L.⁸

37
- 0003822743
- Cambridge, U.K.: Cambridge Univ. Press
- S. J. Young, G. Evermann, T. Hain, D. Kershaw, G. L. Moore, J. J. Odell, D. Ollason, D. Povey, V. Valtchev, and P. C. Woodland, The HTK Book. Cambridge, U.K.: Cambridge Univ. Press, 2003.
- (2003) The HTK Book
- Young, S.J.¹ Evermann, G.² Hain, T.³ Kershaw, D.⁴ Moore, G.L.⁵ Odell, J.J.⁶ Ollason, D.⁷ Povey, D.⁸ Valtchev, V.⁹ Woodland, P.C.¹⁰

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.