SCOPUS Information Search Platform

IEEE Transactions on Audio, Speech and Language Processing

Volume 22, Issue 3, 2014, Pages 711-726

Theoretical analysis of diversity in an ensemble of automatic speech recognition systems

(4) Audhkhasi, Kartik (a); Zavou, Andreas M. (b); Georgiou, Panayiotis G. (a); Narayanan, Shrikanth S. (a)

a University of Southern California ^* (United States)

b Cyprus University of Technology (Cyprus)

Author keywords

Ambiguity decomposition; Automatic speech recognition; Discriminative training; Diversity; Ensemble methods; ROVER; System combination

Indexed keywords

AMBIGUITY DECOMPOSITION; AUTOMATIC SPEECH RECOGNITION; DISCRIMINATIVE TRAINING; DIVERSITY; ENSEMBLE METHODS; ROVER; SYSTEM COMBINATION;

ALGORITHMS; ECONOMIC AND SOCIAL EFFECTS;

SPEECH RECOGNITION;

EID: 84898080333 PISSN: 15587916 EISSN: None Source Type: Journal
DOI: 10.1109/TASLP.2014.2303295 Document Type: Article

Times cited : (17)

References (65)

1
- 84906226217
- Empirical link between hypothesis diversity and fusion performance in an ensemble of automatic speech recognition systems
- K. Audhkhasi, A. M. Zavou, P. G. Georgiou, and S. S. Narayanan, "Empirical link between hypothesis diversity and fusion performance in an ensemble of automatic speech recognition systems," in Proc. Interspeech, 2013.
- Proc. Interspeech, 2013
- Audhkhasi, K.¹ Zavou, A.M.² Georgiou, P.G.³ Narayanan, S.S.⁴

2
- 0004244302
- Englewood Cliffs, NJ, USA: Prentice-Hall
- L. Rabiner and B.-H. Juang, Fundamentals of Speech Recognition. Englewood Cliffs, NJ, USA: Prentice-Hall, 1993.
- (1993) Fundamentals of Speech Recognition
- Rabiner, L.¹ Juang, B.-H.²

3
- 85008006725
- Advances in Arabic speech transcription at IBM under the DARPA GALE program
- Jul.
- H. Soltau, G. Saon, B. Kingsbury, H. K. J. Kuo, L. Mangu, D. Povey, and A. Emami, "Advances in Arabic speech transcription at IBM under the DARPA GALE program," IEEE Trans. Audio, Speech, Lang. Process., vol. 17, no. 5, pp. 884-894, Jul. 2009.
- (2009) IEEE Trans. Audio, Speech, Lang. Process. , vol.17 , Issue.5 , pp. 884-894
- Soltau, H.¹ Saon, G.² Kingsbury, B.³ Kuo, H.K.J.⁴ Mangu, L.⁵ Povey, D.⁶ Emami, A.⁷

4
- 56149114865
- The BBN 2007 Displayless English/Iraqi Speech-to-Speech Translation System
- D. Stallard, F. Choi, C. L. Kao, K. Krstovski, P. Natarajan, R. Prasad, S. Saleem, and K. Subramanian, "The BBN 2007 Displayless English/Iraqi Speech-to-Speech Translation System," in Proc. Interspeech, 2007.
- Proc. Interspeech, 2007
- Stallard, D.¹ Choi, F.² Kao, C.L.³ Krstovski, K.⁴ Natarajan, P.⁵ Prasad, R.⁶ Saleem, S.⁷ Subramanian, K.⁸

5
- 34047266376
- Advances in speech transcription at IBM under the DARPA EARS program
- Sep.
- S. F. Chen, B. Kingsbury, L. Mangu, D. Povey, G. Saon, H. Soltau, and G. Zweig, "Advances in speech transcription at IBM under the DARPA EARS program," IEEE Trans. Audio, Speech, Lang. Process., vol. 14, no. 5, pp. 1596-1608, Sep. 2006.
- (2006) IEEE Trans. Audio, Speech, Lang. Process. , vol.14 , Issue.5 , pp. 1596-1608
- Chen, S.F.¹ Kingsbury, B.² Mangu, L.³ Povey, D.⁴ Saon, G.⁵ Soltau, H.⁶ Zweig, G.⁷

6
- 67649528017
- The CALO meeting speech recognition and understanding system
- G. Tur, A. Stolcke, L. Voss, J. Dowding, B. Favre, R. Fernandez, M. Frampton, M. Frandsen, C. Frederickson, and M. Graciarena, "The CALO meeting speech recognition and understanding system," Proc. IEEE SLT, pp. 69-72, 2008.
- (2008) Proc. IEEE SLT , pp. 69-72
- Tur, G.¹ Stolcke, A.² Voss, L.³ Dowding, J.⁴ Favre, B.⁵ Fernandez, R.⁶ Frampton, M.⁷ Frandsen, M.⁸ Frederickson, C.⁹ Graciarena, M.¹⁰

7
- 84890507010
- Developing speech recognition systems for corpus indexing under the IARPA BABEL program
- J. Cui, X. Cui, B. Ramabhadran, J. Kim, B. Kingsbury, J. Mamou, L. Mangu, M. Picheny, T. N. Sainath, and A. Sethy, "Developing speech recognition systems for corpus indexing under the IARPA BABEL program," in Proc. ICASSP, 2013, pp. 6353-6357.
- Proc. ICASSP, 2013 , pp. 6353-6357
- Cui, J.¹ Cui, X.² Ramabhadran, B.³ Kim, J.⁴ Kingsbury, B.⁵ Mamou, J.⁶ Mangu, L.⁷ Picheny, M.⁸ Sainath, T.N.⁹ Sethy, A.¹⁰

8
- 0030638031
- A post processing system to yield reduced word error rates: Recognizer Output Voting Error Reduction (ROVER)
- J. Fiscus, "A post processing system to yield reduced word error rates: Recognizer Output Voting Error Reduction (ROVER)," in Proc. ASRU, 1997, pp. 347-354.
- Proc. ASRU, 1997 , pp. 347-354
- Fiscus, J.¹

9
- 0032639912
- Using boosting to improve a hybrid HMM/neural network speech recognizer
- H. Schwenk, "Using boosting to improve a hybrid HMM/neural network speech recognizer," in Proc. IEEE ICASSP, 1999, vol. 2, pp. 1009-1012.
- Proc. IEEE ICASSP, 1999 , vol.2 , pp. 1009-1012
- Schwenk, H.¹

10
- 0030351194
- Boosting the performance of connectionist large vocabulary speech recognition
- G. Cook and T. Robinson, "Boosting the performance of connectionist large vocabulary speech recognition," in Proc. ICSLP, 1996, vol. 3, pp. 1305-1308.
- Proc. ICSLP, 1996 , vol.3 , pp. 1305-1308
- Cook, G.¹ Robinson, T.²

11
- 4544236424
- Boosting HMMs with an application to speech recognition
- C. Dimitrakakis and S. Bengio, "Boosting HMMs with an application to speech recognition," in Proc. ICASSP, 2004, vol. 5, pp. 618-621.
- Proc. ICASSP, 2004 , vol.5 , pp. 618-621
- Dimitrakakis, C.¹ Bengio, S.²

12
- 33646818291
- Constructing ensembles of ASR systems using randomized decision trees
- O. Siohan, B. Ramabhadran, and B. Kingsbury, "Constructing ensembles of ASR systems using randomized decision trees," in Proc. ICASSP, 2005, vol. 1, pp. 197-200.
- Proc. ICASSP, 2005 , vol.1 , pp. 197-200
- Siohan, O.¹ Ramabhadran, B.² Kingsbury, B.³

13
- 51549086717
- Random forests of phonetic decision trees for acoustic modeling in conversational speech recognition
- Mar.
- J. Xue and Y. Zhao, "Random forests of phonetic decision trees for acoustic modeling in conversational speech recognition," IEEE Trans. Audio, Speech, Lang. Process., vol. 16, no. 3, pp. 519-528, Mar. 2008.
- (2008) IEEE Trans. Audio, Speech, Lang. Process. , vol.16 , Issue.3 , pp. 519-528
- Xue, J.¹ Zhao, Y.²

14
- 80055092534
- Boosting systems for large vocabulary continuous speech recognition
- G. Saon and H. Soltau, "Boosting systems for large vocabulary continuous speech recognition," Speech Commun., vol. 54, no. 2, pp. 212-218, 2012.
- (2012) Speech Commun. , vol.54 , Issue.2 , pp. 212-218
- Saon, G.¹ Soltau, H.²

15
- 0030211964
- Bagging predictors
- L. Breiman, "Bagging predictors," Mach. Learn., vol. 24, no. 2, pp. 123-140, 1996.
- (1996) Mach. Learn. , vol.24 , Issue.2 , pp. 123-140
- Breiman, L.¹

16
- 84983110889
- A decision-theoretic generalization of on-line learning and an application to boosting
- New York, NY, USA: Springer
- Y. Freund and R. Schapire, "A decision-theoretic generalization of on-line learning and an application to boosting," in Computational learning theory. New York, NY, USA: Springer, 1995, pp. 23-37.
- (1995) Computational Learning Theory , pp. 23-37
- Freund, Y.¹ Schapire, R.²

17
- 0035470889
- Greedy function approximation: A gradient boosting machine
- J. H. Friedman, "Greedy function approximation: A gradient boosting machine," Ann. Statist., vol. 29, no. 5, pp. 1189-1232, 2001.
- (2001) Ann. Statist. , vol.29 , Issue.5 , pp. 1189-1232
- Friedman, J.H.¹

18
- 0035478854
- Random forests
- L. Breiman, "Random forests," Mach. Learn., vol. 45, no. 1, pp. 5-32, 2001.
- (2001) Mach. Learn. , vol.45 , Issue.1 , pp. 5-32
- Breiman, L.¹

19
- 79955063796
- Ph.D. dissertation, Cambridge Univ. Engi. Dept. and Darwin College, Cambridge, U.K.
- C. Breslin, "Generation and combination of complementary systems for automatic speech recognition," Ph.D. dissertation, Cambridge Univ. Engi. Dept. and Darwin College, Cambridge, U.K., 2008.
- (2008) Generation and Combination of Complementary Systems for Automatic Speech Recognition
- Breslin, C.¹

20
- 84860878023
- Multi-view and multi-objective semi-supervised learning for HMM-based automatic speech recognition
- Sep.
- X. Cui, J. Huang, and J.-T. Chien, "Multi-view and multi-objective semi-supervised learning for HMM-based automatic speech recognition," IEEE Trans. Audio, Speech, Lang. Process., vol. 20, no. 7, pp. 1923-1935, Sep. 2012.
- (2012) IEEE Trans. Audio, Speech, Lang. Process. , vol.20 , Issue.7 , pp. 1923-1935
- Cui, X.¹ Huang, J.² Chien, J.-T.³

21
- 84872174281
- Building acoustic model ensembles by data sampling with enhanced trainings and features
- Mar.
- X. Chen and Y. Zhao, "Building acoustic model ensembles by data sampling with enhanced trainings and features," IEEE Trans. Audio, Speech, Language Process., vol. 21, no. 3, pp. 498-507, Mar. 2013.
- (2013) IEEE Trans. Audio, Speech, Language Process. , vol.21 , Issue.3 , pp. 498-507
- Chen, X.¹ Zhao, Y.²

22
- 14644422971
- New York, NY, USA: Wiley-Interscience
- L. I. Kuncheva, Combining pattern classifiers: methods and algorithms. New York, NY, USA: Wiley-Interscience, 2004.
- (2004) Combining Pattern Classifiers: Methods and Algorithms
- Kuncheva, L.I.¹

23
- 80053403826
- Ensemble methods in machine learning
- T. Dietterich, "Ensemble methods in machine learning," Multiple Classifier Syst., pp. 1-15, 2000.
- (2000) Multiple Classifier Syst. , pp. 1-15
- Dietterich, T.¹

24
- 85054435084
- Neural network ensembles, cross validation, and active learning
- A. Krogh and J. Vedelsby, "Neural network ensembles, cross validation, and active learning," Adv. Neural Inf. Process. Syst., pp. 231-238, 1995.
- (1995) Adv. Neural Inf. Process. Syst. , pp. 231-238
- Krogh, A.¹ Vedelsby, J.²

25
- 0029727747
- Generalization error of ensemble estimators
- N. Ueda and R. Nakano, "Generalization error of ensemble estimators," in Proc. IEEE Int. Conf. Neural Netw., 1996, vol. 1, pp. 90-95.
- Proc. IEEE Int. Conf. Neural Netw., 1996 , vol.1 , pp. 90-95
- Ueda, N.¹ Nakano, R.²

26
- 0030085913
- Analysis of decision boundaries in linearly combined neural classifiers
- DOI 10.1016/0031-3203(95)00085-2
- K. Tumer and J. Ghosh, "Analysis of decision boundaries in linearly combined neural classifiers," Pattern Recogn., vol. 29, no. 2, pp. 341-348, 1996. (Pubitemid 126397840)
- (1996) Pattern Recognition , vol.29 , Issue.2 , pp. 341-348
- Tumer, K.¹ Ghosh, J.²

27
- 0033485370
- Ensemble learning via negative correlation
- Y. Liu and X. Yao, "Ensemble learning via negative correlation," Neural Netw., vol. 12, no. 10, pp. 1399-1404, 1999.
- (1999) Neural Netw. , vol.12 , Issue.10 , pp. 1399-1404
- Liu, Y.¹ Yao, X.²

28
- 0001942829
- Neural networks and the bias/variance dilemma
- S. Geman, E. Bienenstock, and R. Doursat, "Neural networks and the bias/variance dilemma," Neural Comput., vol. 4, no. 1, pp. 1-58, 1992.
- (1992) Neural Comput. , vol.4 , Issue.1 , pp. 1-58
- Geman, S.¹ Bienenstock, E.² Doursat, R.³

29
- 84874281338
- The Kaldi Speech Recognition Toolkit
- D. Povey, A. Ghoshal, G. Boulianne, L. Burget, O. Glembek, N. Goel, M. Hannemann, P. Motlicek, Y. Qian, and P. Schwarz, "The Kaldi Speech Recognition Toolkit," in Proc. ASRU, Dec. 2011.
- Proc. ASRU, Dec. 2011
- Povey, D.¹ Ghoshal, A.² Boulianne, G.³ Burget, L.⁴ Glembek, O.⁵ Goel, N.⁶ Hannemann, M.⁷ Motlicek, P.⁸ Qian, Y.⁹ Schwarz, P.¹⁰

30
- 25444533246
- San Rafael, CA, USA: Morgan Kaufmann
- J. Lafferty, Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data. San Rafael, CA, USA: Morgan Kaufmann, 2001, pp. 282-289.
- (2001) Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data , pp. 282-289
- Lafferty, J.¹

31
- 77949367510
- Self-supervised discriminative training of statistical language models
- P. Xu, D. Karakos, and S. Khudanpur, "Self-supervised discriminative training of statistical language models," in Proc. ASRU, 2009, pp. 317-322.
- Proc. ASRU, 2009 , pp. 317-322
- Xu, P.¹ Karakos, D.² Khudanpur, S.³

32
- 0037403516
- Measures of diversity in classifier ensembles and their relationship with the ensemble accuracy
- L. I. Kuncheva and C. J. Whitaker, "Measures of diversity in classifier ensembles and their relationship with the ensemble accuracy," Mach. Learn., vol. 51, no. 2, pp. 181-207, 2003.
- (2003) Mach. Learn. , vol.51 , Issue.2 , pp. 181-207
- Kuncheva, L.I.¹ Whitaker, C.J.²

33
- 0036609602
- Relationships between combination methods and measures of diversity in combining classifiers
- C. A. Shipp and L. I. Kuncheva, "Relationships between combination methods and measures of diversity in combining classifiers," Inf. Fusion, vol. 3, no. 2, pp. 135-148, 2002.
- (2002) Inf. Fusion , vol.3 , Issue.2 , pp. 135-148
- Shipp, C.A.¹ Kuncheva, L.I.²

34
- 0003822743
- Cambridge, U.K.: Cambridge Univ. Eng. Dept.
- S. Young, G. Evermann, D. Kershaw, G. Moore, J. Odell, D. Ollason, D. Povey, V. Valtchev, and P. Woodland, The HTK book. Cambridge, U.K.: Cambridge Univ. Eng. Dept., 2002.
- (2002) The HTK Book
- Young, S.¹ Evermann, G.² Kershaw, D.³ Moore, G.⁴ Odell, J.⁵ Ollason, D.⁶ Povey, D.⁷ Valtchev, V.⁸ Woodland, P.⁹

35
- 0025235788
- Overview of the SPHINX speech recognition system
- DOI 10.1109/29.45616
- K.-F. Lee, H.-W. Hon, and R. Reddy, "An overview of the SPHINX speech recognition system," IEEE Trans. Acoust., Speech, Signal Process., vol. 38, no. 1, pp. 35-45, Jan. 1990. (Pubitemid 20665377)
- (1990) IEEE Transactions on Acoustics, Speech, and Signal Processing , vol.38 , Issue.1 , pp. 35-45
- Lee K.-Fu¹ Hon H.-Wuen² Reddy, R.³

36
- 38149133882
- OpenFST: A general and efficient weighted finite-state transducer library
- C. Allauzen, M. Riley, J. Schalkwyk, W. Skut, and M. Mohri, "OpenFST: A general and efficient weighted finite-state transducer library," Implement. Applicat. Automata, pp. 11-23, 2007.
- (2007) Implement. Applicat. Automata , pp. 11-23
- Allauzen, C.¹ Riley, M.² Schalkwyk, J.³ Skut, W.⁴ Mohri, M.⁵

37
- 0012330750
- The design for the Wall Street Journal-based CSR corpus
- Association for Computational Linguistics
- D. B. Paul and J. M. Baker, "The design for the Wall Street Journal-based CSR corpus," in Proc. Workshop Speech Natural Lang., 1992, pp. 357-362, Association for Computational Linguistics.
- Proc. Workshop Speech Natural Lang., 1992 , pp. 357-362
- Paul, D.B.¹ Baker, J.M.²

38
- 84865780426
- English broadcast news speech (HUB4)
- J. Fiscus, J. Garofolo, M. Przybocki, W. Fisher, and D. Pallett, "English broadcast news speech (HUB4)," Linguist. Data Consortium, Philadelphia, 1997.
- (1997) Linguist. Data Consortium, Philadelphia
- Fiscus, J.¹ Garofolo, J.² Przybocki, M.³ Fisher, W.⁴ Pallett, D.⁵

39
- 0141814662
- The ICSI meeting corpus
- A. Janin, D. Baron, J. Edwards, D. Ellis, D. Gelbart, N. Morgan, B. Peskin, T. Pfau, E. Shriberg, and A. Stolcke, "The ICSI meeting corpus," in Proc. ICASSP, 2003, vol. 1, pp. 364-367.
- Proc. ICASSP, 2003 , vol.1 , pp. 364-367
- Janin, A.¹ Baron, D.² Edwards, J.³ Ellis, D.⁴ Gelbart, D.⁵ Morgan, N.⁶ Peskin, B.⁷ Pfau, T.⁸ Shriberg, E.⁹ Stolcke, A.¹⁰

40
- 4544265717
- Ph.D. dissertation, Cambridge Univ., Cambridge, U.K.
- D. Povey, "Discriminative training for large vocabulary speech recognition," Ph.D. dissertation, Cambridge Univ., Cambridge, U.K., 2003.
- (2003) Discriminative Training for Large Vocabulary Speech Recognition
- Povey, D.¹

41
- 15844411850
- Confidence measures for speech recognition: A survey
- DOI 10.1016/j.specom.2004.12.004, PII S0167639305000051
- H. Jiang, "Confidence measures for speech recognition: A survey," Speech Commun., vol. 45, no. 4, pp. 455-470, 2005. (Pubitemid 40423290)
- (2005) Speech Communication , vol.45 , Issue.4 , pp. 455-470
- Jiang, H.¹

42
- 0022594196
- An introduction to hidden Markov models
- Jan.
- L. Rabiner and B. Juang, "An introduction to hidden Markov models," IEEE ASSP Mag., vol. 3, no. 1, pp. 4-16, Jan. 1986.
- (1986) IEEE ASSP Mag. , vol.3 , Issue.1 , pp. 4-16
- Rabiner, L.¹ Juang, B.²

43
- 85135146711
- Estimating confidence using word lattices
- T. Kemp and T. Schaaf, "Estimating confidence using word lattices," in Proc. Eurospeech, Rhodes, Greece, 1997, vol. 2, pp. 827-830.
- Proc. Eurospeech, Rhodes, Greece, 1997 , vol.2 , pp. 827-830
- Kemp, T.¹ Schaaf, T.²

44
- 0035278951
- Confidence measures for large vocabulary continuous speech recognition
- DOI 10.1109/89.906002, PII S1063667601013281
- F. Wessel, R. Schluter, K. Macherey, and H. Ney, "Confidence measures for large vocabulary continuous speech recognition," IEEE Trans. Speech Audio Process., vol. 9, no. 3, pp. 288-298, Mar. 2001. (Pubitemid 32286598)
- (2001) IEEE Transactions on Speech and Audio Processing , vol.9 , Issue.3 , pp. 288-298
- Wessel, F.¹ Schluter, R.² Macherey, K.³ Ney, H.⁴

45
- 34547538178
- Maximum entropy confidence estimation for speech recognition
- C. White, J. Droppo, A. Acero, and J. Odell, "Maximum entropy confidence estimation for speech recognition," in Proc. ICASSP, 2007, vol. 4, pp. 809-812.
- Proc. ICASSP, 2007 , vol.4 , pp. 809-812
- White, C.¹ Droppo, J.² Acero, A.³ Odell, J.⁴

46
- 84865793084
- Combining information sources for confidence estimation with CRF models
- M. S. Seigel and P. C. Woodland, "Combining information sources for confidence estimation with CRF models," in Proc. INTERSPEECH, 2011, pp. 905-908.
- (2011) Proc. Interspeech , pp. 905-908
- Seigel, M.S.¹ Woodland, P.C.²

47
- 85009128674
- A boosting approach for confidence scoring
- P. J. Moreno, B. Logan, and B. Raj, "A boosting approach for confidence scoring," in Proc. 7th Eur. Conf. Speech Commun. Technol., 2001.
- Proc. 7th Eur. Conf. Speech Commun. Technol., 2001
- Moreno, P.J.¹ Logan, B.² Raj, B.³

48
- 0030706666
- Neural-network based measures of confidence for word recognition
- M. Weintraub, F. Beaufays, Z. Rivlin, Y. Konig, and A. Stolcke, "Neural-network based measures of confidence for word recognition," in Proc. ICASSP, 1997, vol. 2, pp. 887-890.
- Proc. ICASSP, 1997 , vol.2 , pp. 887-890
- Weintraub, M.¹ Beaufays, F.² Rivlin, Z.³ Konig, Y.⁴ Stolcke, A.⁵

49
- 0142192295
- Conditional random fields: Probabilistic models for segmenting and labeling sequence data
- J. Lafferty, A. McCallum, and F. Pereira, "Conditional random fields: Probabilistic models for segmenting and labeling sequence data," in Proc. ICML-01, 2001, pp. 282-289.
- Proc. ICML-01, 2001 , pp. 282-289
- Lafferty, J.¹ McCallum, A.² Pereira, F.³

50
- 33646887390
- On the limited memory BFGS method for large scale optimization
- D. C. Liu and J. Nocedal, "On the limited memory BFGS method for large scale optimization," Math. Program., vol. 45, no. 1-3, pp. 503-528, 1989. (Pubitemid 20660315)
- (1989) Mathematical Programming, Series B , vol.45 , Issue.3 , pp. 503-528
- Liu, D.C.¹ Nocedal, J.²

51
- 78650977476
- Opensmile: The Munich versatile and fast open-source audio feature extractor
- ACM
- F. Eyben, M. Wöllmer, and B. Schuller, "Opensmile: The Munich versatile and fast open-source audio feature extractor," in Proc. Int. Conf. Multimedia, 2010, pp. 1459-1462, ACM.
- Proc. Int. Conf. Multimedia, 2010 , pp. 1459-1462
- Eyben, F.¹ Wöllmer, M.² Schuller, B.³

52
- 85029930138
- Predicting automatic speech recognition performance using prosodic cues
- Association for Computational Linguistics
- D. J. Litman, J. B. Hirschberg, and M. Swerts, "Predicting automatic speech recognition performance using prosodic cues," in Proc. 1st North Amer. Chap. Assoc. Comput. Linguist. Conf., 2000, pp. 218-225, Association for Computational Linguistics.
- Proc. 1st North Amer. Chap. Assoc. Comput. Linguist. Conf., 2000 , pp. 218-225
- Litman, D.J.¹ Hirschberg, J.B.² Swerts, M.³

53
- 73649124909
- Which words are hard to recognize? Prosodic, lexical, and disfluency factors that increase speech recognition error rates
- S. Goldwater, D. Jurafsky, and C. D. Manning, "Which words are hard to recognize? Prosodic, lexical, and disfluency factors that increase speech recognition error rates," Speech Commun., vol. 52, no. 3, pp. 181-200, 2010.
- (2010) Speech Commun. , vol.52 , Issue.3 , pp. 181-200
- Goldwater, S.¹ Jurafsky, D.² Manning, C.D.³

54
- 2942568545
- Prosodic and other cues to speech recognition failures
- J. Hirschberg, D. Litman, and M. Swerts, "Prosodic and other cues to speech recognition failures," Speech Commun., vol. 43, no. 1, pp. 155-175, 2004.
- (2004) Speech Commun. , vol.43 , Issue.1 , pp. 155-175
- Hirschberg, J.¹ Litman, D.² Swerts, M.³

55
- 85009223733
- Automatic disfluency identification in conversational speech using multiple knowledge sources
- Y. Liu, E. Shriberg, and A. Stolcke, "Automatic disfluency identification in conversational speech using multiple knowledge sources," in Proc. Eurospeech, Geneva, Switzerland, 2003, vol. 1, pp. 957-960.
- Proc. Eurospeech, Geneva, Switzerland, 2003 , vol.1 , pp. 957-960
- Liu, Y.¹ Shriberg, E.² Stolcke, A.³

56
- 84858389003
- Contextual information improves OOV detection in speech
- C. Parada, M. Dredze, D. Filimonov, and F. Jelinek, "Contextual information improves OOV detection in speech," in Proc. NAACL, 2010.
- Proc. NAACL, 2010
- Parada, C.¹ Dredze, M.² Filimonov, D.³ Jelinek, F.⁴

57
- 38149013231
- An improved hierarchical speaker clustering
- W. Wang, P. Lu, and Y. Yan, "An improved hierarchical speaker clustering," Acta Acoustica, 2006.
- (2006) Acta Acoustica
- Wang, W.¹ Lu, P.² Yan, Y.³

58
- 84891308106
- SRILM - an extensible language modeling toolkit
- A. Stolcke, "SRILM - an extensible language modeling toolkit," in Proc. ICSLP, 2002, pp. 901-904.
- (2002) Proc. ICSLP , pp. 901-904
- Stolcke, A.¹

59
- 84878590630
- Complementary phone error training
- F. Diehl and P. C. Woodland, "Complementary phone error training," in Proc. Interspeech, 2012.
- Proc. Interspeech, 2012
- Diehl, F.¹ Woodland, P.C.²

60
- 44949249226
- Generating complementary systems for speech recognition
- C. Breslin and M. J. F. Gales, "Generating complementary systems for speech recognition," in Proc. Interspeech, 2006.
- Proc. Interspeech, 2006
- Breslin, C.¹ Gales, M.J.F.²

61
- 84893695671
- Discriminative training of acoustic models for system combination
- Y. Tachioka and S. Watanabe, "Discriminative training of acoustic models for system combination," in Proc. Interspeech, 2013.
- Proc. Interspeech, 2013
- Tachioka, Y.¹ Watanabe, S.²

62
- 29444447644
- Lattice segmentation and minimum Bayes risk discriminative training for large vocabulary continuous speech recognition
- DOI 10.1016/j.specom.2005.07.002, PII S016763930500172X
- V. Doumpiotis and W. Byrne, "Lattice segmentation and minimum Bayes risk discriminative training for large vocabulary continuous speech recognition," Speech Commun., vol. 48, no. 2, pp. 142-160, 2006. (Pubitemid 43012028)
- (2006) Speech Communication , vol.48 , Issue.2 , pp. 142-160
- Doumpiotis, V.¹ Byrne, W.²

63
- 85032751713
- Discriminative training for automatic speech recognition: Modeling, criteria, optimization, implementation, and performance
- Nov.
- G. Heigold, H. Ney, R. Schluter, and S. Wiesler, "Discriminative training for automatic speech recognition: Modeling, criteria, optimization, implementation, and performance," IEEE Signal Process. Mag., vol. 29, no. 6, pp. 58-69, Nov. 2012.
- (2012) IEEE Signal Process. Mag. , vol.29 , Issue.6 , pp. 58-69
- Heigold, G.¹ Ney, H.² Schluter, R.³ Wiesler, S.⁴

64
- 0141480019
- Discriminative MAP for acoustic model adaptation
- D. Povey, P. C. Woodland, and M. J. F. Gales, "Discriminative MAP for acoustic model adaptation," in Proc. ICASSP, 2003, vol. 1, pp. 312-315.
- (2003) Proc. ICASSP , vol.1 , pp. 312-315
- Povey, D.¹ Woodland, P.C.² Gales, M.J.F.³

65
- 84867593677
- Creating ensemble of diverse maximum entropy models
- K. Audhkhasi, A. Sethy, B. Ramabhadran, and S. S. Narayanan, "Creating ensemble of diverse maximum entropy models," in Proc. ICASSP, 2012, pp. 4845-4848.
- (2012) Proc. ICASSP , pp. 4845-4848
- Audhkhasi, K.¹ Sethy, A.² Ramabhadran, B.³ Narayanan, S.S.⁴

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.