SCOPUS Information Search Platform

Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH

2013, Pages 3082-3086

Empirical link between hypothesis diversity and fusion performance in an ensemble of automatic speech recognition systems

(4) Audhkhasi, Kartik ᵃ; Zavou, Andreas M. ᵃ; Georgiou, Panayiotis G. ᵃ; Narayanan, Shrikanth S. ᵃ

ᵃ University of Southern California (United States)

Author keywords

Automatic speech recognition; Diversity; Ensemble methods; ROVER; System combination

Indexed keywords

COMPUTER APPLICATIONS; COMPUTER SIMULATION;

AUTOMATIC SPEECH RECOGNITION; DIVERSITY; ENSEMBLE METHODS; ROVER; SYSTEM COMBINATION;

SPEECH RECOGNITION;

EID: 84906226217
PISSN: 2308457X
EISSN: 19909772
Source Type: Conference Proceeding
DOI: None
Document Type: Conference Paper

Times cited: (9)

References (43)

1
- 85008006725
- Advances in Arabic speech transcription at IBM under the DARPA GALE program
- H. Soltau et al., "Advances in Arabic speech transcription at IBM under the DARPA GALE program," IEEE Transactions on Audio, Speech, and Language Processing, vol. 17, no. 5, pp. 884-894, 2009.
- (2009) IEEE Transactions on Audio, Speech, and Language Processing , vol.17 , Issue.5 , pp. 884-894
- Soltau, H.¹

2
- 51449101515
- The BBN 2007 displayless English/Iraqi speech-to-speech translation system
- D. Stallard et al., "The BBN 2007 Displayless English/Iraqi Speech-to-Speech Translation System," in Proc. Interspeech, 2007.
- (2007) Proc. Interspeech
- Stallard, D.¹

3
- 34047266376
- Advances in speech transcription at IBM under the DARPA EARS program
- S. F. Chen et al., "Advances in speech transcription at IBM under the DARPA EARS program," IEEE Transactions on Audio, Speech, and Language Processing, vol. 14, no. 5, pp. 1596-1608, 2006.
- (2006) IEEE Transactions on Audio, Speech, and Language Processing , vol.14 , Issue.5 , pp. 1596-1608
- Chen, S.F.¹

4
- 67649528017
- The CALO meeting speech recognition and understanding system
- G. Tur et al., "The CALO meeting speech recognition and understanding system," in Proc. SLT. IEEE, 2008, pp. 69-72.
- (2008) Proc. SLT. IEEE , pp. 69-72
- Tur, G.¹

5
- 14644422971
- Wiley-Interscience
- L. I. Kuncheva, Combining pattern classifiers: methods and algorithms, Wiley-Interscience, 2004.
- (2004) Combining Pattern Classifiers: Methods and Algorithms
- Kuncheva, L.I.¹

6
- 80053403826
- Ensemble methods in machine learning
- T. Dietterich, "Ensemble methods in machine learning," Multiple classifier systems, pp. 1-15, 2000.
- (2000) Multiple Classifier Systems , pp. 1-15
- Dietterich, T.¹

7
- 85054435084
- Neural network ensembles, cross validation, and active learning
- A. Krogh and J. Vedelsby, "Neural network ensembles, cross validation, and active learning," Advances in neural information processing systems, pp. 231-238, 1995.
- (1995) Advances in Neural Information Processing Systems , pp. 231-238
- Krogh, A.¹ Vedelsby, J.²

8
- 0029727747
- Generalization error of ensemble estimators
- N. Ueda and R. Nakano, "Generalization error of ensemble estimators," in IEEE International Conference on Neural Networks, 1996, vol. 1, pp. 90-95.
- (1996) IEEE International Conference on Neural Networks , vol.1 , pp. 90-95
- Ueda, N.¹ Nakano, R.²

9
- 0030085913
- Analysis of decision boundaries in linearly combined neural classifiers
- K. Tumer and J. Ghosh, "Analysis of decision boundaries in linearly combined neural classifiers," Pattern Recognition, vol. 29, no. 2, pp. 341-348, 1996.
- (1996) Pattern Recognition , vol.29 , Issue.2 , pp. 341-348
- Tumer, K.¹ Ghosh, J.²

10
- 0032639912
- Using boosting to improve a hybrid HMM/neural network speech recognizer
- H. Schwenk, "Using boosting to improve a hybrid HMM/neural network speech recognizer," in Proc. ICASSP IEEE, 1999, vol. 2, pp. 1009-1012.
- (1999) Proc. ICASSP IEEE , vol.2 , pp. 1009-1012
- Schwenk, H.¹

11
- 0030351194
- Boosting the performance of connectionist large vocabulary speech recognition
- G. Cook and T. Robinson, "Boosting the performance of connectionist large vocabulary speech recognition," in Proc. ICSLP IEEE, 1996, vol. 3, pp. 1305-1308.
- (1996) Proc. ICSLP IEEE , vol.3 , pp. 1305-1308
- Cook, G.¹ Robinson, T.²

12
- 4544236424
- Boosting HMMs with an application to speech recognition
- C. Dimitrakakis and S. Bengio, "Boosting HMMs with an application to speech recognition," in Proc. ICASSP. IEEE, 2004, vol. 5, pp. 618-621.
- (2004) Proc. ICASSP IEEE , vol.5 , pp. 618-621
- Dimitrakakis, C.¹ Bengio, S.²

13
- 33646818291
- Constructing ensembles of ASR systems using randomized decision trees
- O. Siohan, B. Ramabhadran, and B. Kingsbury, "Constructing ensembles of ASR systems using randomized decision trees," in Proc. ICASSP IEEE, 2005, vol. 1, pp. 197-200.
- (2005) Proc. ICASSP IEEE , vol.1 , pp. 197-200
- Siohan, O.¹ Ramabhadran, B.² Kingsbury, B.³

14
- 80055092534
- Boosting systems for large vocabulary continuous speech recognition
- G. Saon and H. Soltau, "Boosting systems for large vocabulary continuous speech recognition," Speech Communication, vol. 54, no. 2, pp. 212-218, 2012.
- (2012) Speech Communication , vol.54 , Issue.2 , pp. 212-218
- Saon, G.¹ Soltau, H.²

15
- 0030211964
- Bagging predictors
- L. Breiman, "Bagging predictors," Machine Learning, vol. 24, no. 2, pp. 123-140, 1996.
- (1996) Machine Learning , vol.24 , Issue.2 , pp. 123-140
- Breiman, L.¹

16
- 84983110889
- A decision-theoretic generalization of on-line learning and an application to boosting
- Springer
- Y. Freund and R. Schapire, "A decision-theoretic generalization of on-line learning and an application to boosting," in Computational learning theory. Springer, 1995, pp. 23-37.
- (1995) Computational Learning Theory , pp. 23-37
- Freund, Y.¹ Schapire, R.²

17
- 0035470889
- Greedy function approximation: A gradient boosting machine
- J. H. Friedman, "Greedy function approximation: A gradient boosting machine," Ann. Statistics, vol. 29, no. 5, pp. 1189-1232, 2001.
- (2001) Ann. Statistics , vol.29 , Issue.5 , pp. 1189-1232
- Friedman, J.H.¹

18
- 0035478854
- Random forests
- L. Breiman, "Random forests," Machine learning, vol. 45, no. 1, pp. 5-32, 2001.
- (2001) Machine Learning , vol.45 , Issue.1 , pp. 5-32
- Breiman, L.¹

19
- 79955063796
- Ph.D. thesis, Cambridge University Engineering Department and Darwin College
- C. Breslin, Generation and combination of complementary systems for automatic speech recognition, Ph.D. thesis, Cambridge University Engineering Department and Darwin College, 2008.
- (2008) Generation and Combination of Complementary Systems for Automatic Speech Recognition
- Breslin, C.¹

20
- 84867596093
- Analyzing quality of crowd-sourced speech transcriptions of noisy audio for acoustic model adaptation
- K. Audhkhasi, P. G. Georgiou, and S. S. Narayanan, "Analyzing quality of crowd-sourced speech transcriptions of noisy audio for acoustic model adaptation," in Proc. ICASSP, 2012.
- (2012) Proc. ICASSP
- Audhkhasi, K.¹ Georgiou, P.G.² Narayanan, S.S.³

21
- 84865764400
- Reliability weighted acoustic model adaptation using crowd-sourced transcriptions
- K. Audhkhasi, P. G. Georgiou, and S. S. Narayanan, "Reliability weighted acoustic model adaptation using crowd-sourced transcriptions," in Proc. Interspeech, 2011.
- (2011) Proc. Interspeech
- Audhkhasi, K.¹ Georgiou, P.G.² Narayanan, S.S.³

22
- 80051628698
- Accurate transcription of broadcast news speech using multiple noisy transcribers and unsupervised reliability metrics
- K. Audhkhasi, P. G. Georgiou, and S. S. Narayanan, "Accurate transcription of broadcast news speech using multiple noisy transcribers and unsupervised reliability metrics," in Proc. ICASSP, 2011.
- (2011) Proc. ICASSP
- Audhkhasi, K.¹ Georgiou, P.G.² Narayanan, S.S.³

23
- 84865734254
- Speaking to the crowd: Looking at past achievements in using crowdsourcing for speech and predicting future challenges
- G. Parent and M. Eskenazi, "Speaking to the crowd: looking at past achievements in using crowdsourcing for speech and predicting future challenges," in Proc. Interspeech, 2011.
- (2011) Proc. Interspeech
- Parent, G.¹ Eskenazi, M.²

24
- 79959821909
- Automatic estimation of transcription accuracy and difficulty
- B. C. Roy, S. Vosoughi, and D. Roy, "Automatic estimation of transcription accuracy and difficulty," in Proc. Interspeech ISCA, 2010, pp. 1902-1905.
- (2010) Proc. Interspeech ISCA , pp. 1902-1905
- Roy, B.C.¹ Vosoughi, S.² Roy, D.³

25
- 78049407752
- Using the Amazon Mechanical Turk for transcription of spoken language
- M. Marge, S. Banerjee, and A. I. Rudnicky, "Using the Amazon Mechanical Turk for transcription of spoken language," in Proc. ICASSP, 2010.
- (2010) Proc. ICASSP
- Marge, M.¹ Banerjee, S.² Rudnicky, A.I.³

26
- 79958275518
- Cheap, fast and good enough: Automatic speech recognition with non-expert transcription
- S. Novotney and C. Callison-Burch, "Cheap, fast and good enough: Automatic speech recognition with non-expert transcription," in Proc. NAACL-HLT, 2010.
- (2010) Proc. NAACL-HLT
- Novotney, S.¹ Callison-Burch, C.²

27
- 84858953642
- The Kaldi speech recognition toolkit
- IEEE
- D. Povey et al., "The Kaldi Speech Recognition Toolkit," in Proc. ASRU. IEEE, Dec. 2011.
- (2011) Proc. ASRU
- Povey, D.¹

28
- 0030638031
- A post processing system to yield reduced word error rates: Recognizer output voting error reduction (ROVER)
- J. Fiscus, "A post processing system to yield reduced word error rates: Recognizer Output Voting Error Reduction (ROVER)," in Proc. ASRU. IEEE, 1997, pp. 347-354.
- (1997) Proc. ASRU IEEE , pp. 347-354
- Fiscus, J.¹

29
- 0003822743
- Cambridge University Engineering Department
- S. Young et al., "The HTK book," Cambridge University Engineering Department, 2002.
- (2002) The HTK Book
- Young, S.¹

30
- 0025235788
- An overview of the SPHINX speech recognition system
- K-F. Lee, H-W. Hon, and R. Reddy, "An overview of the SPHINX speech recognition system," IEEE Transactions on Acoustics, Speech and Signal Processing, vol. 38, no. 1, pp. 35-45, 1990.
- (1990) IEEE Transactions on Acoustics, Speech and Signal Processing , vol.38 , Issue.1 , pp. 35-45
- Lee, K.-F.¹ Hon, H.-W.² Reddy, R.³

31
- 38149133882
- OpenFst: A general and efficient weighted finite-state transducer library
- C. Allauzen et al., "OpenFst: A general and efficient weighted finite-state transducer library," Implementation and Application of Automata, pp. 11-23, 2007.
- (2007) Implementation and Application of Automata , pp. 11-23
- Allauzen, C.¹

32
- 0012330750
- The design for the Wall Street Journal-based CSR corpus
- Association for Computational Linguistics
- D. B. Paul and J. M. Baker, "The design for the Wall Street Journal-based CSR corpus," in Proc. Workshop on Speech and Natural Language. Association for Computational Linguistics, 1992, pp. 357-362.
- (1992) Proc. Workshop on Speech and Natural Language , pp. 357-362
- Paul, D.B.¹ Baker, J.M.²

33
- 84865780426
- English broadcast news speech (HUB4)
- Philadelphia
- J. Fiscus, J. Garofolo, M. Przybocki, W. Fisher, and D. Pallett, "English broadcast news speech (HUB4)," Linguistic Data Consortium, Philadelphia, 1997.
- (1997) Linguistic Data Consortium
- Fiscus, J.¹ Garofolo, J.² Przybocki, M.³ Fisher, W.⁴ Pallett, D.⁵

34
- 0141814662
- The ICSI meeting corpus
- A. Janin et al., "The ICSI meeting corpus," in Proc. ICASSP. IEEE, 2003, vol. 1, pp. 1-364.
- (2003) Proc. ICASSP IEEE , vol.1 , pp. 1-364
- Janin, A.¹

35
- 4544265717
- Ph.D. thesis, Cambridge University
- D. Povey, Discriminative training for large vocabulary speech recognition, Ph.D. thesis, Cambridge University, 2003.
- (2003) Discriminative Training for Large Vocabulary Speech Recognition
- Povey, D.¹

36
- 38149013231
- An improved hierarchical speaker clustering
- W. Wang, P. Lv, and Y. Yan, "An improved hierarchical speaker clustering," Acta Acoustica, 2006.
- (2006) Acta Acoustica
- Wang, W.¹ Lv, P.² Yan, Y.³

37
- 84891308106
- SRILM - An extensible language modeling toolkit
- A. Stolcke, "SRILM - An extensible language modeling toolkit," in Proc. ICSLP, 2002, pp. 901-904.
- (2002) Proc. ICSLP , pp. 901-904
- Stolcke, A.¹

38
- 0035278951
- Confidence measures for large vocabulary continuous speech recognition
- F. Wessel, R. Schluter, K. Macherey, and H. Ney, "Confidence measures for large vocabulary continuous speech recognition," IEEE Transactions on Speech and Audio Processing, vol. 9, no. 3, pp. 288-298, 2001.
- (2001) IEEE Transactions on Speech and Audio Processing , vol.9 , Issue.3 , pp. 288-298
- Wessel, F.¹ Schluter, R.² Macherey, K.³ Ney, H.⁴

39
- 15844411850
- Confidence measures for speech recognition: A survey
- H. Jiang, "Confidence measures for speech recognition: A survey," Speech communication, vol. 45, no. 4, pp. 455-470, 2005.
- (2005) Speech Communication , vol.45 , Issue.4 , pp. 455-470
- Jiang, H.¹

40
- 84858389003
- Contextual information improves OOV detection in speech
- C. Parada, M. Dredze, D. Filimonov, and F. Jelinek, "Contextual information improves OOV detection in speech," in Proc. NAACL, 2010.
- (2010) Proc. NAACL
- Parada, C.¹ Dredze, M.² Filimonov, D.³ Jelinek, F.⁴

41
- 0033344871
- Evaluation of word confidence for speech recognition systems
- M. Siu and H. Gish, "Evaluation of word confidence for speech recognition systems," Computer Speech and Language, vol. 13, no. 4, pp. 299-319, 1999.
- (1999) Computer Speech and Language , vol.13 , Issue.4 , pp. 299-319
- Siu, M.¹ Gish, H.²

42
- 84867593677
- Creating ensemble of diverse maximum entropy models
- K. Audhkhasi, A. Sethy, B. Ramabhadran, and S. S. Narayanan, "Creating ensemble of diverse maximum entropy models," in Proc. ICASSP IEEE, 2012, pp. 4845-4848.
- (2012) Proc. ICASSP IEEE , pp. 4845-4848
- Audhkhasi, K.¹ Sethy, A.² Ramabhadran, B.³ Narayanan, S.S.⁴

43
- 84878590630
- Complementary phone error training
- F. Diehl and P. C. Woodland, "Complementary phone error training," in Interspeech, 2012.
- (2012) Interspeech
- Diehl, F.¹ Woodland, P.C.²

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.