2013, Pages 3082-3086

Empirical link between hypothesis diversity and fusion performance in an ensemble of automatic speech recognition systems

Author keywords

Automatic speech recognition; Diversity; Ensemble methods; ROVER; System combination

Indexed keywords

COMPUTER APPLICATIONS; COMPUTER SIMULATION

EID: 84906226217     PISSN: 2308457X     EISSN: 19909772     Source Type: Conference Proceeding    
DOI: None     Document Type: Conference Paper
Times cited: 9

References (43)
  • 1
    • 85008006725
    • Advances in Arabic speech transcription at IBM under the DARPA GALE program
    • H. Soltau et al., "Advances in Arabic speech transcription at IBM under the DARPA GALE program," IEEE Transactions on Audio, Speech, and Language Processing, vol. 17, no. 5, pp. 884-894, 2009.
    • (2009) IEEE Transactions on Audio, Speech, and Language Processing , vol.17 , Issue.5 , pp. 884-894
    • Soltau, H.1
  • 2
    • 51449101515
    • The BBN 2007 displayless English/Iraqi speech-to-speech translation system
    • D. Stallard et al., "The BBN 2007 Displayless English/Iraqi Speech-to-Speech Translation System," in Proc. Interspeech, 2007.
    • (2007) Proc. Interspeech
    • Stallard, D.1
  • 3
    • 34047266376
    • Advances in speech transcription at IBM under the DARPA EARS program
    • S. F. Chen et al., "Advances in speech transcription at IBM under the DARPA EARS program," IEEE Transactions on Audio, Speech, and Language Processing, vol. 14, no. 5, pp. 1596-1608, 2006.
    • (2006) IEEE Transactions on Audio, Speech, and Language Processing , vol.14 , Issue.5 , pp. 1596-1608
    • Chen, S.F.1
  • 4
    • 67649528017
    • The CALO meeting speech recognition and understanding system
    • G. Tur et al., "The CALO meeting speech recognition and understanding system," in Proc. SLT. IEEE, 2008, pp. 69-72.
    • (2008) Proc. SLT. IEEE , pp. 69-72
    • Tur, G.1
  • 6
  • 9
    • 0030085913
    • Analysis of decision boundaries in linearly combined neural classifiers
    • K. Tumer and J. Ghosh, "Analysis of decision boundaries in linearly combined neural classifiers," Pattern Recognition, vol. 29, no. 2, pp. 341-348, 1996.
    • (1996) Pattern Recognition , vol.29 , Issue.2 , pp. 341-348
    • Tumer, K.1    Ghosh, J.2
  • 10
    • 0032639912
    • Using boosting to improve a hybrid HMM/neural network speech recognizer
    • H. Schwenk, "Using boosting to improve a hybrid HMM/neural network speech recognizer," in Proc. ICASSP IEEE, 1999, vol. 2, pp. 1009-1012.
    • (1999) Proc. ICASSP IEEE , vol.2 , pp. 1009-1012
    • Schwenk, H.1
  • 11
    • 0030351194
    • Boosting the performance of connectionist large vocabulary speech recognition
    • G. Cook and T. Robinson, "Boosting the performance of connectionist large vocabulary speech recognition," in Proc. ICSLP IEEE, 1996, vol. 3, pp. 1305-1308.
    • (1996) Proc. ICSLP IEEE , vol.3 , pp. 1305-1308
    • Cook, G.1    Robinson, T.2
  • 12
    • 4544236424
    • Boosting HMMs with an application to speech recognition
    • C. Dimitrakakis and S. Bengio, "Boosting HMMs with an application to speech recognition," in Proc. ICASSP. IEEE, 2004, vol. 5, pp. 618-621.
    • (2004) Proc. ICASSP IEEE , vol.5 , pp. 618-621
    • Dimitrakakis, C.1    Bengio, S.2
  • 13
    • 33646818291
    • Constructing ensembles of ASR systems using randomized decision trees
    • O. Siohan, B. Ramabhadran, and B. Kingsbury, "Constructing ensembles of ASR systems using randomized decision trees," in Proc. ICASSP IEEE, 2005, vol. 1, pp. 197-200.
    • (2005) Proc. ICASSP IEEE , vol.1 , pp. 197-200
    • Siohan, O.1    Ramabhadran, B.2    Kingsbury, B.3
  • 14
    • 80055092534
    • Boosting systems for large vocabulary continuous speech recognition
    • G. Saon and H. Soltau, "Boosting systems for large vocabulary continuous speech recognition," Speech Communication, vol. 54, no. 2, pp. 212-218, 2012.
    • (2012) Speech Communication , vol.54 , Issue.2 , pp. 212-218
    • Saon, G.1    Soltau, H.2
  • 15
    • 0030211964
    • Bagging predictors
    • L. Breiman, "Bagging predictors," Machine Learning, vol. 24, no. 2, pp. 123-140, 1996.
    • (1996) Machine Learning , vol.24 , Issue.2 , pp. 123-140
    • Breiman, L.1
  • 16
    • 84983110889
    • A decision-theoretic generalization of on-line learning and an application to boosting
    • Springer
    • Y. Freund and R. Schapire, "A decision-theoretic generalization of on-line learning and an application to boosting," in Computational Learning Theory. Springer, 1995, pp. 23-37.
    • (1995) Computational Learning Theory , pp. 23-37
    • Freund, Y.1    Schapire, R.2
  • 17
    • 0035470889
    • Greedy function approximation: A gradient boosting machine
    • J. H. Friedman, "Greedy function approximation: A gradient boosting machine," Ann. Statistics, vol. 29, no. 5, pp. 1189-1232, 2001.
    • (2001) Ann. Statistics , vol.29 , Issue.5 , pp. 1189-1232
    • Friedman, J.H.1
  • 18
    • 0035478854
    • Random forests
    • L. Breiman, "Random forests," Machine Learning, vol. 45, no. 1, pp. 5-32, 2001.
    • (2001) Machine Learning , vol.45 , Issue.1 , pp. 5-32
    • Breiman, L.1
  • 20
    • 84867596093
    • Analyzing quality of crowd-sourced speech transcriptions of noisy audio for acoustic model adaptation
    • K. Audhkhasi, P. G. Georgiou, and S. S. Narayanan, "Analyzing quality of crowd-sourced speech transcriptions of noisy audio for acoustic model adaptation," in Proc. ICASSP, 2012.
    • (2012) Proc. ICASSP
    • Audhkhasi, K.1    Georgiou, P.G.2    Narayanan, S.S.3
  • 21
    • 84865764400
    • Reliability weighted acoustic model adaptation using crowd-sourced transcriptions
    • K. Audhkhasi, P. G. Georgiou, and S. S. Narayanan, "Reliability weighted acoustic model adaptation using crowd-sourced transcriptions," in Proc. Interspeech, 2011.
    • (2011) Proc. Interspeech
    • Audhkhasi, K.1    Georgiou, P.G.2    Narayanan, S.S.3
  • 22
    • 80051628698
    • Accurate transcription of broadcast news speech using multiple noisy transcribers and unsupervised reliability metrics
    • K. Audhkhasi, P. G. Georgiou, and S. S. Narayanan, "Accurate transcription of broadcast news speech using multiple noisy transcribers and unsupervised reliability metrics," in Proc. ICASSP, 2011.
    • (2011) Proc. ICASSP
    • Audhkhasi, K.1    Georgiou, P.G.2    Narayanan, S.S.3
  • 23
    • 84865734254
    • Speaking to the crowd: Looking at past achievements in using crowdsourcing for speech and predicting future challenges
    • G. Parent and M. Eskenazi, "Speaking to the crowd: Looking at past achievements in using crowdsourcing for speech and predicting future challenges," in Proc. Interspeech, 2011.
    • (2011) Proc. Interspeech
    • Parent, G.1    Eskenazi, M.2
  • 24
    • 79959821909
    • Automatic estimation of transcription accuracy and difficulty
    • B. C. Roy, S. Vosoughi, and D. Roy, "Automatic estimation of transcription accuracy and difficulty," in Proc. Interspeech ISCA, 2010, pp. 1902-1905.
    • (2010) Proc. Interspeech ISCA , pp. 1902-1905
    • Roy, B.C.1    Vosoughi, S.2    Roy, D.3
  • 25
    • 78049407752
    • Using the Amazon Mechanical Turk for transcription of spoken language
    • M. Marge, S. Banerjee, and A. I. Rudnicky, "Using the Amazon Mechanical Turk for transcription of spoken language," in Proc. ICASSP, 2010.
    • (2010) Proc. ICASSP
    • Marge, M.1    Banerjee, S.2    Rudnicky, A.I.3
  • 26
    • 79958275518
    • Cheap, fast and good enough: Automatic speech recognition with non-expert transcription
    • S. Novotney and C. Callison-Burch, "Cheap, fast and good enough: Automatic speech recognition with non-expert transcription," in Proc. NAACL-HLT, 2010.
    • (2010) Proc. NAACL-HLT
    • Novotney, S.1    Callison-Burch, C.2
  • 27
    • 84858953642
    • The Kaldi speech recognition toolkit
    • Dec., IEEE
    • D. Povey et al., "The Kaldi Speech Recognition Toolkit," in Proc. ASRU. Dec. 2011, IEEE.
    • (2011) Proc. ASRU
    • Povey, D.1
  • 28
    • 0030638031
    • A post processing system to yield reduced word error rates: Recognizer Output Voting Error Reduction (ROVER)
    • J. Fiscus, "A post processing system to yield reduced word error rates: Recognizer Output Voting Error Reduction (ROVER)," in Proc. ASRU. IEEE, 1997, pp. 347-354.
    • (1997) Proc. ASRU IEEE , pp. 347-354
    • Fiscus, J.1
  • 29
    • 0003822743
    • Cambridge University Engineering Department
    • S. Young et al., "The HTK Book," Cambridge University Engineering Department, 2002.
    • (2002) The HTK Book
    • Young, S.1
  • 31
    • 38149133882
    • OpenFst: A general and efficient weighted finite-state transducer library
    • C. Allauzen et al., "OpenFst: A general and efficient weighted finite-state transducer library," Implementation and Application of Automata, pp. 11-23, 2007.
    • (2007) Implementation and Application of Automata , pp. 11-23
    • Allauzen, C.1
  • 32
    • 0012330750
    • The design for the Wall Street Journal-based CSR corpus
    • Association for Computational Linguistics
    • D. B. Paul and J. M. Baker, "The design for the Wall Street Journal-based CSR corpus," in Proc. Workshop on Speech and Natural Language. Association for Computational Linguistics, 1992, pp. 357-362.
    • (1992) Proc. Workshop on Speech and Natural Language , pp. 357-362
    • Paul, D.B.1    Baker, J.M.2
  • 34
    • 0141814662
    • The ICSI meeting corpus
    • A. Janin et al., "The ICSI meeting corpus," in Proc. ICASSP. IEEE, 2003, vol. 1, pp. I-364.
    • (2003) Proc. ICASSP IEEE , vol.1 , pp. I-364
    • Janin, A.1
  • 36
    • 38149013231
    • An improved hierarchical speaker clustering
    • W. Wang, P. Lv, and Y. Yan, "An improved hierarchical speaker clustering," Acta Acoustica, 2006.
    • (2006) Acta Acoustica
    • Wang, W.1    Lv, P.2    Yan, Y.3
  • 37
    • 84891308106
    • SRILM - An extensible language modeling toolkit
    • A. Stolcke, "SRILM - An extensible language modeling toolkit," in Proc. ICSLP, 2002, pp. 901-904.
    • (2002) Proc. ICSLP , pp. 901-904
    • Stolcke, A.1
  • 39
    • 15844411850
    • Confidence measures for speech recognition: A survey
    • H. Jiang, "Confidence measures for speech recognition: A survey," Speech Communication, vol. 45, no. 4, pp. 455-470, 2005.
    • (2005) Speech Communication , vol.45 , Issue.4 , pp. 455-470
    • Jiang, H.1
  • 41
    • 0033344871
    • Evaluation of word confidence for speech recognition systems
    • M. Siu and H. Gish, "Evaluation of word confidence for speech recognition systems," Computer Speech and Language, vol. 13, no. 4, pp. 299-319, 1999.
    • (1999) Computer Speech and Language , vol.13 , Issue.4 , pp. 299-319
    • Siu, M.1    Gish, H.2
  • 43
    • 84878590630
    • Complementary phone error training
    • F. Diehl and P. C. Woodland, "Complementary phone error training," in Interspeech, 2012.
    • (2012) Interspeech
    • Diehl, F.1    Woodland, P.C.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.