Volume , Issue , 2016, Pages 639-646

Cambridge University transcription systems for the multi-genre broadcast challenge

Author keywords

broadcast transcription; deep neural networks; HTK; Kaldi; speech recognition

Indexed keywords

COMPUTATIONAL LINGUISTICS; RECURRENT NEURAL NETWORKS; TRANSCRIPTION;

EID: 84964475976     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1109/ASRU.2015.7404856     Document Type: Conference Paper
Times cited : (31)

References (55)
  • 1
    • 84964518874
    • http://htk.eng.cam.ac.uk
  • 2
    • 84867605836
    • Applying convolutional neural networks concepts to hybrid NN-HMM model for speech recognition
    • Kyoto
    • O. Abdel-Hamid, A. Mohamed, H. Jiang, & G. Penn, "Applying convolutional neural networks concepts to hybrid NN-HMM model for speech recognition", Proc. ICASSP, Kyoto, 2012
    • (2012) Proc. ICASSP
    • Abdel-Hamid, O.1    Mohamed, A.2    Jiang, H.3    Penn, G.4
  • 5
    • 0028392483
    • Learning long-term dependencies with gradient descent is difficult
    • Y. Bengio, P. Simard, & P. Frasconi, "Learning long-term dependencies with gradient descent is difficult", IEEE Transactions on Neural Networks, vol. 5, pp. 157-166, 1994
    • (1994) IEEE Transactions on Neural Networks , vol.5 , pp. 157-166
    • Bengio, Y.1    Simard, P.2    Frasconi, P.3
  • 6
    • 41049105254
    • Joint-sequence models for grapheme-to-phoneme conversion
    • M. Bisani & H. Ney, "Joint-sequence models for grapheme-to-phoneme conversion", Speech Communication, vol. 50, no. 5, 2008
    • (2008) Speech Communication , vol.50 , Issue.5
    • Bisani, M.1    Ney, H.2
  • 8
    • 4544253838
    • Improving broadcast news transcription by lightly supervised discriminative training
    • Montreal
    • H.Y. Chan & P.C. Woodland, "Improving broadcast news transcription by lightly supervised discriminative training", Proc. ICASSP, Montreal, 2004
    • (2004) Proc. ICASSP
    • Chan, H.Y.1    Woodland, P.C.2
  • 9
    • 84910067710
    • Efficient GPU-based training of recurrent neural network language models using spliced sentence bunch
    • Singapore
    • X. Chen, Y. Wang, X. Liu, M.J.F. Gales, & P.C. Woodland, "Efficient GPU-based training of recurrent neural network language models using spliced sentence bunch", Proc. Interspeech, Singapore, 2014
    • (2014) Proc. Interspeech
    • Chen, X.1    Wang, Y.2    Liu, X.3    Gales, M.J.F.4    Woodland, P.C.5
  • 10
    • 84959155988
    • Recurrent neural network language model adaptation for multi-genre broadcast speech recognition
    • Dresden
    • X. Chen, T. Tan, X. Liu, P. Lanchantin, M.J.F. Gales, & P.C. Woodland, "Recurrent Neural Network Language Model Adaptation for Multi-Genre Broadcast Speech Recognition", Proc. Interspeech, Dresden, 2015
    • (2015) Proc. Interspeech
    • Chen, X.1    Tan, T.2    Liu, X.3    Lanchantin, P.4    Gales, M.J.F.5    Woodland, P.C.6
  • 11
    • 4544253834
    • Posterior probability decoding, confidence estimation and system combination
    • College Park, MD
    • G. Evermann & P.C. Woodland, "Posterior probability decoding, confidence estimation and system combination", Proc. Speech Transcription Workshop, College Park, MD, 2000
    • (2000) Proc. Speech Transcription Workshop
    • Evermann, G.1    Woodland, P.C.2
  • 12
    • 0030638031
    • A post-processing system to yield reduced word error rates: Recogniser output voting error reduction (ROVER)
    • Santa Barbara
    • J. Fiscus, "A post-processing system to yield reduced word error rates: recogniser output voting error reduction (ROVER)", Proc. ASRU Workshop, Santa Barbara, 1997
    • (1997) Proc. ASRU Workshop
    • Fiscus, J.1
  • 13
    • 0032050110
    • Maximum likelihood linear transformations for HMM-based speech recognition
    • M.J.F. Gales, "Maximum likelihood linear transformations for HMM-based speech recognition", Computer Speech and Language, vol. 12, pp. 75-98, 1997
    • (1997) Computer Speech and Language , vol.12 , pp. 75-98
    • Gales, M.J.F.1
  • 14
    • 0032638856
    • Semi-tied covariance matrices for hidden Markov models
    • M.J.F. Gales, "Semi-tied covariance matrices for hidden Markov models", IEEE Transactions on Speech and Audio Processing, vol. 7, no. 3, pp. 272-281, 1999
    • (1999) IEEE Transactions on Speech and Audio Processing , vol.7 , Issue.3 , pp. 272-281
    • Gales, M.J.F.1
  • 16
    • 0036567851
    • The LIMSI broadcast news transcription system
    • J.L. Gauvain, L. Lamel, & G. Adda, "The LIMSI broadcast news transcription system", Speech Communication, vol. 37, no. 1, pp. 89-108, 2002
    • (2002) Speech Communication , vol.37 , Issue.1 , pp. 89-108
    • Gauvain, J.L.1    Lamel, L.2    Adda, G.3
  • 18
    • 51449103447
    • Optimizing bottle-neck features for LVCSR
    • Las Vegas
    • F. Grezl & P. Fousek, "Optimizing bottle-neck features for LVCSR", Proc. ICASSP, Las Vegas, 2008
    • (2008) Proc. ICASSP
    • Grezl, F.1    Fousek, P.2
  • 20
    • 84959162419
    • I-vector estimation using informative priors for adaptation of deep neural networks
    • Dresden
    • P. Karanasou, M.J.F. Gales, & P.C. Woodland, "I-vector estimation using informative priors for adaptation of deep neural networks", Proc. Interspeech, Dresden, 2015
    • (2015) Proc. Interspeech
    • Karanasou, P.1    Gales, M.J.F.2    Woodland, P.C.3
  • 22
    • 70349213445
    • Lattice-based optimization of sequence classification criteria for neural-network acoustic modeling
    • Taipei
    • B. Kingsbury, "Lattice-based optimization of sequence classification criteria for neural-network acoustic modeling", Proc. ICASSP, Taipei, 2009
    • (2009) Proc. ICASSP
    • Kingsbury, B.1
  • 24
    • 0036460908
    • Lightly supervised and unsupervised acoustic model training
    • L. Lamel, J.L. Gauvain, & G. Adda, "Lightly supervised and unsupervised acoustic model training", Computer Speech & Language, vol. 16, no. 1, pp. 115-129, 2002
    • (2002) Computer Speech & Language , vol.16 , Issue.1 , pp. 115-129
    • Lamel, L.1    Gauvain, J.L.2    Adda, G.3
  • 26
    • 0141703325
    • Automatic complexity control for HLDA systems
    • Hong Kong
    • X. Liu, M.J.F. Gales, & P.C. Woodland, "Automatic complexity control for HLDA systems", Proc. ICASSP, Hong Kong, 2003
    • (2003) Proc. ICASSP
    • Liu, X.1    Gales, M.J.F.2    Woodland, P.C.3
  • 27
    • 84905240726
    • Efficient lattice rescoring using recurrent neural network language models
    • Florence
    • X. Liu, Y. Wang, X. Chen, M.J.F. Gales, & P.C. Woodland, "Efficient lattice rescoring using recurrent neural network language models", Proc. ICASSP, Florence, 2014
    • (2014) Proc. ICASSP
    • Liu, X.1    Wang, Y.2    Chen, X.3    Gales, M.J.F.4    Woodland, P.C.5
  • 28
    • 84959109976
    • The Cambridge University 2014 BOLT conversational telephone Mandarin Chinese LVCSR system for speech translation
    • Dresden
    • X. Liu, F. Flego, L. Wang, C. Zhang, M.J.F. Gales, & P.C. Woodland, "The Cambridge University 2014 BOLT conversational telephone Mandarin Chinese LVCSR system for speech translation", Proc. Interspeech, Dresden, 2015
    • (2015) Proc. Interspeech
    • Liu, X.1    Flego, F.2    Wang, L.3    Zhang, C.4    Gales, M.J.F.5    Woodland, P.C.6
  • 29
    • 0034296009
    • Finding consensus in speech recognition: Word error minimization and other applications of confusion networks
    • L. Mangu, E. Brill, A. Stolcke, "Finding consensus in speech recognition: word error minimization and other applications of confusion networks", Computer Speech and Language, Vol. 14, No. 4, pp. 373-400, 2000
    • (2000) Computer Speech and Language , vol.14 , Issue.4 , pp. 373-400
    • Mangu, L.1    Brill, E.2    Stolcke, A.3
  • 32
    • 0036296863
    • Minimum phone error and I-smoothing for improved discriminative training
    • Orlando
    • D. Povey & P.C. Woodland, "Minimum phone error and I-smoothing for improved discriminative training", Proc. ICASSP, Orlando, 2002
    • (2002) Proc. ICASSP
    • Povey, D.1    Woodland, P.C.2
  • 34
    • 70450180978
    • Robust LTS rules with the Combilex speech technology lexicon
    • Brighton
    • K. Richmond, R. Clark, & S. Fitt, "Robust LTS rules with the Combilex speech technology lexicon", Proc. Interspeech, Brighton, 2009
    • (2009) Proc. Interspeech
    • Richmond, K.1    Clark, R.2    Fitt, S.3
  • 35
    • 79959836077
    • On generating Combilex pronunciations via morphological analysis
    • Makuhari, Japan
    • K. Richmond, R. Clark, & S. Fitt, "On generating Combilex pronunciations via morphological analysis", Proc. Interspeech, Makuhari, Japan, 2010
    • (2010) Proc. Interspeech
    • Richmond, K.1    Clark, R.2    Fitt, S.3
  • 36
    • 84910046405
    • Long short-term memory recurrent neural network architectures for large scale acoustic modeling
    • Singapore
    • H. Sak, A. Senior, & F. Beaufays, "Long short-term memory recurrent neural network architectures for large scale acoustic modeling", Proc. Interspeech, Singapore, 2014
    • (2014) Proc. Interspeech
    • Sak, H.1    Senior, A.2    Beaufays, F.3
  • 38
    • 84946037134
    • Convolutional, long short-term memory, fully connected deep neural networks
    • Brisbane
    • T.N. Sainath, O. Vinyals, A. Senior, & H. Sak, "Convolutional, long short-term memory, fully connected deep neural networks", Proc. ICASSP, Brisbane, 2015
    • (2015) Proc. ICASSP
    • Sainath, T.N.1    Vinyals, O.2    Senior, A.3    Sak, H.4
  • 39
    • 84890446559
    • Feature engineering in context-dependent deep neural networks
    • Hawaii
    • F. Seide, G. Li, X. Chen, & D. Yu, "Feature engineering in context-dependent deep neural networks", Proc. ASRU Workshop, Hawaii, 2011
    • (2011) Proc. ASRU Workshop
    • Seide, F.1    Li, G.2    Chen, X.3    Yu, D.4
  • 40
    • 84906240855
    • Prefix tree based n-best list re-scoring for recurrent neural network language model used in speech recognition system
    • Lyon
    • Y. Si, Q. Zhang, T. Li, J. Pan, & Y. Yan, "Prefix tree based n-best list re-scoring for recurrent neural network language model used in speech recognition system", Proc. Interspeech, Lyon, 2013
    • (2013) Proc. Interspeech
    • Si, Y.1    Zhang, Q.2    Li, T.3    Pan, J.4    Yan, Y.5
  • 43
    • 84891308106
    • SRILM: An extensible language modeling toolkit
    • Denver
    • A. Stolcke, "SRILM: An extensible language modeling toolkit", Proc. ICSLP, Denver, 2002
    • (2002) Proc. ICSLP
    • Stolcke, A.1
  • 44
    • 84890492591
    • Revisiting hybrid and GMM-HMM system combination techniques
    • Vancouver
    • P. Swietojanski, A. Ghoshal, & S. Renals, "Revisiting hybrid and GMM-HMM system combination techniques", Proc. ICASSP, Vancouver, 2013
    • (2013) Proc. ICASSP
    • Swietojanski, P.1    Ghoshal, A.2    Renals, S.3
  • 45
    • 84983119674
    • Learning hidden unit contributions for unsupervised speaker adaptation of neural network acoustic models
    • Lake Tahoe
    • P. Swietojanski & S. Renals, "Learning hidden unit contributions for unsupervised speaker adaptation of neural network acoustic models", Proc. IWSLT, Lake Tahoe, 2014
    • (2014) Proc. IWSLT
    • Swietojanski, P.1    Renals, S.2
  • 46
    • 84946032695
    • Differentiable pooling for unsupervised speaker adaptation
    • Brisbane
    • P. Swietojanski & S. Renals, "Differentiable pooling for unsupervised speaker adaptation", Proc. ICASSP, Brisbane, 2015
    • (2015) Proc. ICASSP
    • Swietojanski, P.1    Renals, S.2
  • 49
    • 0036567794
    • The development of the HTK broadcast news transcription system: An overview
    • P.C. Woodland, "The development of the HTK broadcast news transcription system: An overview", Speech Communication, vol. 37, no. 1, pp. 47-67, 2002
    • (2002) Speech Communication , vol.37 , Issue.1 , pp. 47-67
    • Woodland, P.C.1
  • 50
    • 79953250475
    • Minimum Bayes risk decoding and system combination based on a recursion for edit distance
    • H. Xu, D. Povey, L. Mangu, & J. Zhu, "Minimum Bayes risk decoding and system combination based on a recursion for edit distance", Computer Speech & Language, vol. 25, no. 4, pp. 802-828, 2011
    • (2011) Computer Speech & Language , vol.25 , Issue.4 , pp. 802-828
    • Xu, H.1    Povey, D.2    Mangu, L.3    Zhu, J.4
  • 52
    • 84923929378
    • Fuse deep neural network and Gaussian mixture model systems
    • Springer, London
    • D. Yu & L. Deng, "Fuse deep neural network and Gaussian mixture model systems", Automatic Speech Recognition: A Deep Learning Approach, pp. 177-191. Springer, London, 2015
    • (2015) Automatic Speech Recognition: A Deep Learning Approach , pp. 177-191
    • Yu, D.1    Deng, L.2
  • 53
    • 84959142742
    • A general artificial neural network extension for HTK
    • Dresden
    • C. Zhang & P.C. Woodland, "A general artificial neural network extension for HTK", Proc. Interspeech, Dresden, 2015
    • (2015) Proc. Interspeech
    • Zhang, C.1    Woodland, P.C.2
  • 54
    • 84959174678
    • Parameterised sigmoid and ReLU hidden activation functions for DNN acoustic modelling
    • Dresden
    • C. Zhang & P.C. Woodland, "Parameterised sigmoid and ReLU hidden activation functions for DNN acoustic modelling", Proc. Interspeech, Dresden, 2015
    • (2015) Proc. Interspeech
    • Zhang, C.1    Woodland, P.C.2
  • 55
    • 84946061232
    • Investigating online low-footprint speaker adaptation using generalized linear regression and click-through data
    • Brisbane
    • Y. Zhao, J.-Y. Li, J. Xue, & Y.-F. Gong, "Investigating online low-footprint speaker adaptation using generalized linear regression and click-through data", Proc. ICASSP, Brisbane, 2015
    • (2015) Proc. ICASSP
    • Zhao, Y.1    Li, J.-Y.2    Xue, J.3    Gong, Y.-F.4


* This information was extracted by KISTI through analysis of Elsevier's SCOPUS database.