Volume 08-12-September-2016, 2016, Pages 7-11

The IBM 2016 English conversational telephone speech recognition system

Author keywords

Conversational speech recognition; Convolutional neural networks; Recurrent neural networks

Indexed keywords

COMPUTATIONAL LINGUISTICS; CONVOLUTION; MODELING LANGUAGES; NEURAL NETWORKS; RECURRENT NEURAL NETWORKS; SPEECH COMMUNICATION; SPEECH PROCESSING; TELEPHONE SETS;

EID: 84994201246     PISSN: 2308457X     EISSN: 19909772     Source Type: Conference Proceeding    
DOI: 10.21437/Interspeech.2016-1460     Document Type: Conference Paper
Times cited: 85

References (27)
  • 1
    • 84858976070
    • Feature engineering in context-dependent deep neural networks for conversational speech transcription
    • F. Seide, G. Li, X. Chen, and D. Yu, "Feature engineering in context-dependent deep neural networks for conversational speech transcription," in Proc. ASRU, 2011.
    • (2011) Proc. ASRU
    • Seide, F.1    Li, G.2    Chen, X.3    Yu, D.4
  • 2
    • 84906214784
    • Exploring convolutional neural network structures and optimization techniques for speech recognition
    • O. Abdel-Hamid, L. Deng, and D. Yu, "Exploring convolutional neural network structures and optimization techniques for speech recognition," in INTERSPEECH, 2013, pp. 3366-3370.
    • (2013) INTERSPEECH , pp. 3366-3370
    • Abdel-Hamid, O.1    Deng, L.2    Yu, D.3
  • 5
    • 84959115289
    • A time delay neural network architecture for efficient modeling of long temporal contexts
    • V. Peddinti, D. Povey, and S. Khudanpur, "A time delay neural network architecture for efficient modeling of long temporal contexts," in Proceedings of INTERSPEECH, 2015.
    • (2015) Proceedings of INTERSPEECH
    • Peddinti, V.1    Povey, D.2    Khudanpur, S.3
  • 13
    • 84878379108
    • Scalable minimum Bayes risk training of deep neural network acoustic models using distributed Hessian-free optimization
    • B. Kingsbury, T. Sainath, and H. Soltau, "Scalable minimum Bayes risk training of deep neural network acoustic models using distributed Hessian-free optimization," in Proc. Interspeech, 2012.
    • (2012) Proc. Interspeech
    • Kingsbury, B.1    Sainath, T.2    Soltau, H.3
  • 16
    • 84978755117
    • Very deep convolutional networks for large-scale image recognition
    • arXiv:1409.1556
    • K. Simonyan and A. Zisserman, "Very deep convolutional networks for large-scale image recognition," CoRR arXiv:1409.1556, 2014.
    • (2014) CoRR
    • Simonyan, K.1    Zisserman, A.2
  • 17
    • 84905265980
    • Joint training of convolutional and non-convolutional neural networks
    • H. Soltau, G. Saon, and T. N. Sainath, "Joint training of convolutional and non-convolutional neural networks," in Proc. ICASSP, 2014.
    • (2014) Proc. ICASSP
    • Soltau, H.1    Saon, G.2    Sainath, T.N.3
  • 19
    • 84890543852
    • Error back propagation for sequence training of context-dependent deep networks for conversational speech transcription
    • H. Su, G. Li, D. Yu, and F. Seide, "Error back propagation for sequence training of context-dependent deep networks for conversational speech transcription," in Proc. ICASSP, 2013.
    • (2013) Proc. ICASSP
    • Su, H.1    Li, G.2    Yu, D.3    Seide, F.4
  • 20
    • 0033329799
    • An empirical study of smoothing techniques for language modeling
    • S. F. Chen and J. Goodman, "An empirical study of smoothing techniques for language modeling," Computer Speech & Language, vol. 13, no. 4, pp. 359-393, 1999.
    • (1999) Computer Speech & Language , vol.13 , Issue.4 , pp. 359-393
    • Chen, S.F.1    Goodman, J.2
  • 22
    • 84863387613
    • Shrinking exponential language models
    • S. F. Chen, "Shrinking exponential language models," in Proc. NAACL-HLT, 2009, pp. 468-476.
    • (2009) Proc. NAACL-HLT , pp. 468-476
    • Chen, S.F.1
  • 24
    • 85055309630
    • Ph.D. dissertation, Johns Hopkins University, Baltimore, MD, USA
    • A. Emami, "A neural syntactic language model," Ph.D. dissertation, Johns Hopkins University, Baltimore, MD, USA, 2006.
    • (2006) A Neural Syntactic Language Model
    • Emami, A.1
  • 25
    • 33847610331
    • Continuous space language models
    • H. Schwenk, "Continuous space language models," Computer Speech & Language, vol. 21, no. 3, pp. 492-518, 2007.
    • (2007) Computer Speech & Language , vol.21 , Issue.3 , pp. 492-518
    • Schwenk, H.1
  • 26
    • 44849092930
    • Empirical study of neural network language models for Arabic speech recognition
    • A. Emami and L. Mangu, "Empirical study of neural network language models for Arabic speech recognition," in Proc. ASRU, 2007, pp. 147-152.
    • (2007) Proc. ASRU , pp. 147-152
    • Emami, A.1    Mangu, L.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.