Volume , Issue , 2014, Pages 1214-1218

Beyond cross-entropy: Towards better frame-level objective functions for deep neural network training in automatic speech recognition

Author keywords

Boosting difficult samples; Cross entropy; Deep neural network; Log posterior ratio

Indexed keywords

CHEMICAL ACTIVATION; CONTINUOUS SPEECH RECOGNITION; NEURAL NETWORKS; SPEECH COMMUNICATION

EID: 84910061470     PISSN: 2308457X     EISSN: 19909772     Source Type: Conference Proceeding    
DOI: None     Document Type: Conference Paper
Times cited: 14

References (27)
  • 2
    • 84865801985
    • Conversational speech transcription using context-dependent deep neural networks
    • F. Seide, G. Li, and D. Yu, "Conversational speech transcription using context-dependent deep neural networks," in Proc. Interspeech, 2011, pp. 437-440.
    • (2011) Proc. Interspeech, pp. 437-440
    • Seide, F.; Li, G.; Yu, D.
  • 3
    • 84858972572
    • Making deep belief networks effective for large vocabulary continuous speech recognition
    • T. N. Sainath, B. Kingsbury, B. Ramabhadran, P. Fousek, P. Novak, and A. Mohamed, "Making deep belief networks effective for large vocabulary continuous speech recognition," in Proc. ASRU, 2011, pp. 30-35.
    • (2011) Proc. ASRU, pp. 30-35
    • Sainath, T.N.; Kingsbury, B.; Ramabhadran, B.; Fousek, P.; Novak, P.; Mohamed, A.
  • 4
    • 84055222005
    • Context-dependent pre-trained deep neural networks for large-vocabulary speech recognition
    • G. E. Dahl, D. Yu, L. Deng, and A. Acero, "Context-dependent pre-trained deep neural networks for large-vocabulary speech recognition," IEEE Trans. Audio, Speech, and Language Processing, vol. 20, no. 1, pp. 30-42, 2012.
    • (2012) IEEE Trans. Audio, Speech, and Language Processing, vol. 20, no. 1, pp. 30-42
    • Dahl, G.E.; Yu, D.; Deng, L.; Acero, A.
  • 6
    • 84878379108
    • Scalable minimum Bayes risk training of deep neural network acoustic models using distributed Hessian-free optimization
    • B. Kingsbury, T. N. Sainath, and H. Soltau, "Scalable minimum Bayes risk training of deep neural network acoustic models using distributed Hessian-free optimization," in Proc. Interspeech, 2012.
    • (2012) Proc. Interspeech
    • Kingsbury, B.; Sainath, T.N.; Soltau, H.
  • 7
    • 84906274730
    • Sequence-discriminative training of deep neural networks
    • K. Vesely, A. Ghoshal, L. Burget, and D. Povey, "Sequence-discriminative training of deep neural networks," in Proc. Interspeech, 2013, pp. 2345-2349.
    • (2013) Proc. Interspeech, pp. 2345-2349
    • Vesely, K.; Ghoshal, A.; Burget, L.; Povey, D.
  • 8
    • 84906225757
    • A scalable approach to using DNN-derived features in GMM-HMM based acoustic modeling for LVCSR
    • Z. Yan, Q. Huo, and J. Xu, "A scalable approach to using DNN-derived features in GMM-HMM based acoustic modeling for LVCSR," in Proc. Interspeech, 2013.
    • (2013) Proc. Interspeech
    • Yan, Z.; Huo, Q.; Xu, J.
  • 9
    • 84951490428
    • Review of neural networks for speech recognition
    • R. P. Lippmann, "Review of neural networks for speech recognition," Neural Computation, vol. 1, no. 1, pp. 1-38, 1989.
    • (1989) Neural Computation, vol. 1, no. 1, pp. 1-38
    • Lippmann, R.P.
  • 11
    • 0027683813
    • Shared-distribution hidden Markov models for speech recognition
    • M.-Y. Hwang and X. Huang, "Shared-distribution hidden Markov models for speech recognition," IEEE Trans. Speech and Audio Processing, vol. 1, no. 4, pp. 414-420, 1993.
    • (1993) IEEE Trans. Speech and Audio Processing, vol. 1, no. 4, pp. 414-420
    • Hwang, M.-Y.; Huang, X.
  • 12
    • 84906237512
    • Investigations on Hessian-free optimization for cross-entropy training of deep neural networks
    • S. Wiesler, J. Li, and J. Xue, "Investigations on Hessian-free optimization for cross-entropy training of deep neural networks," in Proc. Interspeech, 2013.
    • (2013) Proc. Interspeech
    • Wiesler, S.; Li, J.; Xue, J.
  • 13
    • 84893676344
    • Rectifier nonlinearities improve neural network acoustic models
    • A. L. Maas, A. Y. Hannun, and A. Y. Ng, "Rectifier nonlinearities improve neural network acoustic models," in Proc. ICML, 2013.
    • (2013) Proc. ICML
    • Maas, A.L.; Hannun, A.Y.; Ng, A.Y.
  • 14
    • 84893651518
    • Deep maxout neural networks for speech recognition
    • M. Cai, Y. Shi, and J. Liu, "Deep maxout neural networks for speech recognition," in Proc. ASRU, 2013, pp. 291-296.
    • (2013) Proc. ASRU, pp. 291-296
    • Cai, M.; Shi, Y.; Liu, J.
  • 15
    • 84905270524
    • Investigation of maxout networks for speech recognition
    • P. Swietojanski, J. Li, and J. T. Huang, "Investigation of maxout networks for speech recognition," in Proc. ICASSP, 2014.
    • (2014) Proc. ICASSP
    • Swietojanski, P.; Li, J.; Huang, J.T.
  • 16
    • 84890542079
    • KL-divergence regularized deep neural network adaptation for improved large vocabulary speech recognition
    • D. Yu, K. Yao, H. Su, G. Li, and F. Seide, "KL-divergence regularized deep neural network adaptation for improved large vocabulary speech recognition," in Proc. ICASSP, 2013, pp. 7893-7897.
    • (2013) Proc. ICASSP, pp. 7893-7897
    • Yu, D.; Yao, K.; Su, H.; Li, G.; Seide, F.
  • 18
    • 14344259207
    • Solving large scale linear prediction problems using stochastic gradient descent algorithms
    • T. Zhang, "Solving large scale linear prediction problems using stochastic gradient descent algorithms," in Proc. ICML, 2004, pp. 919-926.
    • (2004) Proc. ICML, pp. 919-926
    • Zhang, T.
  • 20
    • 79961226155
    • The difficulty of training deep architectures and the effect of unsupervised pre-training
    • D. Erhan, P.-A. Manzagol, Y. Bengio, S. Bengio, and P. Vincent, "The difficulty of training deep architectures and the effect of unsupervised pre-training," in Proc. AISTATS, 2009, pp. 153-160.
    • (2009) Proc. AISTATS, pp. 153-160
    • Erhan, D.; Manzagol, P.-A.; Bengio, Y.; Bengio, S.; Vincent, P.
  • 21
    • 33746600649
    • Reducing the dimensionality of data with neural networks
    • G. E. Hinton and R. R. Salakhutdinov, "Reducing the dimensionality of data with neural networks," Science, vol. 313, no. 5786, pp. 504-507, 2006.
    • (2006) Science, vol. 313, no. 5786, pp. 504-507
    • Hinton, G.E.; Salakhutdinov, R.R.
  • 22
    • 0031139839
    • Minimum classification error rate methods for speech recognition
    • B. H. Juang, W. Chou, and C.-H. Lee, "Minimum classification error rate methods for speech recognition," IEEE Trans. Speech and Audio Processing, vol. 5, no. 3, pp. 257-265, 1997.
    • (1997) IEEE Trans. Speech and Audio Processing, vol. 5, no. 3, pp. 257-265
    • Juang, B.H.; Chou, W.; Lee, C.-H.
  • 23
    • 34547506259
    • Soft margin estimation of hidden Markov model parameters
    • J. Li, M. Yuan, and C.-H. Lee, "Soft margin estimation of hidden Markov model parameters," in Proc. Interspeech, 2006.
    • (2006) Proc. Interspeech
    • Li, J.; Yuan, M.; Lee, C.-H.
  • 24
    • 64149098818
    • Approximate test risk bound minimization through soft margin estimation
    • J. Li, M. Yuan, and C.-H. Lee, "Approximate test risk bound minimization through soft margin estimation," IEEE Trans. Audio, Speech, and Language Processing, vol. 15, no. 8, pp. 2393-2404, 2007.
    • (2007) IEEE Trans. Audio, Speech, and Language Processing, vol. 15, no. 8, pp. 2393-2404
    • Li, J.; Yuan, M.; Lee, C.-H.
  • 27


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.