Volume 2015-August, 2015, Pages 4360-4364

Far-field speech recognition using CNN-DNN-HMM with convolution in time

Author keywords

convolutional neural network; deep neural network; Far field speech recognition; reverberation

Indexed keywords

AUDIO SIGNAL PROCESSING; CONVOLUTION; DEEP NEURAL NETWORKS; NEURAL NETWORKS; REVERBERATION; SPEECH; SPEECH COMMUNICATION;

EID: 84946020145     ISSN: 1520-6149     Source Type: Conference Proceeding
DOI: 10.1109/ICASSP.2015.7178794     Document Type: Conference Paper
Times cited: 29

References (30)
  • 1
    • N. Morgan and H. Bourlard, Continuous speech recognition: an introduction to the hybrid HMM/connectionist approach, IEEE Signal Process. Mag., vol. 12, no. 3, pp. 24-42, 1995
  • 2
    • G. E. Dahl, D. Yu, L. Deng, and A. Acero, Context-dependent pre-trained deep neural networks for large-vocabulary speech recognition, IEEE Trans. Audio, Speech, Language Process., vol. 20, no. 1, pp. 30-42, 2012
  • 4
    • T. Yoshioka and M. J. F. Gales, Environmentally robust ASR front-end for deep neural network acoustic models, Comp. Speech, Language, vol. 31, no. 1, pp. 65-86, 2015
  • 7
    • T. Yoshioka, A. Sehr, M. Delcroix, K. Kinoshita, R. Maas, T. Nakatani, and W. Kellermann, Making machines understand us in reverberant rooms: robustness against reverberation for automatic speech recognition, IEEE Signal Process. Mag., vol. 29, no. 6, pp. 114-126, 2012
  • 8
    • T. Yoshioka and T. Nakatani, Generalization of multi-channel linear prediction methods for blind MIMO impulse response shortening, IEEE Trans. Audio, Speech, Language Process., vol. 20, no. 10, pp. 2707-2720, 2012
  • 9
    • A. Krueger and R. Haeb-Umbach, Model-based feature enhancement for reverberant speech recognition, IEEE Trans. Audio, Speech, Language Process., vol. 18, no. 7, pp. 1692-1707, 2010
  • 11
    • A. Sehr, R. Maas, and W. Kellermann, Reverberation model-based decoding in the log-melspec domain for robust distant-talking speech recognition, IEEE Trans. Audio, Speech, Language Process., vol. 18, no. 7, pp. 1676-1691, 2010
  • 13
  • 14
    • K. Lebart, J. M. Boucher, and P. N. Denbigh, A new method based on spectral subtraction for speech dereverberation, Acta Acustica United with Acustica, vol. 87, pp. 359-366, 2001
  • 16
    • K. Kumar and R. Stern, Maximum-likelihood-based cepstral inverse filtering for blind speech dereverberation, in Proc. Int. Conf. Acoust., Speech, Signal Process., 2010, pp. 4282-4285
  • 18
    • L. Tóth, Combining time- and frequency-domain convolution in convolutional neural network-based phone recognition, in Proc. Int. Conf. Acoust., Speech, Signal Process., 2014, pp. 190-194
  • 19
    • O. Abdel-Hamid, L. Deng, and D. Yu, Exploring convolutional neural network structures and optimization techniques for speech recognition, in Proc. Interspeech, 2013, pp. 3366-3370
  • 21
    • P. Swietojanski, A. Ghoshal, and S. Renals, Convolutional neural networks for distant speech recognition, IEEE Signal Process. Letters, vol. 21, no. 9, pp. 1120-1124, 2014
  • 25
  • 30
    • F. Seide, G. Li, X. Chen, and D. Yu, Feature engineering in context-dependent deep neural networks for conversational speech transcription, in Proc. Workshop Automat. Speech Recognition, Understanding, 2011, pp. 24-29


* This information was extracted by KISTI through analysis of Elsevier's SCOPUS database.