Volume , Issue , 2016, Pages 317-323

Time-frequency convolutional networks for robust speech recognition

Author keywords

deep convolution networks; robust features; robust speech recognition; time frequency convolution nets

Indexed keywords

CONVOLUTION; REVERBERATION; SPEECH;

EID: 84964422542     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1109/ASRU.2015.7404811     Document Type: Conference Paper
Times cited: 46

References (28)
  • 1
    • 84055211743
    • Acoustic modeling using deep belief networks
    • A. Mohamed, G.E. Dahl, and G. Hinton, "Acoustic modeling using deep belief networks," IEEE Trans. on ASLP, vol. 20, no. 1, pp. 14-22, 2012
    • (2012) IEEE Trans. on ASLP , vol.20 , Issue.1 , pp. 14-22
    • Mohamed, A.1    Dahl, G.E.2    Hinton, G.3
  • 2
    • 84865801985
    • Conversational speech transcription using context-dependent deep neural networks
    • F. Seide, G. Li, and D. Yu, "Conversational speech transcription using context-dependent deep neural networks," Proc. of Interspeech, 2011
    • (2011) Proc. of Interspeech
    • Seide, F.1    Li, G.2    Yu, D.3
  • 3
    • 84910065702
    • Acoustic modeling with deep neural networks using raw time signal for LVCSR
    • Z. Tuske, P. Golik, R. Schluter, and H. Ney, "Acoustic modeling with deep neural networks using raw time signal for LVCSR," Proc. of Interspeech, 2014
    • (2014) Proc. of Interspeech
    • Tuske, Z.1    Golik, P.2    Schluter, R.3    Ney, H.4
  • 5
    • 84890492030
    • An investigation of deep neural networks for noise robust speech recognition
    • M. Seltzer, D. Yu, and Y. Wang, "An investigation of deep neural networks for noise robust speech recognition," Proc. of ICASSP, 2013
    • (2013) Proc of ICASSP
    • Seltzer, M.1    Yu, D.2    Wang, Y.3
  • 6
    • 84910075252
    • Evaluating robust features on deep neural networks for speech recognition in noisy and channel mismatched conditions
    • V. Mitra, W. Wang, H. Franco, Y. Lei, C. Bartels, and M. Graciarena, "Evaluating robust features on deep neural networks for speech recognition in noisy and channel mismatched conditions," in Proc. of Interspeech, 2014
    • (2014) Proc. of Interspeech
    • Mitra, V.1    Wang, W.2    Franco, H.3    Lei, Y.4    Bartels, C.5    Graciarena, M.6
  • 9
    • 84946693063
    • Deep convolutional nets and robust features for reverberation-robust speech recognition
    • V. Mitra, W. Wang, and H. Franco, "Deep convolutional nets and robust features for reverberation-robust speech recognition," in Proc. of SLT, pp. 548-553, 2014
    • (2014) Proc. of SLT , pp. 548-553
    • Mitra, V.1    Wang, W.2    Franco, H.3
  • 11
    • 84867605836
    • Applying convolutional neural networks concepts to hybrid NN-HMM model for speech recognition
    • O. Abdel-Hamid, A. Mohamed, H. Jiang, and G. Penn, "Applying convolutional neural networks concepts to hybrid NN-HMM model for speech recognition," Proc. of ICASSP, pp. 4277-4280, 2012
    • (2012) Proc. of ICASSP , pp. 4277-4280
    • Abdel-Hamid, O.1    Mohamed, A.2    Jiang, H.3    Penn, G.4
  • 12
    • 84858953286
    • Vocal tract length normalization for LVCSR
    • Carnegie Mellon University
    • P. Zhan and A. Waibel, "Vocal tract length normalization for LVCSR," in Tech. Rep. CMU-LTI-97-150, Carnegie Mellon University, 1997
    • (1997) Tech. Rep. CMU-LTI-97-150
    • Zhan, P.1    Waibel, A.2
  • 13
    • 84863380535
    • Unsupervised feature learning for audio classification using convolutional deep belief networks
    • H. Lee, P. Pham, Y. Largman, and A. Ng, "Unsupervised feature learning for audio classification using convolutional deep belief networks," Proc. of Adv. Neural Inf. Process. Syst. 22, pp. 1096-1104, 2009
    • (2009) Proc. of Adv. Neural Inf. Process. Syst , vol.22 , pp. 1096-1104
    • Lee, H.1    Pham, P.2    Largman, Y.3    Ng, A.4
  • 14
    • 85007207023
    • Exploring hierarchical speech representations using a deep convolutional neural network
    • D. Hau and K. Chen, "Exploring hierarchical speech representations using a deep convolutional neural network," Proc. of 11th UK Workshop Comput. Intell. (UKCI '11), 2011
    • (2011) Proc. of 11th UK Workshop Comput. Intell. (UKCI '11)
    • Hau, D.1    Chen, K.2
  • 16
    • 84964511330
    • Single channel blind dereverberation based on auto-correlation functions of frame-wise time sequences of frequency components
    • K. Ohta and M. Yanagida, "Single channel blind dereverberation based on auto-correlation functions of frame-wise time sequences of frequency components," Proc. of IWAENC, pp. 1-4, 2006
    • (2006) Proc. of IWAENC , pp. 1-4
    • Ohta, K.1    Yanagida, M.2
  • 18
    • 33646677283
    • Experimental framework for the performance evaluation of speech recognition front-ends on a large vocabulary task
    • June 4
    • G. Hirsch, "Experimental framework for the performance evaluation of speech recognition front-ends on a large vocabulary task," ETSI STQ-Aurora DSR Working Group, June 4, 2001
    • (2001) ETSI STQ-Aurora DSR Working Group
    • Hirsch, G.1
  • 19
    • 0028996854
    • WSJCAM0: A British English speech corpus for large vocabulary continuous speech recognition
    • T. Robinson, J. Fransen, D. Pye, J. Foote, and S. Renals, "WSJCAM0: A British English speech corpus for large vocabulary continuous speech recognition," Proc. ICASSP, pp. 81-84, 1995
    • (1995) Proc. ICASSP , pp. 81-84
    • Robinson, T.1    Fransen, J.2    Pye, D.3    Foote, J.4    Renals, S.5
  • 21
    • 84906260861
    • Damped oscillator cepstral coefficients for robust speech recognition
    • V. Mitra, H. Franco, and M. Graciarena, "Damped oscillator cepstral coefficients for robust speech recognition," Proc. of Interspeech, pp. 886-890, 2013
    • (2013) Proc. of Interspeech , pp. 886-890
    • Mitra, V.1    Franco, H.2    Graciarena, M.3
  • 22
    • 84867589420
    • Normalized amplitude modulation features for large vocabulary noise-robust speech recognition
    • V. Mitra, H. Franco, M. Graciarena, and A. Mandal, "Normalized amplitude modulation features for large vocabulary noise-robust speech recognition," Proc. of ICASSP, pp. 4117-4120, 2012
    • (2012) Proc. of ICASSP , pp. 4117-4120
    • Mitra, V.1    Franco, H.2    Graciarena, M.3    Mandal, A.4
  • 23
    • 0028287770
    • Effect of reducing slow temporal modulations on speech reception
    • R. Drullman, J. M. Festen, and R. Plomp, "Effect of reducing slow temporal modulations on speech reception," J. Acoust. Soc. of Am., vol. 95, no. 5, pp. 2670-2680, 1994
    • (1994) J. Acoust. Soc. of Am , vol.95 , Issue.5 , pp. 2670-2680
    • Drullman, R.1    Festen, J.M.2    Plomp, R.3
  • 24
    • 84905269267
    • Medium duration modulation cepstral feature for robust speech recognition
    • Florence
    • V. Mitra, H. Franco, M. Graciarena, and D. Vergyri, "Medium duration modulation cepstral feature for robust speech recognition," Proc. of ICASSP, Florence, 2014
    • (2014) Proc. of ICASSP
    • Mitra, V.1    Franco, H.2    Graciarena, M.3    Vergyri, D.4
  • 25
    • 0019075685
    • Some observations on oral air flow during phonation
    • H. Teager, "Some observations on oral air flow during phonation," in IEEE Trans. ASSP, pp. 599-601, 1980
    • (1980) IEEE Trans. ASSP , pp. 599-601
    • Teager, H.1
  • 27
    • 84964515036
    • The automatic speech recognition in reverberant environments (ASpIRE) challenge
    • M. Harper, "The Automatic Speech recognition In Reverberant Environments (ASpIRE) Challenge," Proc. of ASRU, 2015
    • (2015) Proc. of ASRU
    • Harper, M.1
  • 28
    • 84874226579
    • Adaptation of context-dependent deep neural networks for automatic speech recognition
    • K. Yao, D. Yu, F. Seide, H. Su, L. Deng, and Y. Gong, "Adaptation Of Context-Dependent Deep Neural Networks For Automatic Speech Recognition," Proc. of SLT, 2012
    • (2012) Proc. of SLT
    • Yao, K.1    Yu, D.2    Seide, F.3    Su, H.4    Deng, L.5    Gong, Y.6


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.