Volume 2015-January, 2015, Pages 1888-1892

Time-frequency kernel-based CNN for speech recognition

Author keywords

Convolutional neural network; Robust speech recognition; Time frequency kernels

Indexed keywords

CONVOLUTION; FREQUENCY DOMAIN ANALYSIS; NEURAL NETWORKS; SPEECH; SPEECH COMMUNICATION; TELEPHONE SETS

EID: 84959087712; PISSN: 2308457X; EISSN: 19909772; Source Type: Conference Proceeding
DOI: None; Document Type: Conference Paper
Times cited: 12

References (20)
  • 2
    • 84890478854
    • Multiframe deep neural networks for acoustic modeling
    • Vanhoucke, V., Devin, M. and Heigold, G., "Multiframe deep neural networks for acoustic modeling", in Proc. ICASSP, 7582-7585, 2013.
    • (2013) Proc. ICASSP, pp. 7582-7585
    • Vanhoucke, V.; Devin, M.; Heigold, G.
  • 3
    • 84892184434
    • Perceptual processing of speech and other perceptual patterns: Some similarities and differences
    • Warren, R. M., "Perceptual processing of speech and other perceptual patterns: Some similarities and differences", in Greenberg S. and Ainsworth, W., Ed. Listening to Speech: An Auditory Perspective, Oxford University Press, 1998.
    • (1998) Listening to Speech: An Auditory Perspective
    • Warren, R. M.
  • 7
    • 84906214784
    • Exploring convolutional neural network structures and optimization techniques for speech recognition
    • Abdel-Hamid, O., Deng, L. and Yu, D., "Exploring convolutional neural network structures and optimization techniques for speech recognition", in Proc. Interspeech, 3366-3370, 2013.
    • (2013) Proc. Interspeech, pp. 3366-3370
    • Abdel-Hamid, O.; Deng, L.; Yu, D.
  • 8
    • 84906276981
    • Convolutional deep rectifier neural nets for phone recognition
    • Toth, L., "Convolutional deep rectifier neural nets for phone recognition", in Proc. Interspeech, 1722-1726, 2013.
    • (2013) Proc. Interspeech, pp. 1722-1726
    • Toth, L.
  • 10
    • 84905252069
    • Combining time- and frequency-domain convolution in convolutional neural network-based phone recognition
    • Toth, L., "Combining time- and frequency-domain convolution in convolutional neural network-based phone recognition", in Proc. ICASSP, 190-194, 2014.
    • (2014) Proc. ICASSP, pp. 190-194
    • Toth, L.
  • 11
    • 0028516073
    • How do humans process and recognise speech
    • Allen, J., "How Do Humans Process and Recognise Speech", IEEE Trans. Speech and Audio Proc., 2 (4): 567-577, 1994.
    • (1994) IEEE Trans. Speech and Audio Proc., vol. 2, issue 4, pp. 567-577
    • Allen, J.
  • 12
    • 84892186467
    • Incorporating information from syllable-length time scales into automatic speech recognition
    • Wu S., Kingsbury B., Mongan N. and Greenberg S., "Incorporating information from syllable-length time scales into automatic speech recognition", in Proc. ICASSP, 721-724, 1998.
    • (1998) Proc. ICASSP, pp. 721-724
    • Wu, S.; Kingsbury, B.; Mongan, N.; Greenberg, S.
  • 13
    • 0031643048
    • Multi-resolution cepstral features for phoneme recognition across speech sub-bands
    • McCourt, P., Vaseghi, S. and Harte, N., "Multi-resolution cepstral features for phoneme recognition across speech sub-bands", in Proc. ICASSP, 557-560, 1998.
    • (1998) Proc. ICASSP, pp. 557-560
    • McCourt, P.; Vaseghi, S.; Harte, N.
  • 14
    • 84959136712
    • "The Computational Network Toolkit (CNTK)", Microsoft Corporation, Redmond, WA, USA. Online: https://cntk.codeplex.com/SourceControl/latest, accessed on 04 Mar. 2015.
    • (2015) The Computational Network Toolkit (CNTK)
  • 15
    • 84976206655
    • https://catalog.ldc.upenn.edu/docs/LDC96S32/FFMTIMIT.TXT
  • 16
    • 0002263996
    • Convolutional networks for images, speech and time series
    • LeCun, Y. and Bengio Y., "Convolutional networks for images, speech and time series", in Arbib, M. A., Ed., The Handbook of Brain Theory and Neural Networks, MIT Press, 255-258, 1995.
    • (1995) The Handbook of Brain Theory and Neural Networks
    • LeCun, Y.; Bengio, Y.
  • 17
    • 84867605836
    • Applying convolutional neural network concepts to hybrid NN-HMM models for speech recognition
    • Abdel-Hamid, O., Mohamed, A., Jiang, H. and Penn, G., "Applying convolutional neural network concepts to hybrid NN-HMM models for speech recognition", in Proc. ICASSP, 4277-4280, 2012.
    • (2012) Proc. ICASSP, pp. 4277-4280
    • Abdel-Hamid, O.; Mohamed, A.; Jiang, H.; Penn, G.
  • 18
    • 0024768209
    • Speaker-independent phone recognition using hidden Markov models
    • Lee, K. and Hon, H., "Speaker-Independent Phone Recognition Using Hidden Markov Models", IEEE Trans. Audio, Speech, Signal Proc. 37 (11): 1641-1648, 1989.
    • (1989) IEEE Trans. Audio, Speech, Signal Proc., vol. 37, issue 11, pp. 1641-1648
    • Lee, K.; Hon, H.
  • 19
    • 84959111923
    • "The Hidden Markov Model Toolkit (HTK)", CUED Machine Intelligence Lab. Cambridge, UK. Online: http://htk.eng.cam.ac.uk/ftp/software/HTK-3.4.1.tar.gz, accessed on 28 Jun. 2013.
    • (2013) The Hidden Markov Model Toolkit (HTK)


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.