Volume 2015-January, 2015, Pages 11-15

Analysis of CNN-based speech recognition system using raw speech as input

Author keywords

Automatic speech recognition; Convolutional neural networks; Raw signal; Robust speech recognition

Indexed keywords

CONVOLUTION; FEATURE EXTRACTION; NEURAL NETWORKS; SPEECH; SPEECH COMMUNICATION; TELEPHONE SETS

EID: 84955059475     PISSN: 2308457X     EISSN: 19909772     Source Type: Conference Proceeding    
DOI: None     Document Type: Conference Paper
Times cited: 169

References (27)
  • 2
    • 33745805403
    • A fast learning algorithm for deep belief nets
    • G. E. Hinton, S. Osindero, and Y. W. Teh, "A fast learning algorithm for deep belief nets," Neural Computation, vol. 18, no. 7, pp. 1527-1554, 2006.
    • (2006) Neural Computation, vol. 18, Issue 7, pp. 1527-1554
    • Hinton, G.E.; Osindero, S.; Teh, Y.W.
  • 4
    • 84865801985
    • Conversational speech transcription using context-dependent deep neural networks
    • F. Seide, G. Li, and D. Yu, "Conversational speech transcription using context-dependent deep neural networks," in Proc. of Interspeech, 2011, pp. 437-440.
    • (2011) Proc. of Interspeech, pp. 437-440
    • Seide, F.; Li, G.; Yu, D.
  • 5
    • 84055222005
    • Context-dependent pre-trained deep neural networks for large-vocabulary speech recognition
    • G. E. Dahl, D. Yu, L. Deng, and A. Acero, "Context-dependent pre-trained deep neural networks for large-vocabulary speech recognition," IEEE Transactions on Audio, Speech, and Language Processing, vol. 20, no. 1, pp. 30-42, 2012.
    • (2012) IEEE Transactions on Audio, Speech, and Language Processing, vol. 20, Issue 1, pp. 30-42
    • Dahl, G.E.; Yu, D.; Deng, L.; Acero, A.
  • 7
    • 84863380535
    • Unsupervised feature learning for audio classification using convolutional deep belief networks
    • H. Lee, P. Pham, Y. Largman, and A. Y. Ng, "Unsupervised feature learning for audio classification using convolutional deep belief networks," in Advances in Neural Information Processing Systems 22, 2009, pp. 1096-1104.
    • (2009) Advances in Neural Information Processing Systems, vol. 22, pp. 1096-1104
    • Lee, H.; Pham, P.; Largman, Y.; Ng, A.Y.
  • 8
    • 84867605836
    • Applying convolutional neural networks concepts to hybrid NN-HMM model for speech recognition
    • O. Abdel-Hamid, A. Mohamed, H. Jiang, and G. Penn, "Applying convolutional neural networks concepts to hybrid NN-HMM model for speech recognition," in Proc. of ICASSP, 2012, pp. 4277-4280.
    • (2012) Proc. of ICASSP, pp. 4277-4280
    • Abdel-Hamid, O.; Mohamed, A.; Jiang, H.; Penn, G.
  • 10
    • 84901999583
    • Convolutional neural networks for distant speech recognition
    • P. Swietojanski, A. Ghoshal, and S. Renals, "Convolutional neural networks for distant speech recognition," Signal Processing Letters, IEEE, vol. 21, no. 9, pp. 1120-1124, September 2014.
    • (2014) Signal Processing Letters, IEEE, vol. 21, Issue 9, pp. 1120-1124
    • Swietojanski, P.; Ghoshal, A.; Renals, S.
  • 11
    • 84890543873
    • Investigating deep neural network based transforms of robust audio features for LVCSR
    • E. Bocchieri and D. Dimitriadis, "Investigating deep neural network based transforms of robust audio features for LVCSR," in Proc. of ICASSP, 2013, pp. 6709-6713.
    • (2013) Proc. of ICASSP, pp. 6709-6713
    • Bocchieri, E.; Dimitriadis, D.
  • 12
    • 80051609011
    • Learning a better representation of speech soundwaves using restricted Boltzmann machines
    • N. Jaitly and G. Hinton, "Learning a better representation of speech soundwaves using restricted Boltzmann machines," in Proc. of ICASSP, 2011, pp. 5884-5887.
    • (2011) Proc. of ICASSP, pp. 5884-5887
    • Jaitly, N.; Hinton, G.
  • 13
    • 84893688455
    • Learning filter banks within a deep neural network framework
    • T. Sainath, B. Kingsbury, A.-R. Mohamed, and B. Ramabhadran, "Learning filter banks within a deep neural network framework," in Proc. of ASRU, Dec. 2013, pp. 297-302.
    • (2013) Proc. of ASRU, pp. 297-302
    • Sainath, T.; Kingsbury, B.; Mohamed, A.-R.; Ramabhadran, B.
  • 14
    • 84910065702
    • Acoustic modeling with deep neural networks using raw time signal for LVCSR
    • Z. Tüske, P. Golik, R. Schlüter, and H. Ney, "Acoustic modeling with deep neural networks using raw time signal for LVCSR," in Proc. of Interspeech, Singapore, Sep. 2014, pp. 890-894.
    • (2014) Proc. of Interspeech, pp. 890-894
    • Tüske, Z.; Golik, P.; Schlüter, R.; Ney, H.
  • 16
    • 84906273908
    • Estimating phoneme class conditional probabilities from raw speech signal using convolutional neural networks
    • D. Palaz, R. Collobert, and M. Magimai.-Doss, "Estimating phoneme class conditional probabilities from raw speech signal using convolutional neural networks," in Proc. of Interspeech, 2013.
    • (2013) Proc. of Interspeech
    • Palaz, D.; Collobert, R.; Magimai-Doss, M.
  • 17
    • 84946023646
    • Convolutional neural networks-based continuous speech recognition using raw speech signal
    • D. Palaz, M. Magimai.-Doss, and R. Collobert, "Convolutional neural networks-based continuous speech recognition using raw speech signal," in Proc. of ICASSP, April 2015.
    • (2015) Proc. of ICASSP
    • Palaz, D.; Magimai-Doss, M.; Collobert, R.
  • 18
    • 0002291365
    • Generalization and network design strategies
    • Y. LeCun, "Generalization and network design strategies," in Connectionism in Perspective, R. Pfeifer, Z. Schreter, F. Fogelman, and L. Steels, Eds. Zurich, Switzerland: Elsevier, 1989.
    • (1989) Connectionism in Perspective
    • LeCun, Y.
  • 19
    • 0000583248
    • Probabilistic interpretation of feedforward classification network outputs, with relationships to statistical pattern recognition
    • J. Bridle, "Probabilistic interpretation of feedforward classification network outputs, with relationships to statistical pattern recognition," in Neurocomputing: Algorithms, Architectures and Applications, 1990, pp. 227-236.
    • (1990) Neurocomputing: Algorithms, Architectures and Applications, pp. 227-236
    • Bridle, J.
  • 20
    • 33847215211
    • Stochastic gradient learning in neural networks
    • L. Bottou, "Stochastic gradient learning in neural networks," in Proceedings of Neuro-Nîmes 91. Nimes, France: EC2, 1991.
    • (1991) Proceedings of Neuro-Nîmes 91
    • Bottou, L.
  • 21
  • 22
    • 0027623210
    • Assessment for automatic speech recognition: II. NOISEX-92: A database and an experiment to study the effect of additive noise on speech recognition systems
    • A. Varga and H. J. Steeneken, "Assessment for automatic speech recognition: II. NOISEX-92: A database and an experiment to study the effect of additive noise on speech recognition systems," Speech Communication, vol. 12, no. 3, pp. 247-251, 1993.
    • (1993) Speech Communication, vol. 12, Issue 3, pp. 247-251
    • Varga, A.; Steeneken, H.J.


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.