SCOPUS Information Retrieval Platform

Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH

Volume 2015-January, 2015, Pages 11-15

Analysis of CNN-based speech recognition system using raw speech as input

(3) Palaz, Dimitri a,b; Magimai Doss, Mathew a; Collobert, Ronan a,c

a IDIAP RESEARCH INSTITUTE (Switzerland)

b EPFL (Switzerland)

c FACEBOOK AI RESEARCH (United States)

Author keywords

Automatic speech recognition; Convolutional neural networks; Raw signal; Robust speech recognition

Indexed keywords

CONVOLUTION; FEATURE EXTRACTION; NEURAL NETWORKS; SPEECH; SPEECH COMMUNICATION; TELEPHONE SETS;

AUTOMATIC SPEECH RECOGNITION; AUTOMATIC SPEECH RECOGNITION SYSTEM; CLASSIFIER TRAINING; CONVOLUTIONAL NEURAL NETWORK; RAW SIGNALS; ROBUST SPEECH RECOGNITION; SPECTRAL ENVELOPES; SPEECH RECOGNITION SYSTEMS;

SPEECH RECOGNITION;

EID: 84955059475 PISSN: 2308457X EISSN: 19909772 Source Type: Conference Proceeding
DOI: None Document Type: Conference Paper

Times cited : (169)

References (27)

1
- 0003573244
- Springer
- H. Bourlard and N. Morgan, Connectionist speech recognition: A hybrid approach. Springer, 1994, vol. 247.
- (1994) Connectionist Speech Recognition: A Hybrid Approach , vol.247
- Bourlard, H.¹ Morgan, N.²

2
- 33745805403
- A fast learning algorithm for deep belief nets
- G. E. Hinton, S. Osindero, and Y. W. Teh, "A fast learning algorithm for deep belief nets," Neural computation, vol. 18, no. 7, pp. 1527-1554, 2006.
- (2006) Neural Computation , vol.18 , Issue.7 , pp. 1527-1554
- Hinton, G.E.¹ Osindero, S.² Teh, Y.W.³

3
- 85032751458
- Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups
- G. Hinton, L. Deng, D. Yu, G. E. Dahl, A. Mohamed, N. Jaitly, A. Senior, V. Vanhoucke, P. Nguyen, and T. N. Sainath, "Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups," Signal Processing Magazine, IEEE, vol. 29, no. 6, pp. 82-97, 2012.
- (2012) Signal Processing Magazine, IEEE , vol.29 , Issue.6 , pp. 82-97
- Hinton, G.¹ Deng, L.² Yu, D.³ Dahl, G.E.⁴ Mohamed, A.⁵ Jaitly, N.⁶ Senior, A.⁷ Vanhoucke, V.⁸ Nguyen, P.⁹ Sainath, T.N.¹⁰

4
- 84865801985
- Conversational speech transcription using context-dependent deep neural networks
- F. Seide, G. Li, and D. Yu, "Conversational speech transcription using context-dependent deep neural networks," in Proc. of Interspeech, 2011, pp. 437-440.
- (2011) Proc. of Interspeech , pp. 437-440
- Seide, F.¹ Li, G.² Yu, D.³

5
- 84055222005
- Context-dependent pre-trained deep neural networks for large-vocabulary speech recognition
- G. E. Dahl, D. Yu, L. Deng, and A. Acero, "Context-dependent pre-trained deep neural networks for large-vocabulary speech recognition," IEEE Transactions on Audio, Speech, and Language Processing, vol. 20, no. 1, pp. 30-42, 2012.
- (2012) IEEE Transactions on Audio, Speech, and Language Processing , vol.20 , Issue.1 , pp. 30-42
- Dahl, G.E.¹ Yu, D.² Deng, L.³ Acero, A.⁴

6
- 84055211743
- Acoustic modeling using deep belief networks
- jan.
- A. Mohamed, G. Dahl, and G. Hinton, "Acoustic modeling using deep belief networks," IEEE Transactions on Audio, Speech, and Language Processing, vol. 20, no. 1, pp. 14-22, Jan. 2012.
- (2012) IEEE Transactions on Audio, Speech, and Language Processing , vol.20 , Issue.1 , pp. 14-22
- Mohamed, A.¹ Dahl, G.² Hinton, G.³

7
- 84863380535
- Unsupervised feature learning for audio classification using convolutional deep belief networks
- H. Lee, P. Pham, Y. Largman, and A. Y. Ng, "Unsupervised feature learning for audio classification using convolutional deep belief networks," in Advances in Neural Information Processing Systems 22, 2009, pp. 1096-1104.
- (2009) Advances in Neural Information Processing Systems , vol.22 , pp. 1096-1104
- Lee, H.¹ Pham, P.² Largman, Y.³ Ng, A.Y.⁴

8
- 84867605836
- Applying convolutional neural networks concepts to hybrid NN-HMM model for speech recognition
- O. Abdel-Hamid, A. Mohamed, H. Jiang, and G. Penn, "Applying convolutional neural networks concepts to hybrid NN-HMM model for speech recognition," in Proc. of ICASSP, 2012, pp. 4277-4280.
- (2012) Proc. of ICASSP , pp. 4277-4280
- Abdel-Hamid, O.¹ Mohamed, A.² Jiang, H.³ Penn, G.⁴

9
- 84890525984
- Deep convolutional neural networks for lvcsr
- T. N. Sainath, A. Mohamed, B. Kingsbury, and B. Ramabhadran, "Deep convolutional neural networks for lvcsr," in Proc. of ICASSP, 2013, pp. 8614-8618.
- (2013) Proc. of ICASSP , pp. 8614-8618
- Sainath, T.N.¹ Mohamed, A.² Kingsbury, B.³ Ramabhadran, B.⁴

10
- 84901999583
- Convolutional neural networks for distant speech recognition
- September
- P. Swietojanski, A. Ghoshal, and S. Renals, "Convolutional neural networks for distant speech recognition," Signal Processing Letters, IEEE, vol. 21, no. 9, pp. 1120-1124, September 2014.
- (2014) Signal Processing Letters, IEEE , vol.21 , Issue.9 , pp. 1120-1124
- Swietojanski, P.¹ Ghoshal, A.² Renals, S.³

11
- 84890543873
- Investigating deep neural network based transforms of robust audio features for lvcsr
- E. Bocchieri and D. Dimitriadis, "Investigating deep neural network based transforms of robust audio features for lvcsr," in Proc. of ICASSP, 2013, pp. 6709-6713.
- (2013) Proc. of ICASSP , pp. 6709-6713
- Bocchieri, E.¹ Dimitriadis, D.²

12
- 80051609011
- Learning a better representation of speech soundwaves using restricted boltzmann machines
- N. Jaitly and G. Hinton, "Learning a better representation of speech soundwaves using restricted boltzmann machines," in Proc. of ICASSP, 2011, pp. 5884-5887.
- (2011) Proc. of ICASSP , pp. 5884-5887
- Jaitly, N.¹ Hinton, G.²

13
- 84893688455
- Learning filter banks within a deep neural network framework
- Dec.
- T. Sainath, B. Kingsbury, A.-R. Mohamed, and B. Ramabhadran, "Learning filter banks within a deep neural network framework," in Proc. of ASRU, Dec. 2013, pp. 297-302.
- (2013) Proc. of ASRU , pp. 297-302
- Sainath, T.¹ Kingsbury, B.² Mohamed, A.-R.³ Ramabhadran, B.⁴

14
- 84910065702
- Acoustic modeling with deep neural networks using raw time signal for lvcsr
- Singapore, Sep.
- Z. Tüske, P. Golik, R. Schlüter, and H. Ney, "Acoustic modeling with deep neural networks using raw time signal for lvcsr," in Proc. of Interspeech, Singapore, Sep. 2014, pp. 890-894.
- (2014) Proc. of Interspeech , pp. 890-894
- Tüske, Z.¹ Golik, P.² Schlüter, R.³ Ney, H.⁴

15
- 84936133512
- ArXiv E-prints, Dec.
- D. Palaz, R. Collobert, and M. Magimai.-Doss, "End-to-end Phoneme Sequence Recognition using Convolutional Neural Networks," ArXiv e-prints, Dec. 2013.
- (2013) End-to-end phoneme sequence recognition using convolutional neural networks
- Palaz, D.¹ Collobert, R.² Magimai Doss, M.³

16
- 84906273908
- Estimating phoneme class conditional probabilities from raw speech signal using convolutional neural networks
- D. Palaz, R. Collobert, and M. Magimai.-Doss, "Estimating phoneme class conditional probabilities from raw speech signal using convolutional neural networks," in Proc. of Interspeech, 2013.
- (2013) Proc. of Interspeech
- Palaz, D.¹ Collobert, R.² Magimai-Doss, M.³

17
- 84946023646
- Convolutional neural networks-based continuous speech recognition using raw speech signal
- April
- D. Palaz, M. Magimai.-Doss, and R. Collobert, "Convolutional neural networks-based continuous speech recognition using raw speech signal," in Proc. of ICASSP, April 2015.
- (2015) Proc. of ICASSP
- Palaz, D.¹ Magimai-Doss, M.² Collobert, R.³

18
- 0002291365
- Generalization and network design strategies
- R. Pfeifer, Z. Schreter, F. Fogelman, and L. Steels, Eds. Zurich, Switzerland: Elsevier
- Y. LeCun, "Generalization and network design strategies," in Connectionism in Perspective, R. Pfeifer, Z. Schreter, F. Fogelman, and L. Steels, Eds. Zurich, Switzerland: Elsevier, 1989.
- (1989) Connectionism in Perspective
- LeCun, Y.¹

19
- 0000583248
- Probabilistic interpretation of feedforward classification network outputs, with relationships to statistical pattern recognition
- J. Bridle, "Probabilistic interpretation of feedforward classification network outputs, with relationships to statistical pattern recognition," in Neuro-computing: Algorithms, Architectures and Applications, 1990, pp. 227-236.
- (1990) Neuro-computing: Algorithms, Architectures and Applications , pp. 227-236
- Bridle, J.¹

20
- 33847215211
- Stochastic gradient learning in neural networks
- Nimes, France: EC2
- L. Bottou, "Stochastic gradient learning in neural networks," in Proceedings of Neuro-Nîmes 91. Nimes, France: EC2, 1991.
- (1991) Proceedings of Neuro-Nîmes 91
- Bottou, L.¹

21
- 0024768209
- Speaker-independent phone recognition using hidden markov models
- K. F. Lee and H. W. Hon, "Speaker-independent phone recognition using hidden markov models," IEEE Transactions on Acoustics, Speech and Signal Processing, vol. 37, no. 11, pp. 1641-1648, 1989.
- (1989) IEEE Transactions on Acoustics, Speech and Signal Processing , vol.37 , Issue.11 , pp. 1641-1648
- Lee, K.F.¹ Hon, H.W.²

22
- 0027623210
- Assessment for automatic speech recognition: II. Noisex-92: A database and an experiment to study the effect of additive noise on speech recognition systems
- A. Varga and H. J. Steeneken, "Assessment for automatic speech recognition: II. NOISEX-92: A database and an experiment to study the effect of additive noise on speech recognition systems," Speech communication, vol. 12, no. 3, pp. 247-251, 1993.
- (1993) Speech Communication , vol.12 , Issue.3 , pp. 247-251
- Varga, A.¹ Steeneken, H.J.²

23
- 79551573428
- H.-G. Hirsch, "Fant-filtering and noise adding tool," 2005.
- (2005) Fant-filtering and Noise Adding Tool
- Hirsch, H.-G.¹

24
- 84890497765
- The Aurora experimental framework for the performance evaluation of speech recognition systems under noisy conditions
- H.-G. Hirsch and D. Pearce, "The Aurora experimental framework for the performance evaluation of speech recognition systems under noisy conditions," in ASR2000-Automatic Speech Recognition: Challenges for the new Millenium ISCA Tutorial and Research Workshop (ITRW), 2000.
- (2000) ASR2000-Automatic Speech Recognition: Challenges for the New Millenium ISCA Tutorial and Research Workshop (ITRW)
- Hirsch, H.-G.¹ Pearce, D.²

25
- 79955989999
- The htk book
- S. Young, G. Evermann, D. Kershaw, G. Moore, J. Odell, D. Ollason, V. Valtchev, and P. Woodland, "The htk book," Cambridge University Engineering Department, vol. 3, 2002.
- (2002) Cambridge University Engineering Department , vol.3
- Young, S.¹ Evermann, G.² Kershaw, D.³ Moore, G.⁴ Odell, J.⁵ Ollason, D.⁶ Valtchev, V.⁷ Woodland, P.⁸

26
- 84959120572
- BigLearn, NIPS Workshop
- R. Collobert, K. Kavukcuoglu, and C. Farabet, "Torch7: A matlab-like environment for machine learning," in BigLearn, NIPS Workshop, 2011.
- (2011) Torch7: A Matlab-like Environment for Machine Learning
- Collobert, R.¹ Kavukcuoglu, K.² Farabet, C.³

27
- 80051648777
- Tech. Rep., version 1.1
- H.-G. Hirsch and D. Pearce, "Applying the advanced ETSI frontend to the aurora-2 task," Tech. Rep., 2006, version 1.1.
- (2006) Applying the Advanced ETSI Frontend to the aurora-2 Task
- Hirsch, H.-G.¹ Pearce, D.²

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.