SCOPUS Information Retrieval Platform

Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH

Volume , Issue , 2014, Pages 890-894

Acoustic modeling with deep neural networks using raw time signal for LVCSR

(4) Tüske, Zoltán (a); Golik, Pavel (a); Schlüter, Ralf (a); Ney, Hermann (a, b)

a RWTH AACHEN UNIVERSITY (Germany)

b UFR 919 Laboratoire d'Informatique Pour la Mécanique et les Sciences de l'Ingénieur (France)

Author keywords

Acoustic modeling; Neural networks; Raw signal

Indexed keywords

EXTRACTION; FEATURE EXTRACTION; FILTER BANKS; NEURAL NETWORKS; SIGNAL PROCESSING; SPEECH COMMUNICATION; TIME DOMAIN ANALYSIS;

ACOUSTIC MODEL; AUTOMATIC SPEECH RECOGNITION; DEEP NEURAL NETWORKS; MAGNITUDE SPECTRUM; MULTI-RESOLUTIONAL ANALYSIS; RAW SIGNALS; RECOGNITION ACCURACY; SIGMOID ACTIVATION FUNCTION;

SPEECH RECOGNITION;

EID: 84910065702; PISSN: 2308457X; EISSN: 19909772; Source Type: Conference Proceeding
DOI: None; Document Type: Conference Paper

Times cited : (176)

References (20)

1
- 84858976070
- Feature engineering in context-dependent deep neural networks for conversational speech transcription
- Hawaii, USA, Dec
- F. Seide, G. Li, X. Chen, and D. Yu, "Feature engineering in context-dependent deep neural networks for conversational speech transcription," in Proc. IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), Hawaii, USA, Dec. 2011, pp. 24-29.
- (2011) Proc. IEEE Automatic Speech Recognition and Understanding Workshop (ASRU) , pp. 24-29
- Seide, F.¹ Li, G.² Chen, X.³ Yu, D.⁴

2
- 85083953021
- Feature learning in deep neural networks - A study on speech recognition tasks
- Scottsdale, AZ, USA, May
- D. Yu, M. L. Seltzer, J. Li, J.-T. Huang, and F. Seide, "Feature learning in deep neural networks - A study on speech recognition tasks," in International Conference on Learning Representations, Scottsdale, AZ, USA, May 2013.
- (2013) International Conference on Learning Representations
- Yu, D.¹ Seltzer, M.L.² Li, J.³ Huang, J.-T.⁴ Seide, F.⁵

3
- 0003573244
- Norwell, MA, USA: Kluwer Academic Publishers
- H. A. Bourlard and N. Morgan, Connectionist speech recognition: A hybrid approach. Norwell, MA, USA: Kluwer Academic Publishers, 1993.
- (1993) Connectionist Speech Recognition: A Hybrid Approach
- Bourlard, H.A.¹ Morgan, N.²

4
- 0024861871
- Approximation by superpositions of a sigmoidal function
- G. Cybenko, "Approximation by superpositions of a sigmoidal function," Mathematics of Control, Signals and Systems, vol. 2, no. 4, pp. 303-314, 1989.
- (1989) Mathematics of Control, Signals and Systems , vol.2 , Issue.4 , pp. 303-314
- Cybenko, G.¹

5
- 0024880831
- Multilayer feedforward networks are universal approximators
- Jul
- K. Hornik, M. B. Stinchcombe, and H. White, "Multilayer feedforward networks are universal approximators," Neural Networks, vol. 2, no. 5, pp. 359-366, Jul. 1989.
- (1989) Neural Networks , vol.2 , Issue.5 , pp. 359-366
- Hornik, K.¹ Stinchcombe, M.B.² White, H.³

6
- 84893688455
- Learning filter banks within a deep neural network framework
- Olomouc, Czech Republic, Dec
- T. N. Sainath, B. Kingsbury, A.-r. Mohamed, and B. Ramabhadran, "Learning filter banks within a deep neural network framework," in Proc. IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), Olomouc, Czech Republic, Dec. 2013, pp. 297-302.
- (2013) Proc. IEEE Automatic Speech Recognition and Understanding Workshop (ASRU) , pp. 297-302
- Sainath, T.N.¹ Kingsbury, B.² Mohamed, A.-R.³ Ramabhadran, B.⁴

7
- 84905233897
- Mean-normalized stochastic gradient for large-scale deep learning
- Florence, Italy, May
- S. Wiesler, A. Richard, R. Schlüter, and H. Ney, "Mean-normalized stochastic gradient for large-scale deep learning," in Proc. IEEE Int. Conf. on Acoustics, Speech and Signal Processing, Florence, Italy, May 2014, pp. 180-184.
- (2014) Proc. IEEE Int. Conf. on Acoustics, Speech and Signal Processing , pp. 180-184
- Wiesler, S.¹ Richard, A.² Schlüter, R.³ Ney, H.⁴

8
- 84858985237
- Improved acoustic feature combination for LVCSR by neural networks
- Florence, Italy, Aug
- C. Plahl, R. Schlüter, and H. Ney, "Improved acoustic feature combination for LVCSR by neural networks," in Proc. Interspeech, Florence, Italy, Aug. 2011, pp. 1237-1240.
- (2011) Proc. Interspeech , pp. 1237-1240
- Plahl, C.¹ Schlüter, R.² Ney, H.³

9
- 84906273908
- Estimating phoneme class conditional probabilities from raw speech signal using convolutional neural networks
- Lyon, France, Aug
- D. Palaz, R. Collobert, and M. Magimai.-Doss, "Estimating phoneme class conditional probabilities from raw speech signal using convolutional neural networks," in Proc. Interspeech, Lyon, France, Aug. 2013, pp. 1766-1770.
- (2013) Proc. Interspeech , pp. 1766-1770
- Palaz, D.¹ Collobert, R.² Magimai.-Doss, M.³

10
- 84867605836
- Applying convolutional neural networks concepts to hybrid NN-HMM model for speech recognition
- Kyoto, Japan, Mar
- O. Abdel-Hamid, A. Mohamed, H. Jiang, and G. Penn, "Applying convolutional neural networks concepts to hybrid NN-HMM model for speech recognition," in Proc. IEEE Int. Conf. on Acoustics, Speech and Signal Processing, Kyoto, Japan, Mar. 2012, pp. 4277-4280.
- (2012) Proc. IEEE Int. Conf. on Acoustics, Speech and Signal Processing , pp. 4277-4280
- Abdel-Hamid, O.¹ Mohamed, A.² Jiang, H.³ Penn, G.⁴

11
- 84985742249
- Linear predictive hidden Markov models and the speech signal
- Paris, France, May
- A. B. Poritz, "Linear predictive hidden Markov models and the speech signal," in Proc. IEEE Int. Conf. on Acoustics, Speech and Signal Processing, vol. 7, Paris, France, May 1982, pp. 1291-1294.
- (1982) Proc. IEEE Int. Conf. on Acoustics, Speech and Signal Processing , vol.7 , pp. 1291-1294
- Poritz, A.B.¹

12
- 13244265597
- Revisiting autoregressive hidden Markov modeling of speech signals
- Feb
- Y. Ephraim and W. J. J. Roberts, "Revisiting autoregressive hidden Markov modeling of speech signals," IEEE Signal Processing Letters, vol. 12, no. 2, pp. 166-169, Feb. 2005.
- (2005) IEEE Signal Processing Letters , vol.12 , Issue.2 , pp. 166-169
- Ephraim, Y.¹ Roberts, W.J.J.²

13
- 84910063277
- Subband acoustic waveform front-end for robust speech recognition using support vector machines
- Brighton, UK, Sep
- J. Yousafzai, Z. Cvetković, and P. Sollich, "Subband acoustic waveform front-end for robust speech recognition using support vector machines," in Proc. Interspeech, Brighton, UK, Sep. 2009, pp. 2679-2682.
- (2009) Proc. Interspeech , pp. 2679-2682
- Yousafzai, J.¹ Cvetković, Z.² Sollich, P.³

14
- 0019053271
- Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences
- Aug
- S. B. Davis and P. Mermelstein, "Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences," IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 28, no. 4, pp. 357-366, Aug. 1980.
- (1980) IEEE Transactions on Acoustics, Speech, and Signal Processing , vol.28 , Issue.4 , pp. 357-366
- Davis, S.B.¹ Mermelstein, P.²

15
- 34547539413
- Gammatone features and feature combination for large vocabulary speech recognition
- Honolulu, Hawaii, USA, Apr
- R. Schlüter, I. Bezrukov, H. Wagner, and H. Ney, "Gammatone features and feature combination for large vocabulary speech recognition," in Proc. IEEE Int. Conf. on Acoustics, Speech and Signal Processing, Honolulu, Hawaii, USA, Apr. 2007, pp. 649-652.
- (2007) Proc. IEEE Int. Conf. on Acoustics, Speech and Signal Processing , pp. 649-652
- Schlüter, R.¹ Bezrukov, I.² Wagner, H.³ Ney, H.⁴

16
- 0025041264
- Perceptual linear predictive (PLP) analysis of speech
- H. Hermansky, "Perceptual linear predictive (PLP) analysis of speech," Journal of the Acoustical Society of America, vol. 87, no. 4, pp. 1738-1752, 1990.
- (1990) Journal of the Acoustical Society of America , vol.87 , Issue.4 , pp. 1738-1752
- Hermansky, H.¹

17
- 84906215167
- Quaero Programme. http://www.quaero.org.
- Quaero Programme

18
- 84878410921
- RASR - The RWTH Aachen university open source speech recognition toolkit
- Hawaii, USA, Dec
- D. Rybach, S. Hahn, P. Lehnen, D. Nolden, M. Sundermeyer, Z. Tüske, S. Wiesler, R. Schlüter, and H. Ney, "RASR - The RWTH Aachen university open source speech recognition toolkit," in Proc. IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), Hawaii, USA, Dec. 2011.
- (2011) Proc. IEEE Automatic Speech Recognition and Understanding Workshop (ASRU)
- Rybach, D.¹ Hahn, S.² Lehnen, P.³ Nolden, D.⁴ Sundermeyer, M.⁵ Tüske, Z.⁶ Wiesler, S.⁷ Schlüter, R.⁸ Ney, H.⁹

19
- 77956509090
- Rectified linear units improve restricted Boltzmann machines
- Haifa, Israel, Jun
- V. Nair and G. E. Hinton, "Rectified linear units improve restricted Boltzmann machines," in Proc. of the 27th Int. Conf. on Machine Learning, Haifa, Israel, Jun. 2010, pp. 807-814.
- (2010) Proc. of the 27th Int. Conf. on Machine Learning , pp. 807-814
- Nair, V.¹ Hinton, G.E.²

20
- 0025110885
- Derivation of auditory filter shapes from notched-noise data
- Aug
- B. R. Glasberg and B. C. J. Moore, "Derivation of auditory filter shapes from notched-noise data," Hearing Research, vol. 47, no. 1-2, pp. 103-138, Aug. 1990.
- (1990) Hearing Research , vol.47 , Issue.1-2 , pp. 103-138
- Glasberg, B.R.¹ Moore, B.C.J.²

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.