SCOPUS Information Search Platform

2013 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2013 - Proceedings

Volume , Issue , 2013, Pages 297-302

Learning filter banks within a deep neural network framework

(4) Sainath, Tara N. (a); Kingsbury, Brian (a); Mohamed, Abdel Rahman (b); Ramabhadran, Bhuvana (a)

a IBM T J WATSON RESEARCH CENTER (United States)

b UNIVERSITY OF TORONTO (Canada)

Author keywords

[No Author keywords available]

Indexed keywords

BROADCAST NEWS; CROSS ENTROPY; DEEP NEURAL NETWORKS; LEARNING APPROACH; LEARNING FILTERS; MEL-FILTER BANKS; SPEECH PRODUCTION; WORD ERROR RATE;

SPEECH RECOGNITION;

FILTER BANKS;

EID: 84893688455 PISSN: None EISSN: None Source Type: Conference Proceeding
DOI: 10.1109/ASRU.2013.6707746 Document Type: Conference Paper

Times cited : (151)

References (21)

1
- 84893659646
- T. N. Sainath, B. Ramabhadran, M. Picheny, D. Nahamoo, and D. Kanevsky, "Exemplar-Based Sparse Representation Features: From TIMIT to LVCSR, " 2011.
- (2011) Exemplar-Based Sparse Representation Features: from TIMIT to LVCSR
- Sainath, T.N.¹ Ramabhadran, B.² Picheny, M.³ Nahamoo, D.⁴ Kanevsky, D.⁵

2
- 84867711674
- Learning invariant feature hierarchies
- vol. 7583 of Lecture Notes in Computer Science, Springer
- Y. LeCun, "Learning Invariant Feature Hierarchies, " in European Conference on Computer Vision (ECCV). 2012, vol. 7583 of Lecture Notes in Computer Science, pp. 496-505, Springer.
- (2012) European Conference on Computer Vision (ECCV) , pp. 496-505
- LeCun, Y.¹

3
- 84867585919
- Understanding how deep belief networks perform acoustic modelling
- A. Mohamed, G. Hinton, and G. Penn, "Understanding how Deep Belief Networks Perform Acoustic Modelling, " in ICASSP, 2012.
- (2012) ICASSP
- Mohamed, A.¹ Hinton, G.² Penn, G.³

4
- 84858972572
- Making deep belief networks effective for large vocabulary continuous speech recognition
- T. N. Sainath, B. Kingsbury, B. Ramabhadran, P. Fousek, P. Novak, and A. Mohamed, "Making Deep Belief Networks Effective for Large Vocabulary Continuous Speech Recognition, " in Proc. ASRU, 2011.
- (2011) Proc. ASRU
- Sainath, T.N.¹ Kingsbury, B.² Ramabhadran, B.³ Fousek, P.⁴ Novak, P.⁵ Mohamed, A.⁶

5
- 0002263996
- Convolutional networks for images, speech, and time-series
- MIT Press
- Y. LeCun and Y. Bengio, "Convolutional Networks for Images, Speech, and Time-series, " in The Handbook of Brain Theory and Neural Networks. MIT Press, 1995.
- (1995) The Handbook of Brain Theory and Neural Networks
- LeCun, Y.¹ Bengio, Y.²

6
- 84890525984
- Deep convolutional neural networks for LVCSR
- T. N. Sainath, A. Mohamed, B. Kingsbury, and B. Ramabhadran, "Deep Convolutional Neural Networks for LVCSR, " in Proc. ICASSP, 2013.
- (2013) Proc. ICASSP
- Sainath, T.N.¹ Mohamed, A.² Kingsbury, B.³ Ramabhadran, B.⁴

7
- 0019053271
- Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences
- S. Davis and P. Mermelstein, "Comparison of Parametric Representations for Monosyllabic Word Recognition in Continuously Spoken Sentences, " IEEE Transactions on Acoustics, Speech and Signal Processing, vol. 28, no. 4, pp. 357 - 366, 1980. (Pubitemid 11464930)
- (1980) IEEE Transactions on Acoustics, Speech, and Signal Processing , vol.ASSP-28 , Issue.4 , pp. 357-366
- Davis, S.B.¹ Mermelstein, P.²

8
- 79953288449
- Data driven design of filter bank for speech recognition
- Text, Speech and Dialogue
- L. Burget and H. Hermansky, "Data Driven Design of Filter Bank for Speech Recognition, " in Text, Speech and Dialogue. Springer, 2001, pp. 299-304. (Pubitemid 33329242)
- (2001) Lecture Notes in Computer Science , Issue.2166 , pp. 299-304
- Burget, L.¹ Hermansky, H.²

9
- 64849109603
- Data-driven filter-bank-based feature extraction for speech recognition
- Y. Suh and H. Kim, "Data-Driven Filter-Bank-based Feature Extraction for Speech Recognition, " in Proc. SPECOM, 2004.
- (2004) Proc. SPECOM
- Suh, Y.¹ Kim, H.²

10
- 0000551146
- A discriminative filter bank model for speech recognition
- A. Biem, E. McDermott, and S. Katagiri, "A Discriminative Filter Bank Model For Speech Recognition, " in Proc. ICASSP, 1995.
- (1995) Proc. ICASSP
- Biem, A.¹ McDermott, E.² Katagiri, S.³

11
- 85032751458
- Deep neural networks for acoustic modeling in speech recognition
- G. Hinton, L. Deng, D. Yu, G. Dahl, A. Mohamed, N. Jaitly, A. Senior, V. Vanhoucke, P. Nguyen, T. N. Sainath, and B. Kingsbury, "Deep Neural Networks for Acoustic Modeling in Speech Recognition, " IEEE Signal Processing Magazine, vol. 29, no. 6, pp. 82-97, 2012.
- (2012) IEEE Signal Processing Magazine , vol.29 , Issue.6 , pp. 82-97
- Hinton, G.¹ Deng, L.² Yu, D.³ Dahl, G.⁴ Mohamed, A.⁵ Jaitly, N.⁶ Senior, A.⁷ Vanhoucke, V.⁸ Nguyen, P.⁹ Sainath, T.N.¹⁰ Kingsbury, B.¹¹

12
- 70349213445
- Lattice-based optimization of sequence classification criteria for neural-network acoustic modeling
- B. Kingsbury, "Lattice-Based Optimization of Sequence Classification Criteria for Neural-Network Acoustic Modeling, " in Proc. ICASSP, 2009.
- (2009) Proc. ICASSP
- Kingsbury, B.¹

13
- 0001857994
- Efficient backprop
- G. Orr and K. Muller, Eds., Springer
- Y. LeCun, L. Bottou, G. Orr, and K. Muller, "Efficient Backprop, " in Neural Networks: Tricks of the Trade, G. Orr and K. Muller, Eds. 1998, Springer.
- (1998) Neural Networks: Tricks of the Trade
- LeCun, Y.¹ Bottou, L.² Orr, G.³ Muller, K.⁴

14
- 79951796005
- The IBM Attila speech recognition toolkit
- H. Soltau, G. Saon, and B. Kingsbury, "The IBM Attila Speech Recognition Toolkit, " in Proc. SLT, 2010.
- (2010) Proc. SLT
- Soltau, H.¹ Saon, G.² Kingsbury, B.³

15
- 84867605836
- Applying convolutional neural network concepts to hybrid NN-HMM model for speech recognition
- O. Abdel-Hamid, A. Mohamed, H. Jiang, and G. Penn, "Applying Convolutional Neural Network Concepts to Hybrid NN-HMM Model for Speech Recognition, " in Proc. ICASSP, 2012.
- (2012) Proc. ICASSP
- Abdel-Hamid, O.¹ Mohamed, A.² Jiang, H.³ Penn, G.⁴

16
- 0141853652
- Learning representations by back-propagating errors
- D. E. Rumelhart, G. E. Hinton, and R. J. Williams, "Learning representations by back-propagating errors, " Neurocomputing: foundations of research, pp. 696-699, 1988.
- (1988) Neurocomputing: Foundations of Research , pp. 696-699
- Rumelhart, D.E.¹ Hinton, G.E.² Williams, R.J.³

17
- 0028517164
- RASTA processing of speech
- H. Hermansky and N. Morgan, "RASTA Processing of Speech, " IEEE Transactions on Speech and Audio Processing, vol. 2, no. 4, pp. 578 - 589, 1994.
- (1994) IEEE Transactions on Speech and Audio Processing , vol.2 , Issue.4 , pp. 578-589
- Hermansky, H.¹ Morgan, N.²

18
- 84890545163
- A deep convolutional neural network using heterogeneous pooling for trading acoustic invariance with phonetic confusion
- L. Deng, O. Abdel-Hamid, and D. Yu, "A Deep Convolutional Neural Network using Heterogeneous Pooling for Trading Acoustic Invariance with Phonetic Confusion, " in Proc. ICASSP, 2013.
- (2013) Proc. ICASSP
- Deng, L.¹ Abdel-Hamid, O.² Yu, D.³

19
- 0029747183
- Speaker normalization using efficient frequency warping procedures
- L. Lee and R. C. Rose, "Speaker Normalization using Efficient Frequency Warping Procedures, " in Proc. ICASSP, 1996.
- (1996) Proc. ICASSP
- Lee, L.¹ Rose, R.C.²

20
- 84890466217
- Improving neural networks by preventing co-adaptation of feature detectors
- vol. 1207.0580
- G.E. Hinton, N. Srivastava, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov, "Improving Neural Networks by Preventing Co-Adaptation of Feature Detectors, " The Computing Research Repository (CoRR), vol. 1207.0580, 2012.
- (2012) The Computing Research Repository (CoRR)
- Hinton, G.E.¹ Srivastava, N.² Krizhevsky, A.³ Sutskever, I.⁴ Salakhutdinov, R.⁵

21
- 0028496580
- Weight smoothing to improve network generalization
- J. Jean and Jin Wang, "Weight Smoothing to Improve Network Generalization, " Neural Networks, IEEE Transactions on, vol. 5, no. 5, pp. 752-763, 1994.
- (1994) Neural Networks, IEEE Transactions on , vol.5 , Issue.5 , pp. 752-763
- Jean, J.¹ Wang, J.²

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.