SCOPUS Information Retrieval Platform

Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH

2014, Pages 800-804

Improving language-universal feature extraction with deep maxout and convolutional neural networks

(2) Miao, Yajie (a); Metze, Florian (a)

(a) Carnegie Mellon University (United States)

Author keywords

Deep convolutional networks; Deep maxout networks; Language universal feature extraction

Indexed keywords

COMPLEX NETWORKS; CONVOLUTION; EXTRACTION; FEATURE EXTRACTION; NEURAL NETWORKS; SPEECH COMMUNICATION;

AUTOMATED SPEECH RECOGNITION; CONVOLUTIONAL NETWORKS; CONVOLUTIONAL NEURAL NETWORK; DEEP NEURAL NETWORKS; LINEAR CLASSIFIERS; SIGMOID NONLINEARITY; UNIVERSAL FEATURE EXTRACTORS; WORD ERROR RATE REDUCTIONS;

SPEECH RECOGNITION;

EID: 84910028405
PISSN: 2308457X
EISSN: 19909772
Source Type: Conference Proceeding
DOI: None
Document Type: Conference Paper

Times cited: 22

References (35)

1
- 84055222005
- Context-dependent pre-trained deep neural networks for large vocabulary speech recognition
- G. Dahl, D. Yu, L. Deng, and A. Acero, "Context-dependent pre-trained deep neural networks for large vocabulary speech recognition," IEEE Transactions on Audio, Speech and Language Processing, vol. 20(1), pp. 30-42, 2012.
- (2012) IEEE Transactions on Audio, Speech and Language Processing , vol.20 , Issue.1 , pp. 30-42
- Dahl, G.¹ Yu, D.² Deng, L.³ Acero, A.⁴

2
- 84858976070
- Feature engineering in context-dependent deep neural networks for conversational speech transcription
- F. Seide, G. Li, X. Chen, and D. Yu, "Feature engineering in context-dependent deep neural networks for conversational speech transcription," in Proc. ASRU, pp. 24-29, 2011.
- (2011) Proc. ASRU , pp. 24-29
- Seide, F.¹ Li, G.² Chen, X.³ Yu, D.⁴

3
- 85083953021
- Feature learning in deep neural networks - studies on speech recognition tasks
- D. Yu, M. L. Seltzer, J. Li, J. Huang, and F. Seide, "Feature learning in deep neural networks - studies on speech recognition tasks," in International Conference on Learning Representations, 2013.
- (2013) International Conference on Learning Representations
- Yu, D.¹ Seltzer, M.L.² Li, J.³ Huang, J.⁴ Seide, F.⁵

4
- 84890527497
- Cross-language knowledge transfer using multilingual deep neural network with shared hidden layers
- J. Huang, J. Li, D. Yu, L. Deng, and Y. Gong, "Cross-language knowledge transfer using multilingual deep neural network with shared hidden layers," in Proc. ICASSP, pp. 7304-7308, 2013.
- (2013) Proc. ICASSP , pp. 7304-7308
- Huang, J.¹ Li, J.² Yu, D.³ Deng, L.⁴ Gong, Y.⁵

5
- 84893701756
- Deep maxout networks for low-resource speech recognition
- Y. Miao, F. Metze, and S. Rawat, "Deep maxout networks for low-resource speech recognition," in Proc. ASRU, 2013.
- (2013) Proc. ASRU
- Miao, Y.¹ Metze, F.² Rawat, S.³

6
- 84910068044
- Distributed learning of multilingual DNN feature extractors using GPUs
- Y. Miao, H. Zhang, and F. Metze, "Distributed learning of multilingual DNN feature extractors using GPUs," to appear in Proc. Interspeech, 2014.
- (2014) Proc. Interspeech
- Miao, Y.¹ Zhang, H.² Metze, F.³

7
- 85161980001
- Sparse deep belief net model for visual area V2
- H. Lee, C. Ekanadham, and A. Y. Ng, "Sparse deep belief net model for visual area V2," in Proc. NIPS, 2008.
- (2008) Proc. NIPS
- Lee, H.¹ Ekanadham, C.² Ng, A.Y.³

8
- 71149119164
- Convolutional deep belief networks for scalable unsupervised learning of hierarchical representations
- H. Lee, R. Grosse, R. Ranganath, and A. Y. Ng, "Convolutional deep belief networks for scalable unsupervised learning of hierarchical representations," in Proc. ICML, pp. 609-616, 2009.
- (2009) Proc. ICML , pp. 609-616
- Lee, H.¹ Grosse, R.² Ranganath, R.³ Ng, A.Y.⁴

9
- 85162401749
- Sparse filtering
- J. Ngiam, P. Koh, Z. Chen, S. Bhaskar, and A. Y. Ng, "Sparse filtering," in Proc. NIPS, 2013.
- (2013) Proc. NIPS
- Ngiam, J.¹ Koh, P.² Chen, Z.³ Bhaskar, S.⁴ Ng, A.Y.⁵

10
- 78049398611
- Sparse coding for speech recognition
- G. Sivaram, S. K. Nemala, M. Elhilali, T. D. Tran, and H. Hermansky, "Sparse coding for speech recognition," in Proc. ICASSP, pp. 4346-4349, 2010.
- (2010) Proc. ICASSP , pp. 4346-4349
- Sivaram, G.¹ Nemala, S.K.² Elhilali, M.³ Tran, T.D.⁴ Hermansky, H.⁵

11
- 84863380535
- Unsupervised feature learning for audio classification using convolutional deep belief networks
- H. Lee, Y. Largman, P. Pham, and A. Y. Ng, "Unsupervised feature learning for audio classification using convolutional deep belief networks," in Proc. NIPS, 2009.
- (2009) Proc. NIPS
- Lee, H.¹ Largman, Y.² Pham, P.³ Ng, A.Y.⁴

12
- 80051612464
- Multilayer perceptron with sparse hidden outputs for phoneme recognition
- G. Sivaram and H. Hermansky, "Multilayer perceptron with sparse hidden outputs for phoneme recognition," in Proc. ICASSP, pp. 5336-5339, 2011.
- (2011) Proc. ICASSP , pp. 5336-5339
- Sivaram, G.¹ Hermansky, H.²

13
- 84878538214
- Are sparse representations rich enough for acoustic modeling?
- O. Vinyals and L. Deng, "Are sparse representations rich enough for acoustic modeling?," in Proc. Interspeech, 2012.
- (2012) Proc. Interspeech
- Vinyals, O.¹ Deng, L.²

14
- 84897543523
- Maxout networks
- I. J. Goodfellow, D. Warde-Farley, M. Mirza, A. Courville, and Y. Bengio, "Maxout networks," in Proc. ICML, 2013.
- (2013) Proc. ICML
- Goodfellow, I.J.¹ Warde-Farley, D.² Mirza, M.³ Courville, A.⁴ Bengio, Y.⁵

15
- 84905239342
- Improving deep neural network acoustic models using generalized maxout networks
- X. Zhang, J. Trmal, D. Povey, and S. Khudanpur, "Improving deep neural network acoustic models using generalized maxout networks," in Proc. ICASSP, 2014.
- (2014) Proc. ICASSP
- Zhang, X.¹ Trmal, J.² Povey, D.³ Khudanpur, S.⁴

16
- 84867605836
- Applying convolutional neural networks concepts to hybrid NN-HMM model for speech recognition
- O. Abdel-Hamid, A. Mohamed, H. Jiang, and G. Penn, "Applying convolutional neural networks concepts to hybrid NN-HMM model for speech recognition," in Proc. ICASSP, pp. 4277-4280, 2012.
- (2012) Proc. ICASSP , pp. 4277-4280
- Abdel-Hamid, O.¹ Mohamed, A.² Jiang, H.³ Penn, G.⁴

17
- 84890525984
- Deep convolutional neural networks for LVCSR
- T. N. Sainath, A. Mohamed, B. Kingsbury, and B. Ramabhadran, "Deep convolutional neural networks for LVCSR," in Proc. ICASSP, pp. 8614-8618, 2013.
- (2013) Proc. ICASSP , pp. 8614-8618
- Sainath, T.N.¹ Mohamed, A.² Kingsbury, B.³ Ramabhadran, B.⁴

18
- 84906214784
- Exploring convolutional neural network structures and optimization techniques for speech recognition
- O. Abdel-Hamid, L. Deng, and D. Yu, "Exploring convolutional neural network structures and optimization techniques for speech recognition," in Proc. Interspeech, pp. 3366-3370, 2013.
- (2013) Proc. Interspeech , pp. 3366-3370
- Abdel-Hamid, O.¹ Deng, L.² Yu, D.³

19
- 84893654379
- Improvements to deep convolutional neural networks for LVCSR
- T. N. Sainath, B. Kingsbury, A. Mohamed, G. Dahl, G. Saon, H. Soltau, T. Beran, A. Aravkin, and B. Ramabhadran, "Improvements to deep convolutional neural networks for LVCSR," in Proc. ASRU, 2013.
- (2013) Proc. ASRU
- Sainath, T.N.¹ Kingsbury, B.² Mohamed, A.³ Dahl, G.⁴ Saon, G.⁵ Soltau, H.⁶ Beran, T.⁷ Aravkin, A.⁸ Ramabhadran, B.⁹

20
- 84906273176
- Modular combination of deep neural networks for acoustic modeling
- J. Gehring, W. Lee, K. Kilgour, I. Lane, Y. Miao, and A. Waibel, "Modular combination of deep neural networks for acoustic modeling," in Proc. Interspeech, pp. 94-98, 2013.
- (2013) Proc. Interspeech , pp. 94-98
- Gehring, J.¹ Lee, W.² Kilgour, K.³ Lane, I.⁴ Miao, Y.⁵ Waibel, A.⁶

21
- 84890482429
- Extracting deep bottleneck features using stacked autoencoders
- J. Gehring, Y. Miao, F. Metze, and A. Waibel, "Extracting deep bottleneck features using stacked autoencoders," in Proc. ICASSP, pp. 3377-3381, 2013.
- (2013) Proc. ICASSP , pp. 3377-3381
- Gehring, J.¹ Miao, Y.² Metze, F.³ Waibel, A.⁴

22
- 84906283232
- Using conversational word bursts in spoken term detection
- J. Chiu and A. Rudnicky, "Using conversational word bursts in spoken term detection," in Proc. Interspeech, 2013.
- (2013) Proc. Interspeech
- Chiu, J.¹ Rudnicky, A.²

23
- 84910068915
- Combination of FST and CN search in spoken term detection
- J. Chiu, Y. Wang, J. Trmal, D. Povey, G. Chen, and A. Rudnicky, "Combination of FST and CN search in spoken term detection," to appear in Proc. Interspeech, 2014.
- (2014) Proc. Interspeech
- Chiu, J.¹ Wang, Y.² Trmal, J.³ Povey, D.⁴ Chen, G.⁵ Rudnicky, A.⁶

24
- 84906273501
- Improving low-resource CD-DNN-HMM using dropout and multilingual DNN training
- Y. Miao and F. Metze, "Improving low-resource CD-DNN-HMM using dropout and multilingual DNN training," in Proc. Interspeech, pp. 2237-2241, 2013.
- (2013) Proc. Interspeech , pp. 2237-2241
- Miao, Y.¹ Metze, F.²

25
- 84890495545
- Subspace mixture model for low-resource speech recognition in crosslingual settings
- Y. Miao, F. Metze, and A. Waibel, "Subspace mixture model for low-resource speech recognition in crosslingual settings," in Proc. ICASSP, pp. 7339-7342, 2013.
- (2013) Proc. ICASSP , pp. 7339-7342
- Miao, Y.¹ Metze, F.² Waibel, A.³

26
- 84872555593
- Deep sparse rectifier neural networks
- X. Glorot, A. Bordes, and Y. Bengio, "Deep sparse rectifier neural networks," in Proc. AISTATS, 2011.
- (2011) Proc. AISTATS
- Glorot, X.¹ Bordes, A.² Bengio, Y.³

27
- 84890527827
- Improving deep neural networks for LVCSR using rectified linear units and dropout
- G. Dahl, T. N. Sainath, and G. E. Hinton, "Improving deep neural networks for LVCSR using rectified linear units and dropout," in Proc. ICASSP, pp. 8609-8613, 2013.
- (2013) Proc. ICASSP , pp. 8609-8613
- Dahl, G.¹ Sainath, T.N.² Hinton, G.E.³

28
- 84890451371
- Phone recognition with deep sparse rectifier neural networks
- L. Toth, "Phone recognition with deep sparse rectifier neural networks," in Proc. ICASSP, pp. 6985-6989, 2013.
- (2013) Proc. ICASSP , pp. 6985-6989
- Toth, L.¹

29
- 84905286094
- Rectifier nonlinearities improve neural network acoustic models
- A. L. Maas, A. Y. Hannun, and A. Y. Ng, "Rectifier nonlinearities improve neural network acoustic models," in ICML Workshop on Deep Learning for Audio, Speech, and Language Processing (WDLASL), 2013.
- (2013) ICML Workshop on Deep Learning for Audio, Speech, and Language Processing (WDLASL)
- Maas, A.L.¹ Hannun, A.Y.² Ng, A.Y.³

30
- 84910038371
- Kaldi+PDNN: Building DNN-based ASR systems with Kaldi and PDNN
- Y. Miao, "Kaldi+PDNN: Building DNN-based ASR systems with Kaldi and PDNN," arXiv:1401.6984, 2014.
- (2014) arXiv:1401.6984
- Miao, Y.¹

31
- 79551480483
- Stacked denoising autoencoders: Learning useful representations in a deep network with a local denoising criterion
- P. Vincent, H. Larochelle, I. Lajoie, Y. Bengio, and P. Manzagol, "Stacked denoising autoencoders: Learning useful representations in a deep network with a local denoising criterion," Journal of Machine Learning Research, vol. 11, 2010.
- (2010) Journal of Machine Learning Research , vol.11
- Vincent, P.¹ Larochelle, H.² Lajoie, I.³ Bengio, Y.⁴ Manzagol, P.⁵

32
- 84892908371
- A practical guide to training restricted Boltzmann machines
- G. E. Hinton, "A practical guide to training restricted Boltzmann machines," UTML TR, 2010.
- (2010) UTML TR
- Hinton, G.E.¹

33
- 84867720412
- Improving neural networks by preventing co-adaptation of feature detectors
- G. E. Hinton, N. Srivastava, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov, "Improving neural networks by preventing co-adaptation of feature detectors," arXiv:1207.0580, 2012.
- (2012) arXiv:1207.0580
- Hinton, G.E.¹ Srivastava, N.² Krizhevsky, A.³ Sutskever, I.⁴ Salakhutdinov, R.⁵

34
- 84910031119
- Towards speaker adaptive training of deep neural network acoustic models
- Y. Miao, H. Zhang, and F. Metze, "Towards speaker adaptive training of deep neural network acoustic models," to appear in Proc. Interspeech, 2014.
- (2014) Proc. Interspeech
- Miao, Y.¹ Zhang, H.² Metze, F.³

35
- 84893668797
- Neighbour selection and adaptation for rapid speaker-dependent ASR
- U. Nallasamy, M. Fuhs, M. Woszczyna, F. Metze, and T. Schultz, "Neighbour selection and adaptation for rapid speaker-dependent ASR," in Proc. ASRU, pp. 60-65, 2013.
- (2013) Proc. ASRU , pp. 60-65
- Nallasamy, U.¹ Fuhs, M.² Woszczyna, M.³ Metze, F.⁴ Schultz, T.⁵

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.