SCOPUS Information Search Platform

ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings

Volume , Issue , 2014, Pages 215-219

Improving deep neural network acoustic models using generalized maxout networks

(4) Zhang, Xiaohuiᵃ Trmal, Janᵃ Povey, Danielᵃ Khudanpur, Sanjeevᵃ

ᵃ Johns Hopkins University (United States)

Author keywords

Acoustic Modeling; Deep Learning; Maxout Networks; Speech Recognition

Indexed keywords

CONTINUOUS SPEECH RECOGNITION; CONTROL NONLINEARITIES; SPEECH RECOGNITION

ACOUSTIC MODEL; DEEP LEARNING; DEEP NEURAL NETWORKS; LARGE VOCABULARY CONTINUOUS SPEECH RECOGNITION; LINEAR UNITS

SIGNAL PROCESSING

EID: 84905239342 PISSN: 15206149 EISSN: None Source Type: Conference Proceeding
DOI: 10.1109/ICASSP.2014.6853589 Document Type: Conference Paper

Times cited : (272)

References (29)

1
- 85032751458
- Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups
- Geoffrey Hinton, Li Deng, Dong Yu, George E Dahl, Abdelrahman Mohamed, Navdeep Jaitly, Andrew Senior, Vincent Vanhoucke, Patrick Nguyen, Tara N Sainath, et al., "Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups," Signal Processing Magazine, IEEE, vol. 29, no. 6, pp. 82-97, 2012.
- (2012) Signal Processing Magazine, IEEE , vol.29 , Issue.6 , pp. 82-97
- Hinton, G.¹ Deng, L.² Yu, D.³ Dahl, G.E.⁴ Mohamed, A.⁵ Jaitly, N.⁶ Senior, A.⁷ Vanhoucke, V.⁸ Nguyen, P.⁹ Sainath, T.N.¹⁰

2
- 84055222005
- Context-dependent pre-trained deep neural networks for large-vocabulary speech recognition
- George E Dahl, Dong Yu, Li Deng, and Alex Acero, "Context-dependent pre-trained deep neural networks for large-vocabulary speech recognition," Audio, Speech, and Language Processing, IEEE Transactions on, vol. 20, no. 1, pp. 30-42, 2012.
- (2012) Audio, Speech, and Language Processing, IEEE Transactions on , vol.20 , Issue.1 , pp. 30-42
- Dahl, G.E.¹ Yu, D.² Deng, L.³ Acero, A.⁴

3
- 84865801985
- Conversational speech transcription using context-dependent deep neural networks
- Frank Seide, Gang Li, and Dong Yu, "Conversational speech transcription using context-dependent deep neural networks," in INTERSPEECH, 2011, pp. 437-440.
- (2011) INTERSPEECH , pp. 437-440
- Seide, F.¹ Li, G.² Yu, D.³

4
- 84890471125
- On rectified linear units for speech processing
- MD Zeiler, M Ranzato, R Monga, M Mao, K Yang, QV Le, P Nguyen, A Senior, V Vanhoucke, J Dean, et al., "On rectified linear units for speech processing," in Proc. ICASSP, 2013.
- (2013) Proc. ICASSP
- Zeiler, M.D.¹ Ranzato, M.² Monga, R.³ Mao, M.⁴ Yang, K.⁵ Le, Q.V.⁶ Nguyen, P.⁷ Senior, A.⁸ Vanhoucke, V.⁹ Dean, J.¹⁰

5
- 84905286094
- Rectifier nonlinearities improve neural network acoustic models
- Andrew L Maas, Awni Y Hannun, and Andrew Y Ng, "Rectifier nonlinearities improve neural network acoustic models," in ICML Workshop on Deep Learning for Audio, Speech, and Language Processing (WDLASL 2013), 2013.
- (2013) ICML Workshop on Deep Learning for Audio, Speech, and Language Processing (WDLASL 2013)
- Maas, A.L.¹ Hannun, A.Y.² Ng, A.Y.³

6
- 84892421248
- Maxout networks
- Ian J Goodfellow, David Warde-Farley, Mehdi Mirza, Aaron Courville, and Yoshua Bengio, "Maxout networks," arXiv preprint arXiv:1302.4389, 2013.
- (2013) arXiv preprint arXiv:1302.4389
- Goodfellow, I.J.¹ Warde-Farley, D.² Mirza, M.³ Courville, A.⁴ Bengio, Y.⁵

7
- 84867720412
- Improving neural networks by preventing co-adaptation of feature detectors
- Geoffrey E Hinton, Nitish Srivastava, Alex Krizhevsky, Ilya Sutskever, and Ruslan R Salakhutdinov, "Improving neural networks by preventing co-adaptation of feature detectors," arXiv preprint arXiv:1207.0580, 2012.
- (2012) arXiv preprint arXiv:1207.0580
- Hinton, G.E.¹ Srivastava, N.² Krizhevsky, A.³ Sutskever, I.⁴ Salakhutdinov, R.R.⁵

8
- 84893701756
- Deep maxout networks for low resource speech recognition
- Y. Miao, S. Rawat, and F. Metze, "Deep maxout networks for low resource speech recognition," in Proc. ASRU, 2013.
- (2013) Proc. ASRU
- Miao, Y.¹ Rawat, S.² Metze, F.³

9
- 84893651518
- Deep maxout neural networks for speech recognition
- M. Cai, Y. Shi, and J. Liu, "Deep maxout neural networks for speech recognition," in Proc. ASRU, 2013.
- (2013) Proc. ASRU
- Cai, M.¹ Shi, Y.² Liu, J.³

10
- 70450177775
- Learning invariant features through topographic filter maps
- Koray Kavukcuoglu, M Ranzato, Rob Fergus, and Yann LeCun, "Learning invariant features through topographic filter maps," in Computer Vision and Pattern Recognition, 2009. CVPR 2009. IEEE Conference on. IEEE, 2009, pp. 1605-1612.
- (2009) Computer Vision and Pattern Recognition, 2009. CVPR 2009. IEEE Conference On. IEEE , pp. 1605-1612
- Kavukcuoglu, K.¹ Ranzato, M.² Fergus, R.³ LeCun, Y.⁴

11
- 77956502203
- A theoretical analysis of feature pooling in visual recognition
- Y-Lan Boureau, Jean Ponce, and Yann LeCun, "A theoretical analysis of feature pooling in visual recognition," in Proceedings of the 27th International Conference on Machine Learning (ICML-10), 2010, pp. 111-118.
- (2010) Proceedings of the 27th International Conference on Machine Learning (ICML-10) , pp. 111-118
- Boureau, Y.¹ Ponce, J.² LeCun, Y.³

12
- 84874575248
- Convolutional neural networks applied to house numbers digit classification
- Pierre Sermanet, Soumith Chintala, and Yann LeCun, "Convolutional neural networks applied to house numbers digit classification," in Pattern Recognition (ICPR), 2012 21st International Conference on. IEEE, 2012, pp. 3288-3291.
- (2012) Pattern Recognition (ICPR), 2012 21st International Conference On. IEEE , pp. 3288-3291
- Sermanet, P.¹ Chintala, S.² LeCun, Y.³

13
- 84905275608
- Learned-norm pooling for deep neural networks
- Caglar Gulcehre, Kyunghyun Cho, Razvan Pascanu, and Yoshua Bengio, "Learned-norm pooling for deep neural networks," arXiv preprint arXiv:1311.1780, 2013.
- (2013) arXiv preprint arXiv:1311.1780
- Gulcehre, C.¹ Cho, K.² Pascanu, R.³ Bengio, Y.⁴

14
- 84858953642
- The Kaldi speech recognition toolkit
- D. Povey, A. Ghoshal, et al., "The Kaldi Speech Recognition Toolkit," in Proc. ASRU, 2011.
- (2011) Proc. ASRU
- Povey, D.¹ Ghoshal, A.²

15
- 84906274730
- Sequence-discriminative training of deep neural networks
- Karel Veselý, Arnab Ghoshal, Lukáš Burget, and Daniel Povey, "Sequence-discriminative training of deep neural networks," in Interspeech, 2013.
- (2013) Interspeech
- Veselý, K.¹ Ghoshal, A.² Burget, L.³ Povey, D.⁴

16
- 51449120120
- Boosted MMI for feature and model space discriminative training
- D. Povey, D. Kanevsky, B. Kingsbury, B. Ramabhadran, G. Saon, and K. Visweswariah, "Boosted MMI for Feature and Model Space Discriminative Training," in ICASSP, 2008.
- (2008) ICASSP
- Povey, D.¹ Kanevsky, D.² Kingsbury, B.³ Ramabhadran, B.⁴ Saon, G.⁵ Visweswariah, K.⁶

17
- 44949182698
- Hypothesis spaces for minimum Bayes risk training in large vocabulary speech recognition
- Gibson M. and Hain T., "Hypothesis Spaces For Minimum Bayes Risk Training In Large Vocabulary Speech Recognition," in Interspeech, 2006.
- (2006) Interspeech
- Gibson, M.¹ Hain, T.²

18
- 34547529083
- Evaluation of proposed modifications to MPE for large scale discriminative training
- Daniel Povey and Brian Kingsbury, "Evaluation of proposed modifications to MPE for large scale discriminative training," in ICASSP, 2007.
- (2007) ICASSP
- Povey, D.¹ Kingsbury, B.²

19
- 84864073449
- Greedy layer-wise training of deep networks
- Yoshua Bengio, Pascal Lamblin, Dan Popovici, and Hugo Larochelle, "Greedy layer-wise training of deep networks," Advances in neural information processing systems (NIPS), vol. 19, pp. 153, 2007.
- (2007) Advances in Neural Information Processing Systems (NIPS) , vol.19 , pp. 153
- Bengio, Y.¹ Lamblin, P.² Popovici, D.³ Larochelle, H.⁴

20
- 85162467517
- Hogwild!: A lock-free approach to parallelizing stochastic gradient descent
- Feng Niu, Benjamin Recht, Christopher Ré, and Stephen J Wright, "Hogwild!: A lock-free approach to parallelizing stochastic gradient descent," arXiv preprint arXiv:1106.5730, 2011.
- (2011) arXiv preprint arXiv:1106.5730
- Niu, F.¹ Recht, B.² Ré, C.³ Wright, S.J.⁴

21
- 84877760312
- Large scale distributed deep networks
- Jeffrey Dean, Greg S. Corrado, Rajat Monga, Kai Chen, Matthieu Devin, Quoc V. Le, Mark Z. Mao, Marc'Aurelio Ranzato, Andrew Senior, Paul Tucker, Ke Yang, and Andrew Y. Ng, "Large Scale Distributed Deep Networks," in Neural Information Processing Systems (NIPS), 2012.
- (2012) Neural Information Processing Systems (NIPS)
- Dean, J.¹ Corrado, G.S.² Monga, R.³ Chen, K.⁴ Devin, M.⁵ Le, Q.V.⁶ Mao, M.Z.⁷ Ranzato, M.⁸ Senior, A.⁹ Tucker, P.¹⁰ Yang, K.¹¹ Ng, A.Y.¹²

22
- 0032629928
- Statistical analysis of learning dynamics
- Noboru Murata and Shun-ichi Amari, "Statistical analysis of learning dynamics," Signal Processing, vol. 74, no. 1, pp. 3-28, 1999.
- (1999) Signal Processing , vol.74 , Issue.1 , pp. 3-28
- Murata, N.¹ Amari, S.²

23
- 78650085692
- Adaptive online gradient descent
- Elad Hazan, Alexander Rakhlin, and Peter L Bartlett, "Adaptive online gradient descent," in Advances in Neural Information Processing Systems (NIPS), 2007, pp. 65-72.
- (2007) Advances in Neural Information Processing Systems (NIPS) , pp. 65-72
- Hazan, E.¹ Rakhlin, A.² Bartlett, P.L.³

24
- 84887388950
- An empirical study of learning rates in deep neural networks for speech recognition
- Andrew Senior, Georg Heigold, Marc'Aurelio Ranzato, and Ke Yang, "An empirical study of learning rates in deep neural networks for speech recognition," in Acoustics, Speech and Signal Processing (ICASSP), 2013.
- (2013) Acoustics, Speech and Signal Processing (ICASSP)
- Senior, A.¹ Heigold, G.² Ranzato, M.³ Yang, K.⁴

25
- 84862294866
- Deep sparse rectifier networks
- Xavier Glorot, Antoine Bordes, and Yoshua Bengio, "Deep sparse rectifier networks," in Proceedings of the 14th International Conference on Artificial Intelligence and Statistics. JMLR W&CP Volume, 2011, vol. 15, pp. 315-323.
- (2011) Proceedings of the 14th International Conference on Artificial Intelligence and Statistics. JMLR W&CP , vol.15 , pp. 315-323
- Glorot, X.¹ Bordes, A.² Bengio, Y.³

26
- 78049502526
- The subspace Gaussian mixture model-a structured model for speech recognition
- D. Povey, L. Burget, et al., "The Subspace Gaussian Mixture Model-A Structured Model for Speech Recognition," Computer Speech & Language, vol. 25, no. 2, pp. 404-439, April 2011.
- (2011) Computer Speech & Language , vol.25 , Issue.2 , pp. 404-439
- Povey, D.¹ Burget, L.²

27
- 0030263447
- Mean and variance adaptation within the MLLR framework
- M. J. F. Gales and P. C. Woodland, "Mean and Variance Adaptation Within the MLLR Framework," Computer Speech and Language, vol. 10, pp. 249-264, 1996.
- (1996) Computer Speech and Language , vol.10 , pp. 249-264
- Gales, M.J.F.¹ Woodland, P.C.²

28
- 84878410974
- Noise robust pitch tracking by subband autocorrelation classification
- Daniel P. W. Ellis and Byung Suk Lee, "Noise robust pitch tracking by subband autocorrelation classification," in 13th Annual Conference of the International Speech Communication Association, 2012.
- (2012) 13th Annual Conference of the International Speech Communication Association
- Ellis, D.P.W.¹ Lee, B.S.²

29
- 84893672075
- The fundamental frequency variation spectrum
- Kornel Laskowski, Mattias Heldner, and Jens Edlund, "The fundamental frequency variation spectrum," in FONETIK, 2008.
- (2008) FONETIK
- Laskowski, K.¹ Heldner, M.² Edlund, J.³

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.