SCOPUS Information Search Platform

Volume 2015-August, 2015, Pages 4545-4549

Data augmentation for deep convolutional neural network acoustic modeling

Author keywords

bottleneck features; convolutional neural networks; data augmentation; stochastic feature mapping; vocal tract length perturbation

Indexed keywords

AUDIO SIGNAL PROCESSING; CONVOLUTION; MAPPING; NEURAL NETWORKS; SPEECH COMMUNICATION; STOCHASTIC MODELS; STOCHASTIC SYSTEMS;

BOTTLENECK FEATURES; CONVOLUTIONAL NEURAL NETWORK; DATA AUGMENTATION; STOCHASTIC FEATURES; VOCAL TRACT LENGTHS;

DEEP NEURAL NETWORKS;

EID: 84933584545 PISSN: 15206149 EISSN: None Source Type: Conference Proceeding
DOI: 10.1109/ICASSP.2015.7178831 Document Type: Conference Paper

Times cited: (53)

References (15)

1
- 0032203257
- Gradient-based learning applied to document recognition
- Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, "Gradient-based learning applied to document recognition," Proceedings of the IEEE, vol. 86, no. 11, pp. 2278-2324, 1998
- (1998) Proceedings of the IEEE , vol.86 , Issue.11 , pp. 2278-2324
- LeCun, Y.¹ Bottou, L.² Bengio, Y.³ Haffner, P.⁴

2
- 84945900998
- Best practices for convolutional neural networks applied to visual document analysis
- P. Y. Simard, D. Steinkraus, and J. C. Platt, "Best practices for convolutional neural networks applied to visual document analysis," in International Conference on Document Analysis and Recognition (ICDAR), 2003, pp. 958-963
- (2003) International Conference on Document Analysis and Recognition (ICDAR) , pp. 958-963
- Simard, P.Y.¹ Steinkraus, D.² Platt, J.C.³

3
- 84893681011
- Vocal tract length perturbation (VTLP) improves speech recognition
- N. Jaitly and G. E. Hinton, "Vocal tract length perturbation (VTLP) improves speech recognition," in International Conference on Machine Learning (ICML) Workshop on Deep Learning for Audio, Speech, and Language Processing, 2013
- (2013) International Conference on Machine Learning (ICML) Workshop on Deep Learning for Audio, Speech, and Language Processing
- Jaitly, N.¹ Hinton, G.E.²

4
- 84876231242
- ImageNet classification with deep convolutional neural networks
- A. Krizhevsky, I. Sutskever, and G. E. Hinton, "ImageNet classification with deep convolutional neural networks," in Neural Information Processing Systems (NIPS), 2012, pp. 1106-1114
- (2012) Neural Information Processing Systems (NIPS) , pp. 1106-1114
- Krizhevsky, A.¹ Sutskever, I.² Hinton, G.E.³

5
- 85009168223
- Noise robustness in speech to speech translation
- F.-H. Liu, Y. Gao, L. Gu, and M. Picheny, "Noise robustness in speech to speech translation," in Eurospeech, 2003
- (2003) Eurospeech
- Liu, F.-H.¹ Gao, Y.² Gu, L.³ Picheny, M.⁴

6
- 0024631285
- Distance measures for speech recognition
- M. J. Hunt and C. Lefebvre, "Distance measures for speech recognition," Aeronautical Note, NAE-AN-57, 1989
- (1989) Aeronautical Note , vol.NAE-AN-57
- Hunt, M.J.¹ Lefebvre, C.²

7
- 84905247925
- Data augmentation for deep neural network acoustic modeling
- X. Cui, V. Goel, and B. Kingsbury, "Data augmentation for deep neural network acoustic modeling," in International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2014, pp. 5582-5586
- (2014) International Conference on Acoustics, Speech and Signal Processing (ICASSP) , pp. 5582-5586
- Cui, X.¹ Goel, V.² Kingsbury, B.³

8
- 0031647824
- A frequency warping approach to speaker normalization
- L. Lee and R. Rose, "A frequency warping approach to speaker normalization," IEEE Transactions on Speech and Audio Processing, vol. 6, no. 1, pp. 49-60, 1998
- (1998) IEEE Transactions on Speech and Audio Processing , vol.6 , Issue.1 , pp. 49-60
- Lee, L.¹ Rose, R.²

9
- 0032638856
- Semi-tied covariance matrices for hidden Markov models
- M. J. F. Gales, "Semi-tied covariance matrices for hidden Markov models," IEEE Transactions on Speech and Audio Processing, vol. 7, no. 3, pp. 272-281, 1999
- (1999) IEEE Transactions on Speech and Audio Processing , vol.7 , Issue.3 , pp. 272-281
- Gales, M.J.F.¹

10
- 79951796005
- The IBM Attila speech recognition toolkit
- H. Soltau, G. Saon, and B. Kingsbury, "The IBM Attila speech recognition toolkit," in Spoken Language Technology Workshop (SLT), 2010, pp. 97-101
- (2010) Spoken Language Technology Workshop (SLT) , pp. 97-101
- Soltau, H.¹ Saon, G.² Kingsbury, B.³

11
- 0029288633
- Maximum likelihood linear regression for speaker adaptation of continuous density hidden Markov models
- C. J. Leggetter and P. C. Woodland, "Maximum likelihood linear regression for speaker adaptation of continuous density hidden Markov models," Computer Speech and Language, vol. 9, pp. 171-185, 1995
- (1995) Computer Speech and Language , vol.9 , pp. 171-185
- Leggetter, C.J.¹ Woodland, P.C.²

12
- 0032050110
- Maximum likelihood linear transformations for HMM-based speech recognition
- M. J. F. Gales, "Maximum likelihood linear transformations for HMM-based speech recognition," Computer Speech and Language, vol. 12, pp. 75-98, 1998
- (1998) Computer Speech and Language , vol.12 , pp. 75-98
- Gales, M.J.F.¹

13
- 84893654379
- Improvements to deep convolutional neural networks for LVCSR
- T. N. Sainath, B. Kingsbury, A.-r. Mohamed, G. E. Dahl, G. Saon, H. Soltau, T. Beran, A. Y. Aravkin, and B. Ramabhadran, "Improvements to deep convolutional neural networks for LVCSR," in Automatic Speech Recognition and Understanding Workshop (ASRU), 2013, pp. 315-320
- (2013) Automatic Speech Recognition and Understanding Workshop (ASRU) , pp. 315-320
- Sainath, T.N.¹ Kingsbury, B.² Mohamed, A.R.³ Dahl, G.E.⁴ Saon, G.⁵ Soltau, H.⁶ Beran, T.⁷ Aravkin, A.Y.⁸ Ramabhadran, B.⁹

14
- 84910071142
- BUT 2014 Babel system: Analysis of adaptation in NN based systems
- M. Karafiat, F. Grezl, K. Vesely, M. Hannemann, I. Szoke, and J. Cernocky, "BUT 2014 Babel system: Analysis of adaptation in NN based systems," in Interspeech, 2014
- (2014) Interspeech
- Karafiat, M.¹ Grezl, F.² Vesely, K.³ Hannemann, M.⁴ Szoke, I.⁵ Cernocky, J.⁶

15
- 84878379108
- Scalable minimum Bayes risk training of deep neural network acoustic models using distributed Hessian-free optimization
- B. Kingsbury, T. N. Sainath, and H. Soltau, "Scalable minimum Bayes risk training of deep neural network acoustic models using distributed Hessian-free optimization," in Interspeech, 2012
- (2012) Interspeech
- Kingsbury, B.¹ Sainath, T.N.² Soltau, H.³

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.