SCOPUS Information Retrieval Platform

ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings

2014, Pages 6844-6848

Sequence classification using the high-level features extracted from deep neural networks

Deng, Li (a); Chen, Jianshu (b)

(a) Microsoft Research (United States)

(b) University of California (United States)

Author keywords

ARMA recurrent neural net; deep neural net; feature extraction; phone recognition

Indexed keywords

FEATURE EXTRACTION; RECURRENT NEURAL NETWORKS; SPEECH RECOGNITION;

ACOUSTIC DATA; DEEP NEURAL NETWORKS; FEATURE VECTORS; HIDDEN LAYERS; HIGH-LEVEL FEATURES; NEURAL NET; PHONE RECOGNITION; SEQUENCE CLASSIFICATION;

SIGNAL PROCESSING;

EID: 84905280906 PISSN: 15206149 EISSN: None Source Type: Conference Proceeding
DOI: 10.1109/ICASSP.2014.6854926 Document Type: Conference Paper

Times cited : (44)

References (37)

1
- 84890543516
- Advances in optimizing recurrent networks
- Bengio, Y., Boulanger-Lewandowski, N., and Pascanu, R. "Advances in optimizing recurrent networks," Proc. ICASSP, 2013.
- (2013) Proc. ICASSP
- Bengio, Y.¹ Boulanger-Lewandowski, N.² Pascanu, R.³

2
- 0003573244
- Bourlard H. and Morgan, N.. Connectionist Speech Recognition: A Hybrid Approach, Kluwer, 1993.
- (1993) Connectionist Speech Recognition: A Hybrid Approach, Kluwer
- Bourlard, H.¹ Morgan, N.²

3
- 85083950550
- A Primal-Dual Method for Training Recurrent Neural Networks Constrained by the Echo-State Property
- Chen, J. and Deng, L. "A Primal-Dual Method for Training Recurrent Neural Networks Constrained by the Echo-State Property", Proc. ICLR, 2014.
- (2014) Proc. ICLR
- Chen, J.¹ Deng, L.²

4
- 80051616844
- Large vocabulary continuous speech recognition with context-dependent DBN-HMMs
- Dahl, G., Yu, D., Deng, L., and Acero, A. "Large vocabulary continuous speech recognition with context-dependent DBN-HMMs," Proc. ICASSP, 2011.
- (2011) Proc. ICASSP
- Dahl, G.¹ Yu, D.² Deng, L.³ Acero, A.⁴

5
- 84055222005
- Context-dependent pre-trained deep neural networks for large-vocabulary speech recognition
- Dahl, G., Yu, D., Deng, L., and Acero, A. "Context-dependent pre-trained deep neural networks for large-vocabulary speech recognition," IEEE Trans. on Audio, Speech, and Language Processing, vol. 20, pp. 30-42, 2012.
- (2012) IEEE Trans. on Audio, Speech, and Language Processing , vol.20 , pp. 30-42
- Dahl, G.¹ Yu, D.² Deng, L.³ Acero, A.⁴

6
- 84876672166
- Machine learning paradigms in speech recognition: An overview
- Deng, L. and Li, X. "Machine learning paradigms in speech recognition: An overview," IEEE Transactions on Audio, Speech, & Language, vol. 21, pp. 1060-1089, May 2013.
- (2013) IEEE Transactions on Audio, Speech, & Language , vol.21 , pp. 1060-1089
- Deng, L.¹ Li, X.²

7
- 84890545163
- A deep convolutional neural network using heterogeneous pooling for trading acoustic invariance with phonetic confusion
- Deng, L., Abdel-Hamid, O., and Yu, D. "A deep convolutional neural network using heterogeneous pooling for trading acoustic invariance with phonetic confusion," Proc. ICASSP, 2013.
- (2013) Proc. ICASSP
- Deng, L.¹ Abdel-Hamid, O.² Yu, D.³

8
- 84890526837
- New types of deep neural network learning for speech recognition and related applications: An overview
- Deng, L., Hinton, G., and Kingsbury, B. "New types of deep neural network learning for speech recognition and related applications: An overview," Proc. ICASSP, 2013a.
- (2013) Proc. ICASSP
- Deng, L.¹ Hinton, G.² Kingsbury, B.³

9
- 84890491198
- Recent advances in deep learning for speech research at Microsoft
- Deng, L., Li, J., Huang, J.-T., Yao, K., Yu, D., Seide, F., Seltzer, M., Zweig, G., He, X., Williams, J., Gong, Y., and Acero, A. "Recent advances in deep learning for speech research at Microsoft," Proc. ICASSP, 2013b.
- (2013) Proc. ICASSP
- Deng, L.¹ Li, J.² Huang, J.-T.³ Yao, K.⁴ Yu, D.⁵ Seide, F.⁶ Seltzer, M.⁷ Zweig, G.⁸ He, X.⁹ Williams, J.¹⁰ Gong, Y.¹¹ Acero, A.¹²

10
- 84867614591
- Scalable stacking and learning for building deep architectures
- Deng, L., Yu, D., and Platt, J. "Scalable stacking and learning for building deep architectures," Proc. ICASSP, 2012.
- (2012) Proc. ICASSP
- Deng, L.¹ Yu, D.² Platt, J.³

11
- 84865768819
- Deep Convex Network: A scalable architecture for speech pattern classification
- Deng, L. and Yu, D. "Deep Convex Network: A scalable architecture for speech pattern classification," Proc. Interspeech, 2011.
- (2011) Proc. Interspeech
- Deng, L.¹ Yu, D.²

12
- 79959842828
- Binary coding of speech spectrograms using a deep auto-encoder
- Deng, L., Seltzer, M., Yu, D., Acero, A., Mohamed, A., and Hinton, G. "Binary coding of speech spectrograms using a deep auto-encoder," Proc. Interspeech, 2010.
- (2010) Proc. Interspeech
- Deng, L.¹ Seltzer, M.² Yu, D.³ Acero, A.⁴ Mohamed, A.⁵ Hinton, G.⁶

13
- 4243117872
- Speech Processing: A Dynamic and Optimization-Oriented Approach
- Deng, L. and O'Shaughnessy, D. Speech Processing: A Dynamic and Optimization-Oriented Approach, Marcel Dekker, 2003.
- (2003) Speech Processing: A Dynamic and Optimization-Oriented Approach, Marcel Dekker
- Deng, L.¹ O'Shaughnessy, D.²

14
- 84890543083
- Speech recognition with deep recurrent neural networks
- Graves, A., Mohamed, A., and Hinton, G. "Speech recognition with deep recurrent neural networks," Proc. ICASSP, 2013.
- (2013) Proc. ICASSP
- Graves, A.¹ Mohamed, A.² Hinton, G.³

15
- 84893701254
- Hybrid speech recognition with deep bidirectional LSTM
- Graves, A., Jaitly, N., and Mohamed, A. "Hybrid speech recognition with deep bidirectional LSTM," Proc. ASRU, 2013a.
- (2013) Proc. ASRU
- Graves, A.¹ Jaitly, N.² Mohamed, A.³

16
- 34547548235
- Probabilistic and bottleneck features for LVCSR of meetings
- Grezl, F., Karafiat, M., Kontar, S., and Cernocky, J. "Probabilistic and bottleneck features for LVCSR of meetings," Proc. ICASSP, 2007.
- (2007) Proc. ICASSP
- Grezl, F.¹ Karafiat, M.² Kontar, S.³ Cernocky, J.⁴

17
- 85008035419
- Equivalence of generative and log-linear models
- Heigold, G., Ney, H., Lehnen, P., Gass, T., and Schluter, R. "Equivalence of generative and log-linear models," IEEE Trans. Audio, Speech, and Language Proc., vol. 19, February 2011, pp. 1138-1148.
- (2011) IEEE Trans. Audio, Speech, and Language Proc. , vol.19 , pp. 1138-1148
- Heigold, G.¹ Ney, H.² Lehnen, P.³ Gass, T.⁴ Schluter, R.⁵

18
- 0033709098
- Tandem connectionist feature extraction for conventional HMM systems
- Hermansky, H., Ellis, D., and Sharma, S. "Tandem connectionist feature extraction for conventional HMM systems," Proc. ICASSP, 2000.
- (2000) Proc. ICASSP
- Hermansky, H.¹ Ellis, D.² Sharma, S.³

19
- 85032751458
- Deep neural networks for acoustic modeling in speech recognition
- Hinton, G., Deng, L., Yu, D., Dahl, G. E., Mohamed, A., Jaitly, N., Senior, A., Vanhoucke, V., Nguyen, P., Sainath, T. N., and Kingsbury, B. "Deep neural networks for acoustic modeling in speech recognition," IEEE Signal Processing Magazine, vol. 29, November 2012, pp. 82-97.
- (2012) IEEE Signal Processing Magazine , vol.29 , pp. 82-97
- Hinton, G.¹ Deng, L.² Yu, D.³ Dahl, G.E.⁴ Mohamed, A.⁵ Jaitly, N.⁶ Senior, A.⁷ Vanhoucke, V.⁸ Nguyen, P.⁹ Sainath, T.N.¹⁰ Kingsbury, B.¹¹

20
- 84879301618
- Tensor deep stacking networks
- Hutchinson, B., Deng, L., and Yu, D. "Tensor deep stacking networks," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 35, 2013, pp. 1944-1957.
- (2013) IEEE Trans. Pattern Analysis and Machine Intelligence , vol.35 , pp. 1944-1957
- Hutchinson, B.¹ Deng, L.² Yu, D.³

21
- 84878539964
- Application of pre-trained deep neural networks to large vocabulary speech recognition
- Jaitly, N., Nguyen, P., Senior, A., and Vanhoucke, V. "Application of pre-trained deep neural networks to large vocabulary speech recognition," Proc. Interspeech, 2012.
- (2012) Proc. Interspeech
- Jaitly, N.¹ Nguyen, P.² Senior, A.³ Vanhoucke, V.⁴

22
- 84878379108
- Scalable minimum Bayes risk training of deep neural network acoustic models using distributed Hessian-free optimization
- Kingsbury, B., Sainath, T., and Soltau, H. "Scalable minimum Bayes risk training of deep neural network acoustic models using distributed Hessian-free optimization," Proc. Interspeech, 2012.
- (2012) Proc. Interspeech
- Kingsbury, B.¹ Sainath, T.² Soltau, H.³

23
- 84055211743
- Acoustic modeling using deep belief networks
- Mohamed, A., Dahl, G., and Hinton, G. "Acoustic modeling using deep belief networks," IEEE Transactions Audio, Speech, & Language Proc., vol. 20, no. 1, January 2012.
- (2012) IEEE Transactions Audio, Speech, & Language Proc. , vol.20 , Issue.1
- Mohamed, A.¹ Dahl, G.² Hinton, G.³

24
- 79959840616
- Investigation of full-sequence training of deep belief networks for speech recognition
- Mohamed, A., Yu, D., and Deng, L. "Investigation of full-sequence training of deep belief networks for speech recognition," Proc. Interspeech, 2010.
- (2010) Proc. Interspeech
- Mohamed, A.¹ Yu, D.² Deng, L.³

25
- 84255177123
- Deep and wide: Multiple layers in automatic speech recognition
- Morgan, N. "Deep and wide: Multiple layers in automatic speech recognition," IEEE Transactions on Audio, Speech, and Language Proc., vol. 20, January 2012.
- (2012) IEEE Transactions on Audio, Speech, and Language Proc. , vol.20
- Morgan, N.¹

26
- 84897497795
- On the difficulty of training recurrent neural networks
- Pascanu, R., Mikolov, T., and Bengio, Y. "On the difficulty of training recurrent neural networks," Proc. ICML, 2013.
- (2013) Proc. ICML
- Pascanu, R.¹ Mikolov, T.² Bengio, Y.³

27
- 84867593213
- Auto-encoder bottleneck features using deep belief networks
- Sainath, T., Kingsbury, B., and Ramabhadran, B. "Auto-encoder bottleneck features using deep belief networks," Proc. ICASSP, 2012.
- (2012) Proc. ICASSP
- Sainath, T.¹ Kingsbury, B.² Ramabhadran, B.³

28
- 84886829539
- Optimization techniques to improve training speed of deep neural networks for large speech tasks
- Sainath, T., Kingsbury, B., Soltau, H., and Ramabhadran, B. "Optimization techniques to improve training speed of deep neural networks for large speech tasks," IEEE Transactions on Audio, Speech, and Language Processing, vol. 21, Nov. 2013, pp. 2267-2276.
- (2013) IEEE Transactions on Audio, Speech, and Language Processing , vol.21 , pp. 2267-2276
- Sainath, T.¹ Kingsbury, B.² Soltau, H.³ Ramabhadran, B.⁴

29
- 84865801985
- Conversational speech transcription using context-dependent deep neural networks
- Seide, F., Li, G., and Yu, D. "Conversational speech transcription using context-dependent deep neural networks," Proc. Interspeech, 2011.
- (2011) Proc. Interspeech
- Seide, F.¹ Li, G.² Yu, D.³

30
- 84890543852
- Error back propagation for sequence training of context-dependent deep neural networks for conversational speech transcription
- Su, H., Li, G., Yu, D., and Seide, F. "Error back propagation for sequence training of context-dependent deep neural networks for conversational speech transcription," Proc. ICASSP, 2013.
- (2013) Proc. ICASSP
- Su, H.¹ Li, G.² Yu, D.³ Seide, F.⁴

31
- 84878403164
- Context dependent MLPs for LVCSR: Tandem, hybrid or both?
- Tuske, Z., Sundermeyer, M., Schluter, R., and Ney, H. "Context dependent MLPs for LVCSR: Tandem, hybrid or both?" Proc. Interspeech, 2012.
- (2012) Proc. Interspeech
- Tuske, Z.¹ Sundermeyer, M.² Schluter, R.³ Ney, H.⁴

32
- 84905262984
- Sequence-discriminative training of deep neural networks
- Vesely, K., Ghoshal, A., Burget, L., and Povey, D. "Sequence-discriminative training of deep neural networks," Proc. ICASSP, 2013.
- (2013) Proc. ICASSP
- Vesely, K.¹ Ghoshal, A.² Burget, L.³ Povey, D.⁴

33
- 84877777313
- Learning with recursive perceptual representations
- Vinyals, O., Jia, Y., Deng, L., and Darrell, T. "Learning with recursive perceptual representations," Proc. Neural Information Processing Systems (NIPS), vol. 15, 2012.
- (2012) Proc. Neural Information Processing Systems (NIPS) , vol.15
- Vinyals, O.¹ Jia, Y.² Deng, L.³ Darrell, T.⁴

34
- 84887037596
- Optimization algorithms and applications for speech and language processing
- Wright, S., Kanevsky, D., Deng, L., He, X., Heigold, G., Li, H. "Optimization Algorithms and Applications for Speech and Language Processing," IEEE Transactions on Audio, Speech, and Language Processing, vol. 21, no. 11, pp. 2231-2243, 2013.
- (2013) IEEE Transactions on Audio, Speech, and Language Processing , vol.21 , Issue.11 , pp. 2231-2243
- Wright, S.¹ Kanevsky, D.² Deng, L.³ He, X.⁴ Heigold, G.⁵ Li, H.⁶

35
- 84906225757
- A scalable approach to using DNN-derived features in GMM-HMM based acoustic modeling for LVCSR
- Yan, Z., Huo, Q., and Xu, J. "A scalable approach to using DNN-derived features in GMM-HMM based acoustic modeling for LVCSR," Proc. Interspeech, 2013.
- (2013) Proc. Interspeech
- Yan, Z.¹ Huo, Q.² Xu, J.³

36
- 84871387302
- The deep tensor neural network with applications to large vocabulary speech recognition
- Yu, D., Deng, L., and Seide, F. "The deep tensor neural network with applications to large vocabulary speech recognition," IEEE Transactions on Audio, Speech, and Language Processing, vol. 21, no. 2, pp. 388-396, 2013.
- (2013) IEEE Transactions on Audio, Speech, and Language Processing , vol.21 , Issue.2 , pp. 388-396
- Yu, D.¹ Deng, L.² Seide, F.³

37
- 84055163920
- Roles of pre-training and fine-tuning in context-dependent DBN-HMMs for real-world speech recognition
- Yu, D., Deng, L., and Dahl, G. "Roles of Pre-Training and Fine-Tuning in Context-Dependent DBN-HMMs for Real-World Speech Recognition," Proc. NIPS Workshop, 2010.
- (2010) Proc. NIPS Workshop
- Yu, D.¹ Deng, L.² Dahl, G.³

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.