SCOPUS Information Search Platform

ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings

Volume , Issue , 2013, Pages 6669-6673

A deep convolutional neural network using heterogeneous pooling for trading acoustic invariance with phonetic confusion

(3) Deng, Li a; Abdel Hamid, Ossama b; Yu, Dong a

a MICROSOFT RESEARCH (United States)

b YORK UNIVERSITY (Canada)

Author keywords

convolution; deep; discrimination; formants; heterogeneous pooling; invariance; neural network

Indexed keywords

CONVOLUTIONAL NEURAL NETWORK; DEEP; DISCRIMINATION; EXPERIMENTAL EVALUATION; FORMANTS; HETEROGENEOUS POOLING; LARGE VOCABULARY SPEECH RECOGNITION; PHONETIC RECOGNITION;

CONTINUOUS SPEECH RECOGNITION; CONVOLUTION; INVARIANCE; NETWORK ARCHITECTURE; SIGNAL PROCESSING;

NEURAL NETWORKS;

EID: 84890545163 PISSN: 15206149 EISSN: None Source Type: Conference Proceeding
DOI: 10.1109/ICASSP.2013.6638952 Document Type: Conference Paper

Times cited : (171)

References (37)

1
- 84867605836
- Applying convolutional neural networks concepts to hybrid NN-HMM model for speech recognition
- O. Abdel-Hamid, A. Mohamed, H. Jiang, and G. Penn, "Applying convolutional neural networks concepts to hybrid NN-HMM model for speech recognition, " ICASSP, 2012
- (2012) ICASSP
- Abdel-Hamid, O.¹ Mohamed, A.² Jiang, H.³ Penn, G.⁴

2
- 84878397276
- Pipelined back-propagation for context-dependent deep neural networks
- X. Chen, A. Eversole, G. Li, D. Yu, and F. Seide, "Pipelined back-propagation for context-dependent deep neural networks, " Interspeech, 2012
- (2012) Interspeech
- Chen, X.¹ Eversole, A.² Li, G.³ Yu, D.⁴ Seide, F.⁵

3
- 80051616844
- Large vocabulary continuous speech recognition with context-dependent DBN-HMMs
- G. Dahl, D. Yu, L. Deng. "Large vocabulary continuous speech recognition with context-dependent DBN-HMMs, " ICASSP, 2011
- (2011) ICASSP
- Dahl, G.¹ Yu, D.² Deng, L.³

4
- 84055222005
- Context-dependent pre-trained deep neural networks for large vocabulary speech recognition
- G. Dahl, D. Yu, L. Deng, and A. Acero. "Context-dependent pre-trained deep neural networks for large vocabulary speech recognition. " IEEE Trans. Speech and Audio Proc., vol. 20, no. 1, pp. 30-42, 2012
- (2012) IEEE Trans. Speech and Audio Proc. , vol.20 , Issue.1 , pp. 30-42
- Dahl, G.¹ Yu, D.² Deng, L.³ Acero, A.⁴

5
- 84877760312
- Large scale distributed deep networks
- J. Dean, G. Corrado, R. Monga, K. Chen, M. Devin, Q. Le, M. Mao, M. Ranzato, A. Senior, P. Tucker, K. Yang, and A. Ng. "Large scale distributed deep networks, " NIPS, 2012
- (2012) NIPS
- Dean, J.¹ Corrado, G.² Monga, R.³ Chen, K.⁴ Devin, M.⁵ Le, Q.⁶ Mao, M.⁷ Ranzato, M.⁸ Senior, A.⁹ Tucker, P.¹⁰ Yang, K.¹¹ Ng, A.¹²

6
- 84890491198
- Recent advances of deep learning for speech research at Microsoft
- L. Deng, J. Li, J. Huang, K. Yao, D. Yu, F. Seide, M. Seltzer, G. Zweig, X. He, J. Williams, Y. Gong, and A. Acero. "Recent advances of deep learning for speech research at Microsoft, " ICASSP, 2013
- (2013) ICASSP
- Deng, L.¹ Li, J.² Huang, J.³ Yao, K.⁴ Yu, D.⁵ Seide, F.⁶ Seltzer, M.⁷ Zweig, G.⁸ He, X.⁹ Williams, J.¹⁰ Gong, Y.¹¹ Acero, A.¹²

7
- 84867614591
- Scalable stacking and learning for building deep architectures
- L. Deng, D. Yu, and J. Platt. "Scalable stacking and learning for building deep architectures, " ICASSP, 2012
- (2012) ICASSP
- Deng, L.¹ Yu, D.² Platt, J.³

8
- 79959842828
- Binary coding of speech spectrograms using a deep auto-encoder
- L. Deng, M. Seltzer, D. Yu, A. Acero, A. Mohamed, and G. Hinton, "Binary coding of speech spectrograms using a deep auto-encoder, " Interspeech, 2010
- (2010) Interspeech
- Deng, L.¹ Seltzer, M.² Yu, D.³ Acero, A.⁴ Mohamed, A.⁵ Hinton, G.⁶

9
- 84890534540
- Use of kernel deep convex networks and end-to-end learning for spoken language understanding
- L. Deng, G. Tur, X. He, and D. Hakkani-Tur, "Use of kernel deep convex networks and end-to-end learning for spoken language understanding, " IEEE SLT, 2012.
- (2012) IEEE SLT
- Deng, L.¹ Tur, G.² He, X.³ Hakkani-Tur, D.⁴

10
- 0033623527
- Spontaneous speech recognition using a statistical coarticulatory model for the vocal tract resonance dynamics
- L. Deng and J. Ma, "Spontaneous speech recognition using a statistical coarticulatory model for the vocal tract resonance dynamics, " J. Acoust. Soc. Am., vol. 108, pp. 3036-3048, 2000
- (2000) J. Acoust. Soc. Am. , vol.108 , pp. 3036-3048
- Deng, L.¹ Ma, J.²

11
- 33744966561
- A bidirectional target filtering model of speech coarticulation: Two-stage implementation for phonetic recognition
- L. Deng, D. Yu, and A. Acero. "A bidirectional target filtering model of speech coarticulation: Two-stage implementation for phonetic recognition, " IEEE Transactions on Audio and Speech Processing, vol. 14, pp. 256-265, 2006
- (2006) IEEE Transactions on Audio and Speech Processing , vol.14 , pp. 256-265
- Deng, L.¹ Yu, D.² Acero, A.³

12
- 34047266395
- Structured speech modeling
- L. Deng, D. Yu, and A. Acero. "Structured speech modeling, " IEEE Trans. on Audio, Speech and Language Processing, vol. 14, no. 5, pp. 1492-1504, 2006.
- (2006) IEEE Trans. on Audio, Speech and Language Processing , vol.14 , Issue.5 , pp. 1492-1504
- Deng, L.¹ Yu, D.² Acero, A.³

13
- 34547551709
- Use of differential cepstra as acoustic features in hidden trajectory modeling for phonetic recognition
- L. Deng and D. Yu. "Use of differential cepstra as acoustic features in hidden trajectory modeling for phonetic recognition, " ICASSP, 2007
- (2007) ICASSP
- Deng, L.¹ Yu, D.²

14
- 4243117872
- SPEECH PROCESSING - A Dynamic and Optimization-Oriented Approach
- L. Deng and D. O'Shaughnessy, SPEECH PROCESSING - A Dynamic and Optimization-Oriented Approach, Publisher: Marcel Dekker Inc., June 2003
- (2003) SPEECH PROCESSING - A Dynamic and Optimization-Oriented Approach
- Deng, L.¹ O'Shaughnessy, D.²

15
- 84890468916
- Deep learning for speech recognition and related applications
- L. Deng, D. Yu, and G. Hinton. "Deep Learning for Speech Recognition and Related Applications, " NIPS Workshop, 2009 http://nips.cc/Conferences/2009/Program/event.php?ID=1512
- (2009) NIPS Workshop
- Deng, L.¹ Yu, D.² Hinton, G.³

16
- 84890526837
- New types of deep neural network learning for speech recognition and related applications: An overview
- L. Deng, G. Hinton, and B. Kingsbury. "New types of deep neural network learning for speech recognition and related applications: An overview, " ICASSP, 2013
- (2013) ICASSP
- Deng, L.¹ Hinton, G.² Kingsbury, B.³

17
- 85032751458
- Deep neural networks for acoustic modeling in speech recognition
- Nov
- G. Hinton, L. Deng, D. Yu, G. Dahl, A.-R. Mohamed, N. Jaitly, A. Senior, V. Vanhoucke, P. Nguyen, T. Sainath, and B. Kingsbury. "Deep neural networks for acoustic modeling in speech recognition, " IEEE Signal Processing Magazine, Vol. 29, No. 6, pp. 82-97, Nov., 2012
- (2012) IEEE Signal Processing Magazine , vol.29 , Issue.6 , pp. 82-97
- Hinton, G.¹ Deng, L.² Yu, D.³ Dahl, G.⁴ Mohamed, A.-R.⁵ Jaitly, N.⁶ Senior, A.⁷ Vanhoucke, V.⁸ Nguyen, P.⁹ Sainath, T.¹⁰ Kingsbury, B.¹¹

18
- 84867720412
- Improving neural networks by preventing coadaptation of feature detectors
- G. Hinton, N. Srivastava, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov. "Improving neural networks by preventing coadaptation of feature detectors, " arXiv:1207.0580v1, 2012
- (2012) arXiv:1207.0580v1
- Hinton, G.¹ Srivastava, N.² Krizhevsky, A.³ Sutskever, I.⁴ Salakhutdinov, R.⁵

19
- 84879301618
- Tensor deep stacking networks
- to appear
- B. Hutchinson, L. Deng, and D. Yu, "Tensor deep stacking networks, " IEEE Trans. Pattern Analysis and Machine Intelligence (special issue of Learning Deep Architectures), 2013, to appear
- (2013) IEEE Trans. Pattern Analysis and Machine Intelligence (Special Issue of Learning Deep Architectures)
- Hutchinson, B.¹ Deng, L.² Yu, D.³

20
- 84878539964
- Application of pretrained deep neural networks to large vocabulary speech recognition
- N. Jaitly, P. Nguyen, and V. Vanhoucke, "Application of pretrained deep neural networks to large vocabulary speech recognition, " Interspeech, 2012
- (2012) Interspeech
- Jaitly, N.¹ Nguyen, P.² Vanhoucke, V.³

21
- 84878379108
- Scalable minimum Bayes risk training of deep neural network acoustic models using distributed Hessian-free optimization
- B. Kingsbury, T. N. Sainath, and H. Soltau. "Scalable minimum Bayes risk training of deep neural network acoustic models using distributed Hessian-free optimization, " Interspeech, 2012
- (2012) Interspeech
- Kingsbury, B.¹ Sainath, T.N.² Soltau, H.³

22
- 84876231242
- ImageNet classification with deep convolutional neural networks
- A. Krizhevsky, I. Sutskever, and G. Hinton. "ImageNet classification with deep convolutional neural networks, " NIPS, 2012
- (2012) NIPS
- Krizhevsky, A.¹ Sutskever, I.² Hinton, G.³

23
- 85161972005
- Tiled convolutional neural networks
- Q. Le, J. Ngiam, Z. Chen, D. Chia, W. Pang, and A. Ng. "Tiled convolutional neural networks, " NIPS, 2010
- (2010) NIPS
- Le, Q.¹ Ngiam, J.² Chen, Z.³ Chia, D.⁴ Pang, W.⁵ Ng, A.⁶

24
- 0032203257
- Gradient-based learning applied to document recognition
- Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner. "Gradient-based learning applied to document recognition, " Proceedings of the IEEE, pp. 2278-2324, 1998
- (1998) Proceedings of the IEEE , pp. 2278-2324
- Lecun, Y.¹ Bottou, L.² Bengio, Y.³ Haffner, P.⁴

25
- 5044231640
- Learning methods for generic object recognition with invariance to pose and lighting
- Y. LeCun, F. Huang, and L. Bottou, "Learning methods for generic object recognition with invariance to pose and lighting, " Proc. IEEE CVPR, 2004.
- (2004) Proc. IEEE CVPR
- Lecun, Y.¹ Huang, F.² Bottou, L.³

26
- 84055211743
- Acoustic modeling using deep belief networks
- A. Mohamed, G. Dahl, and G. Hinton, "Acoustic modeling using deep belief networks, " IEEE Trans. on Audio, Speech, and Language Processing, vol. 20, no. 1, pp. 14-22, 2012
- (2012) IEEE Trans. on Audio, Speech, and Language Processing , vol.20 , Issue.1 , pp. 14-22
- Mohamed, A.¹ Dahl, G.² Hinton, G.³

27
- 79959840616
- Investigation of full-sequence training of deep belief networks for speech recognition
- A. Mohamed, D. Yu, and L. Deng. "Investigation of full-sequence training of deep belief networks for speech recognition, " Interspeech, 2010
- (2010) Interspeech
- Mohamed, A.¹ Yu, D.² Deng, L.³

28
- 80051654263
- Deep belief nets using discriminative features for phone recognition
- A. Mohamed, T. Sainath, G. Dahl, B. Ramabhadran, G. Hinton, M. Picheny. "Deep belief nets using discriminative features for phone recognition, " ICASSP, 2011
- (2011) ICASSP
- Mohamed, A.¹ Sainath, T.² Dahl, G.³ Ramabhadran, B.⁴ Hinton, G.⁵ Picheny, M.⁶

29
- 84255177123
- Deep and wide: Multiple layers in automatic speech recognition
- N. Morgan. "Deep and wide: Multiple layers in automatic speech recognition, " IEEE Trans. on Audio, Speech, and Language Processing, vol. 20, no. 1, pp. 7-13, 2012
- (2012) IEEE Trans. on Audio, Speech, and Language Processing , vol.20 , Issue.1 , pp. 7-13
- Morgan, N.¹

30
- 80053437179
- Multimodal deep learning
- J. Ngiam, A. Khosla, M. Kim, J. Nam, H. Lee, and A. Ng, "Multimodal deep learning, " ICML, 2011
- (2011) ICML
- Ngiam, J.¹ Khosla, A.² Kim, M.³ Nam, J.⁴ Lee, H.⁵ Ng, A.⁶

31
- 0034047363
- Effect of speaking rate and contrastive stress on formant dynamics and vowel perception
- M. Pitermann, "Effect of speaking rate and contrastive stress on formant dynamics and vowel perception, " J. Acoust. Soc. Am., vol. 107, pp. 3425-3437, 2000
- (2000) J. Acoust. Soc. Am. , vol.107 , pp. 3425-3437
- Pitermann, M.¹

32
- 84858972572
- Making deep belief networks effective for large vocabulary continuous speech recognition
- T. Sainath, B. Kingsbury, B. Ramabhadran, P. Fousek, P. Novak, and A. Mohamed, "Making deep belief networks effective for large vocabulary continuous speech recognition ", Proc. ASRU, pp. 30-35, 2011
- (2011) Proc. ASRU , pp. 30-35
- Sainath, T.¹ Kingsbury, B.² Ramabhadran, B.³ Fousek, P.⁴ Novak, P.⁵ Mohamed, A.⁶

33
- 84878572738
- Enhancing exemplar-based posteriors for speech recognition tasks
- T. Sainath, D. Nahamoo, D. Kanevsky, B. Ramabhadran, "Enhancing exemplar-based posteriors for speech recognition tasks, " Interspeech, 2012
- (2012) Interspeech
- Sainath, T.¹ Nahamoo, D.² Kanevsky, D.³ Ramabhadran, B.⁴

34
- 84865801985
- Conversational speech transcription using context-dependent deep neural networks
- F. Seide, G. Li, and D. Yu, "Conversational speech transcription using context-dependent deep neural networks, " Interspeech, 2011
- (2011) Interspeech
- Seide, F.¹ Li, G.² Yu, D.³

35
- 84867605416
- Towards deeper understanding: Deep convex networks for semantic utterance classification
- G. Tur, L. Deng, D. Hakkani-Tur, and X. He, "Towards deeper understanding: Deep convex networks for semantic utterance classification, " ICASSP, 2012
- (2012) ICASSP
- Tur, G.¹ Deng, L.² Hakkani-Tur, D.³ He, X.⁴

36
- 84055163920
- Roles of pretraining and finetuning in context-dependent DNN-HMMs for real-world speech recognition
- D. Yu, L. Deng, and G. Dahl, "Roles of pretraining and finetuning in context-dependent DNN-HMMs for real-world speech recognition, " NIPS Workshop, 2010
- (2010) NIPS Workshop
- Yu, D.¹ Deng, L.² Dahl, G.³

37
- 84871387302
- The deep tensor neural network with applications to large vocabulary speech recognition
- Feb
- D. Yu, L. Deng, and F. Seide. "The deep tensor neural network with applications to large vocabulary speech recognition, " IEEE Trans. Audio, Speech, and Lang. Proc. vol. 21, no. 2, pp. 388-396, Feb, 2013.
- (2013) IEEE Trans. Audio, Speech, and Lang. Proc , vol.21 , Issue.2 , pp. 388-396
- Yu, D.¹ Deng, L.² Seide, F.³

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.