



Volume , Issue , 2013, Pages 6669-6673

A deep convolutional neural network using heterogeneous pooling for trading acoustic invariance with phonetic confusion

Author keywords

convolution; deep; discrimination; formants; heterogeneous pooling; invariance; neural network

Indexed keywords

CONVOLUTIONAL NEURAL NETWORK; DEEP; DISCRIMINATION; EXPERIMENTAL EVALUATION; FORMANTS; HETEROGENEOUS POOLING; LARGE VOCABULARY SPEECH RECOGNITION; PHONETIC RECOGNITION;

EID: 84890545163     PISSN: 15206149     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1109/ICASSP.2013.6638952     Document Type: Conference Paper
Times cited : (171)

References (37)
  • 1
    • 84867605836
    • Applying convolutional neural networks concepts to hybrid NN-HMM model for speech recognition
    • O. Abdel-Hamid, A. Mohamed, H. Jiang, and G. Penn, "Applying convolutional neural networks concepts to hybrid NN-HMM model for speech recognition," ICASSP, 2012
    • (2012) ICASSP
    • Abdel-Hamid, O.1    Mohamed, A.2    Jiang, H.3    Penn, G.4
  • 2
    • 84878397276
    • Pipelined back-propagation for context-dependent deep neural networks
    • X. Chen, A. Eversole, G. Li, D. Yu, and F. Seide, "Pipelined back-propagation for context-dependent deep neural networks," Interspeech, 2012
    • (2012) Interspeech
    • Chen, X.1    Eversole, A.2    Li, G.3    Yu, D.4    Seide, F.5
  • 3
    • 80051616844
    • Large vocabulary continuous speech recognition with context-dependent DBN-HMMs
    • G. Dahl, D. Yu, and L. Deng, "Large vocabulary continuous speech recognition with context-dependent DBN-HMMs," ICASSP, 2011
    • (2011) ICASSP
    • Dahl, G.1    Yu, D.2    Deng, L.3
  • 4
    • 84055222005
    • Context-dependent pre-trained deep neural networks for large vocabulary speech recognition
    • G. Dahl, D. Yu, L. Deng, and A. Acero, "Context-dependent pre-trained deep neural networks for large vocabulary speech recognition," IEEE Trans. Speech and Audio Proc., vol. 20, no. 1, pp. 30-42, 2012
    • (2012) IEEE Trans. Speech and Audio Proc. , vol.20 , Issue.1 , pp. 30-42
    • Dahl, G.1    Yu, D.2    Deng, L.3    Acero, A.4
  • 7
    • 84867614591
    • Scalable stacking and learning for building deep architectures
    • L. Deng, D. Yu, and J. Platt, "Scalable stacking and learning for building deep architectures," ICASSP, 2012
    • (2012) ICASSP
    • Deng, L.1    Yu, D.2    Platt, J.3
  • 9
    • 84890534540
    • Use of kernel deep convex networks and end-to-end learning for spoken language understanding
    • L. Deng, G. Tur, X. He, and D. Hakkani-Tur, "Use of kernel deep convex networks and end-to-end learning for spoken language understanding," IEEE SLT, 2012
    • (2012) IEEE SLT
    • Deng, L.1    Tur, G.2    He, X.3    Hakkani-Tur, D.4
  • 10
    • 0033623527
    • Spontaneous speech recognition using a statistical coarticulatory model for the vocal tract resonance dynamics
    • L. Deng and J. Ma, "Spontaneous speech recognition using a statistical coarticulatory model for the vocal tract resonance dynamics," J. Acoust. Soc. Am., vol. 108, pp. 3036-3048, 2000
    • (2000) J. Acoust. Soc. Am. , vol.108 , pp. 3036-3048
    • Deng, L.1    Ma, J.2
  • 11
    • 33744966561
    • A bidirectional target filtering model of speech coarticulation: Two-stage implementation for phonetic recognition
    • L. Deng, D. Yu, and A. Acero, "A bidirectional target filtering model of speech coarticulation: Two-stage implementation for phonetic recognition," IEEE Transactions on Audio and Speech Processing, vol. 14, pp. 256-265, 2006
    • (2006) IEEE Transactions on Audio and Speech Processing , vol.14 , pp. 256-265
    • Deng, L.1    Yu, D.2    Acero, A.3
  • 13
    • 34547551709
    • Use of differential cepstra as acoustic features in hidden trajectory modeling for phonetic recognition
    • L. Deng and D. Yu, "Use of differential cepstra as acoustic features in hidden trajectory modeling for phonetic recognition," ICASSP, 2007
    • (2007) ICASSP
    • Deng, L.1    Yu, D.2
  • 15
    • 84890468916
    • Deep learning for speech recognition and related applications
    • L. Deng, D. Yu, and G. Hinton, "Deep Learning for Speech Recognition and Related Applications," NIPS Workshop, 2009, http://nips.cc/Conferences/2009/Program/event.php?ID=1512
    • (2009) NIPS Workshop
    • Deng, L.1    Yu, D.2    Hinton, G.3
  • 16
    • 84890526837
    • New types of deep neural network learning for speech recognition and related applications: An overview
    • L. Deng, G. Hinton, and B. Kingsbury, "New types of deep neural network learning for speech recognition and related applications: An overview," ICASSP, 2013
    • (2013) ICASSP
    • Deng, L.1    Hinton, G.2    Kingsbury, B.3
  • 20
    • 84878539964
    • Application of pretrained deep neural networks to large vocabulary speech recognition
    • N. Jaitly, P. Nguyen, and V. Vanhoucke, "Application of pretrained deep neural networks to large vocabulary speech recognition," Interspeech, 2012
    • (2012) Interspeech
    • Jaitly, N.1    Nguyen, P.2    Vanhoucke, V.3
  • 21
    • 84878379108
    • Scalable minimum Bayes risk training of deep neural network acoustic models using distributed Hessian-free optimization
    • B. Kingsbury, T. N. Sainath, and H. Soltau, "Scalable minimum Bayes risk training of deep neural network acoustic models using distributed Hessian-free optimization," Interspeech, 2012
    • (2012) Interspeech
    • Kingsbury, B.1    Sainath, T.N.2    Soltau, H.3
  • 22
    • 84876231242
    • ImageNet classification with deep convolutional neural networks
    • A. Krizhevsky, I. Sutskever, and G. Hinton, "ImageNet classification with deep convolutional neural networks," NIPS, 2012
    • (2012) NIPS
    • Krizhevsky, A.1    Sutskever, I.2    Hinton, G.3
  • 24
  • 25
    • 5044231640
    • Learning methods for generic object recognition with invariance to pose and lighting
    • Y. LeCun, F. Huang, and L. Bottou, "Learning methods for generic object recognition with invariance to pose and lighting," Proc. IEEE CVPR, 2004
    • (2004) Proc. IEEE CVPR
    • Lecun, Y.1    Huang, F.2    Bottou, L.3
  • 27
    • 79959840616
    • Investigation of full-sequence training of deep belief networks for speech recognition
    • A. Mohamed, D. Yu, and L. Deng, "Investigation of full-sequence training of deep belief networks for speech recognition," Interspeech, 2010
    • (2010) Interspeech
    • Mohamed, A.1    Yu, D.2    Deng, L.3
  • 29
    • 84255177123
    • Deep and wide: Multiple layers in automatic speech recognition
    • N. Morgan, "Deep and wide: Multiple layers in automatic speech recognition," IEEE Trans. on Audio, Speech, and Language Processing, vol. 20, no. 1, pp. 7-13, 2012
    • (2012) IEEE Trans. on Audio, Speech, and Language Processing , vol.20 , Issue.1 , pp. 7-13
    • Morgan, N.1
  • 31
    • 0034047363
    • Effect of speaking rate and contrastive stress on formant dynamics and vowel perception
    • M. Pitermann, "Effect of speaking rate and contrastive stress on formant dynamics and vowel perception," J. Acoust. Soc. Am., vol. 107, pp. 3425-3437, 2000
    • (2000) J. Acoust. Soc. Am. , vol.107 , pp. 3425-3437
    • Pitermann, M.1
  • 32
    • 84858972572
    • Making deep belief networks effective for large vocabulary continuous speech recognition
    • T. Sainath, B. Kingsbury, B. Ramabhadran, P. Fousek, P. Novak, and A. Mohamed, "Making deep belief networks effective for large vocabulary continuous speech recognition," Proc. ASRU, pp. 30-35, 2011
    • (2011) Proc. ASRU , pp. 30-35
    • Sainath, T.1    Kingsbury, B.2    Ramabhadran, B.3    Fousek, P.4    Novak, P.5    Mohamed, A.6
  • 34
    • 84865801985
    • Conversational speech transcription using context-dependent deep neural networks
    • F. Seide, G. Li, and D. Yu, "Conversational speech transcription using context-dependent deep neural networks," Interspeech, 2011
    • (2011) Interspeech
    • Seide, F.1    Li, G.2    Yu, D.3
  • 35
    • 84867605416
    • Towards deeper understanding: Deep convex networks for semantic utterance classification
    • G. Tur, L. Deng, D. Hakkani-Tur, and X. He, "Towards deeper understanding: Deep convex networks for semantic utterance classification," ICASSP, 2012
    • (2012) ICASSP
    • Tur, G.1    Deng, L.2    Hakkani-Tur, D.3    He, X.4
  • 36
    • 84055163920
    • Roles of pretraining and finetuning in context-dependent DNN-HMMs for real-world speech recognition
    • D. Yu, L. Deng, and G. Dahl, "Roles of pretraining and finetuning in context-dependent DNN-HMMs for real-world speech recognition," NIPS Workshop, 2010
    • (2010) NIPS Workshop
    • Yu, D.1    Deng, L.2    Dahl, G.3
  • 37
    • 84871387302
    • The deep tensor neural network with applications to large vocabulary speech recognition
    • D. Yu, L. Deng, and F. Seide, "The deep tensor neural network with applications to large vocabulary speech recognition," IEEE Trans. Audio, Speech, and Lang. Proc., vol. 21, no. 2, pp. 388-396, Feb. 2013
    • (2013) IEEE Trans. Audio, Speech, and Lang. Proc , vol.21 , Issue.2 , pp. 388-396
    • Yu, D.1    Deng, L.2    Seide, F.3


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.