SCOPUS Information Retrieval Platform

Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH

Volume 2015-January, 2015, Pages 1888-1892

Time-frequency kernel-based CNN for speech recognition

Authors (3): Zhao, Tuo a; Zhao, Yunxin a; Chen, Xin b

a University of Missouri (United States)

b Pearson Knowledge Technologies (United States)

Author keywords

Convolutional neural network; Robust speech recognition; Time frequency kernels

Indexed keywords

CONVOLUTION; FREQUENCY DOMAIN ANALYSIS; NEURAL NETWORKS; SPEECH; SPEECH COMMUNICATION; TELEPHONE SETS;

CONVOLUTIONAL NEURAL NETWORK; DIFFERENT TREATMENTS; FREQUENCY DOMAINS; KERNEL APPROACHES; PHONE RECOGNITION; ROBUST SPEECH RECOGNITION; SPEAKER INDEPENDENTS; TIME FREQUENCY KERNELS;

SPEECH RECOGNITION;

EID: 84959087712 PISSN: 2308457X EISSN: 19909772 Source Type: Conference Proceeding
DOI: None Document Type: Conference Paper

Times cited : (12)

References (20)

1
- 84055211743
- Acoustic modeling using deep belief networks
- Mohamed, A., Dahl, G. E. and Hinton, G., "Acoustic Modeling Using Deep Belief Networks", IEEE Trans. Audio, Speech, Lang. Proc., 20 (1): 14-22, 2012.
- (2012) IEEE Trans. Audio, Speech, Lang. Proc. , vol.20 , Issue.1 , pp. 14-22
- Mohamed, A.¹ Dahl, G.E.² Hinton, G.³

2
- 84890478854
- Multiframe deep neural networks for acoustic modeling
- Vanhoucke, V., Devin, M. and Heigold, G., "Multiframe deep neural networks for acoustic modeling", in Proc. ICASSP, 7582-7585, 2013.
- (2013) Proc. ICASSP , pp. 7582-7585
- Vanhoucke, V.¹ Devin, M.² Heigold, G.³

3
- 84892184434
- Perceptual processing of speech and other perceptual patterns: Some similarities and differences
- Warren, R. M., "Perceptual processing of speech and other perceptual patterns: Some similarities and differences", in Greenberg S. and Ainsworth, W., Ed. Listening to Speech: An Auditory Perspective, Oxford University Press, 1998.
- (1998) Listening to Speech: An Auditory Perspective
- Warren, R.M.¹

4
- 84911473441
- Convolutional neural networks for speech recognition
- Abdel-Hamid, O., Mohamed, A., Jiang, H., Deng, L., Penn, G. and Yu, D., "Convolutional Neural Networks for Speech Recognition", IEEE/ACM Trans. Audio, Speech, and Lang. Proc., 22 (10): 1533-1545, 2014.
- (2014) IEEE/ACM Trans. Audio, Speech, and Lang. Proc. , vol.22 , Issue.10 , pp. 1533-1545
- Abdel-Hamid, O.¹ Mohamed, A.² Jiang, H.³ Deng, L.⁴ Penn, G.⁵ Yu, D.⁶

5
- 84890525984
- Deep convolutional neural networks for LVCSR
- T. N. Sainath, A. Mohamed, B. Kingsbury and B. Ramabhadran, "Deep convolutional neural networks for LVCSR," in Proc. ICASSP, May 2013.
- (2013) Proc. ICASSP
- Sainath, T.N.¹ Mohamed, A.² Kingsbury, B.³ Ramabhadran, B.⁴

6
- 84893654379
- Improvements to deep convolutional neural networks for LVCSR
- Sainath, T., Kingsbury, B., Mohamed, A. and Ramabhadran, B., "Improvements to deep convolutional neural networks for LVCSR", in Proc. ASRU, 2013.
- (2013) Proc. ASRU
- Sainath, T.¹ Kingsbury, B.² Mohamed, A.³ Ramabhadran, B.⁴

7
- 84906214784
- Exploring convolutional neural network structures and optimization techniques for speech recognition
- Abdel-Hamid, O., Deng, L. and Yu, D., "Exploring convolutional neural network structures and optimization techniques for speech recognition", in Proc. Interspeech, 3366-3370, 2013.
- (2013) Proc. Interspeech , pp. 3366-3370
- Abdel-Hamid, O.¹ Deng, L.² Yu, D.³

8
- 84906276981
- Convolutional deep rectifier neural nets for phone recognition
- Toth, L., "Convolutional deep rectifier neural nets for phone recognition", in Proc. Interspeech, 1722-1726, 2013.
- (2013) Proc. Interspeech , pp. 1722-1726
- Toth, L.¹

9
- 84858971297
- Convolutive Bottleneck Network features for LVCSR
- Vesely, K., Karafiat, M. and Grezl, F., "Convolutive Bottleneck Network features for LVCSR," IEEE Workshop on ASRU, 2011.
- (2011) IEEE Workshop on ASRU
- Vesely, K.¹ Karafiat, M.² Grezl, F.³

10
- 84905252069
- Combining time- and frequency-domain convolution in convolutional neural network-based phone recognition
- Toth, L., "Combining time- and frequency-domain convolution in convolutional neural network-based phone recognition", in Proc. ICASSP, 190-194, 2014.
- (2014) Proc. ICASSP , pp. 190-194
- Toth, L.¹

11
- 0028516073
- How do humans process and recognise speech
- Allen, J., "How Do Humans Process and Recognise Speech", IEEE Trans. Speech and Audio Proc., 2 (4): 567-577, 1994.
- (1994) IEEE Trans. Speech and Audio Proc. , vol.2 , Issue.4 , pp. 567-577
- Allen, J.¹

12
- 84892186467
- Incorporating information from syllable-length time scales into automatic speech recognition
- Wu S., Kingsbury B., Mongan N. and Greenberg S., "Incorporating information from syllable-length time scales into automatic speech recognition", in Proc. ICASSP, 721-724, 1998.
- (1998) Proc. ICASSP , pp. 721-724
- Wu, S.¹ Kingsbury, B.² Mongan, N.³ Greenberg, S.⁴

13
- 0031643048
- Multi-resolution cepstral features for phoneme recognition across speech sub-bands
- McCourt, P., Vaseghi, S. and Harte, N., "Multi-resolution cepstral features for phoneme recognition across speech sub-bands", in Proc. ICASSP, 557-560, 1998.
- (1998) Proc. ICASSP , pp. 557-560
- McCourt, P.¹ Vaseghi, S.² Harte, N.³

14
- 84959136712
- "The Computational Network Toolkit (CNTK)", Microsoft Corporation, Redmond, WA, USA. Online: https://cntk.codeplex.com/SourceControl/latest, accessed on 04 Mar. 2015.
- (2015) The Computational Network Toolkit (CNTK)

15
- 84976206655
- https://catalog.ldc.upenn.edu/docs/LDC96S32/FFMTIMIT.TXT

16
- 0002263996
- Convolutional networks for images, speech and time series
- LeCun, Y. and Bengio Y., "Convolutional networks for images, speech and time series", in Arbib, M. A., Ed., The Handbook of Brain Theory and Neural Networks, MIT Press, 255-258, 1995.
- (1995) The Handbook of Brain Theory and Neural Networks
- LeCun, Y.¹ Bengio, Y.²

17
- 84867605836
- Applying convolutional neural network concepts to hybrid NN-HMM models for speech recognition
- Abdel-Hamid, O., Mohamed, A., Jiang, H. and Penn, G., "Applying convolutional neural network concepts to hybrid NN-HMM models for speech recognition", in Proc. ICASSP, 4277-4280, 2012.
- (2012) Proc. ICASSP , pp. 4277-4280
- Abdel-Hamid, O.¹ Mohamed, A.² Jiang, H.³ Penn, G.⁴

18
- 0024768209
- Speaker-independent phone recognition using hidden markov models
- Lee, K. and Hon, H., "Speaker-Independent Phone Recognition Using Hidden Markov Models", IEEE Trans. Audio, Speech, Signal Proc. 37 (11): 1641-1648, 1989.
- (1989) IEEE Trans. Audio, Speech, Signal Proc. , vol.37 , Issue.11 , pp. 1641-1648
- Lee, K.¹ Hon, H.²

19
- 84959111923
- "The Hidden Markov Model Toolkit (HTK)", CUED Machine Intelligence Lab, Cambridge, UK. Online: http://htk.eng.cam.ac.uk/ftp/software/HTK-3.4.1.tar.gz, accessed on 28 Jun. 2013.
- (2013) The Hidden Markov Model Toolkit (HTK)

20
- 78049271850
- Parallel training of neural networks for speech recognition
- Vesely, K., Burget, L. and Grezl, F., "Parallel training of neural networks for speech recognition", in Proc. International Conf. Text, Speech and Dialog, 439-446, 2010.
- (2010) Proc. International Conf. Text, Speech and Dialog , pp. 439-446
- Vesely, K.¹ Burget, L.² Grezl, F.³

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.