SCOPUS Information Retrieval Platform

ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings

Volume , Issue , 2013, Pages 6709-6713

Investigating deep neural network based transforms of robust audio features for LVCSR

(2) Bocchieri, Enrico a Dimitriadis, Dimitrios a

a AT&T Labs Research (United States)

Author keywords

feature extraction; Neural networks; robustness; speech recognition

Indexed keywords

AUDIO FEATURES; CEPSTRAL FEATURES; DEEP NEURAL NETWORKS; FEATURE TRANSFORM; FORMANT FREQUENCY; FREQUENCY MEASURES; LARGE VOCABULARY; RELATIVE ERROR RATES;

FEATURE EXTRACTION; MODULATION; NEURAL NETWORKS; ROBUSTNESS (CONTROL SYSTEMS); SPEECH RECOGNITION;

MATHEMATICAL TRANSFORMATIONS;

EID: 84890543873 PISSN: 15206149 EISSN: None Source Type: Conference Proceeding
DOI: 10.1109/ICASSP.2013.6638960 Document Type: Conference Paper

Times cited : (8)

References (25)

1
- 84892178050
- Spectral subband centroid features for speech recognition
- Seattle, WA, May
- K. K. Paliwal, "Spectral subband centroid features for speech recognition", in Proc. ICASSP, Seattle, WA, May 1998, pp. 617-620
- (1998) Proc. ICASSP , pp. 617-620
- Paliwal, K.K.¹

2
- 42549139762
- MVA processing of speech features
- Jan
- Chia-Ping Chen and J. A. Bilmes, "MVA Processing of Speech Features" IEEE Trans. On Audio, Speech and Lang. Proc. Vol. 15, No. 1, Jan. 2007
- (2007) IEEE Trans. on Audio, Speech and Lang. Proc , vol.15 , Issue.1
- Chen, C.-P.¹ Bilmes, J.A.²

3
- 0028517164
- RASTA processing of speech
- Oct
- H. Hermansky and N. Morgan, "RASTA Processing of Speech", IEEE Trans. On Speech and Audio Processing, Vol. 2, No. 4, Oct. 1994
- (1994) IEEE Trans. on Speech and Audio Processing , vol.2 , Issue.4
- Hermansky, H.¹ Morgan, N.²

4
- 0033677121
- Maximum likelihood discriminant feature spaces
- G. Saon, M. Padmanabhan, R. Gopinath, and S.Chen, "Maximum likelihood discriminant feature spaces," in Proc. ICASSP-2000, pp. 1129-1132
- (2000) Proc. ICASSP , pp. 1129-1132
- Saon, G.¹ Padmanabhan, M.² Gopinath, R.³ Chen, S.⁴

5
- 0032289099
- Heteroscedastic discriminant analysis and reduced rank HMMs for improved speech recognition
- N. Kumar and G. Andreou, "Heteroscedastic discriminant analysis and reduced rank HMMs for improved speech recognition," Speech Communication, vol. 26, pp. 283-297, 1998
- (1998) Speech Communication , vol.26 , pp. 283-297
- Kumar, N.¹ Andreou, G.²

6
- 84892187452
- Maximum likelihood modeling with Gaussian distributions for classifications
- R.A. Gopinath, "Maximum likelihood modeling with Gaussian distributions for classifications," in Proc. ICASSP, 1998, pp. 661-664
- (1998) Proc. ICASSP , pp. 661-664
- Gopinath, R.A.¹

7
- 0003454539
- Tech. Rep. CUED/FINFENG/TR291, Cambridge Univ
- M.J.F. Gales, "Maximum likelihood linear transformations for HMM-based speech recognition," Tech. Rep. CUED/FINFENG/TR291, Cambridge Univ., 1997
- (1997) Maximum Likelihood Linear Transformations for HMM-based Speech Recognition
- Gales, M.J.F.¹

8
- 85009271609
- Towards automatic closed captioning: Low latency real time broadcast news transcriptions
- Sep
- M. Saraclar, M. Riley, E. Bocchieri, and V. Goffin, "Towards automatic closed captioning: low latency real time broadcast news transcriptions", in Proc. International Conference on Spoken Language Processing (ICSLP), Sep. 2002, pp. 1741-1744
- (2002) Proc. International Conference on Spoken Language Processing (ICSLP) , pp. 1741-1744
- Saraclar, M.¹ Riley, M.² Bocchieri, E.³ Goffin, V.⁴

9
- 0019053271
- Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences
- S.B. Davis, P. Mermelstein, "Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences", IEEE Trans. on Acoustic, Speech and Signal Processing, 28(4):357-366, 1980
- (1980) IEEE Trans. on Acoustic, Speech and Signal Processing , vol.28 , Issue.4 , pp. 357-366
- Davis, S.B.¹ Mermelstein, P.²

10
- 0025041264
- Perceptual linear predictive (PLP) analysis of speech
- H. Hermansky, "Perceptual linear predictive (PLP) analysis of speech," J. of Acoust. Soc. of America, vol. 87, no. 4, pp. 1738-1752, 1990
- (1990) J. of Acoust. Soc. of America , vol.87 , Issue.4 , pp. 1738-1752
- Hermansky, H.¹

11
- 0007802346
- Tandem connectionist feature stream extraction for conventional HMM systems
- Istanbul, June
- H.Hermansky, D. Ellis and S.Sharma, "Tandem connectionist feature stream extraction for conventional HMM systems.", in Proc. of ICASSP-2000, Istanbul, June 2000
- (2000) Proc. of ICASSP-2000
- Hermansky, H.¹ Ellis, D.² Sharma, S.³

12
- 84969139455
- Probabilistic and bottle-neck features for ASR of meetings
- F. Grézl, M. Karafiát, S. Kontár and J. Cernocky, "Probabilistic and bottle-neck features for ASR of meetings", in Proc. of ICASSP-2007, pp 757-760
- (2007) Proc. of ICASSP , pp. 757-760
- Grézl, F.¹ Karafiát, M.² Kontár, S.³ Cernocky, J.⁴

13
- 84867209138
- Transcribing broadcast data using MLP features
- P.Fousek, L.Lamel and J.Gauvain, "Transcribing broadcast data using MLP features", in Proc. Interspeech 2008, pp 1433-1436
- (2008) Proc. Interspeech , pp. 1433-1436
- Fousek, P.¹ Lamel, L.² Gauvain, J.³

14
- 84865785753
- Improved bottleneck features using pretrained deep neural networks
- D. Yu and M.L. Seltzer, "Improved Bottleneck Features Using Pretrained Deep Neural Networks", in Proc. Interspeech 2011, pp 237-240
- (2011) Proc. Interspeech , pp. 237-240
- Yu, D.¹ Seltzer, M.L.²

15
- 84867593213
- Auto-encoder bottleneck features using deep belief networks
- T.N. Sainath, B. Kingsbury and B. Ramabhadran, "Auto-encoder bottleneck features using deep belief networks", in Proc. of ICASSP-2012
- (2012) Proc. of ICASSP
- Sainath, T.N.¹ Kingsbury, B.² Ramabhadran, B.³

16
- 84878403164
- Context-dependent MLPs for LVCSR: Tandem, hybrid or both?
- Portland, Oregon
- Z.Tüske, R.Schlüter, H. Ney, M.Sundermeyer, "Context-Dependent MLPs for LVCSR: TANDEM, Hybrid or Both?," Proc. INTERSPEECH 2012, Portland, Oregon
- (2012) Proc. INTERSPEECH
- Tüske, Z.¹ Schlüter, R.² Ney, H.³ Sundermeyer, M.⁴

17
- 84867119104
- last revised 22 Dec 2011, arxiv.org/abs/1107.2490
- W. Xu, "Towards Optimal One Pass Large Scale Learning with Averaged Stochastic Gradient Descent", last revised 22 Dec 2011, arxiv.org/abs/1107.2490
- Towards Optimal One Pass Large Scale Learning with Averaged Stochastic Gradient Descent
- Xu, W.¹

18
- 84904136037
- Large-scale machine learning with stochastic gradient descent
- Edited by Yves Lechevallier and Gilbert Saporta Paris, France, August, Springer
- L.Bottou, "Large-Scale Machine Learning with Stochastic Gradient Descent", Proceedings of the 19th International Conference on Computational Statistics (COMPSTAT'2010), 177-187, Edited by Yves Lechevallier and Gilbert Saporta, Paris, France, August 2010, Springer
- (2010) Proceedings of the 19th International Conference on Computational Statistics (COMPSTAT'2010) , pp. 177-187
- Bottou, L.¹

19
- 84943274699
- A direct adaptive method for faster backpropagation learning: The RPROP algorithm
- M. Riedmiller and H. Braun, "A direct adaptive method for faster backpropagation learning: the RPROP algorithm", in Proc. IEEE International Conference on Neural Networks, 1993, pp 586-591
- (1993) Proc. IEEE International Conference on Neural Networks , pp. 586-591
- Riedmiller, M.¹ Braun, H.²

20
- 0003126173
- Improving the Rprop learning algorithm
- C. Igel and M. Hüsken, "Improving the Rprop Learning Algorithm", in Proc. Of The Second International Symposium On Neural Computation (NC2000)
- Proc. of the Second International Symposium on Neural Computation (NC2000)
- Igel, C.¹ Hüsken, M.²

21
- 27644455860
- Robust AMFM features for speech recognition
- Sept
- D. Dimitriadis, P. Maragos, and A. Potamianos "Robust AMFM Features for Speech Recognition", IEEE Signal Processing Letters, Vol. 12, No. 9, Sept. 2005
- (2005) IEEE Signal Processing Letters , vol.12 , Issue.9
- Dimitriadis, D.¹ Maragos, P.² Potamianos, A.³

22
- 0030008906
- Speech formant frequency and bandwidth tracking using multiband energy demodulation
- Jun
- A. Potamianos and P. Maragos, "Speech formant frequency and bandwidth tracking using multiband energy demodulation," J. Acoust. Soc.Amer., vol. 99, no. 6, pp. 3795-3806, Jun. 1996
- (1996) J. Acoust. Soc.Amer. , vol.99 , Issue.6 , pp. 3795-3806
- Potamianos, A.¹ Maragos, P.²

23
- 80051615494
- Speech recognition modeling advances for mobile voice search
- E. Bocchieri, D. Caseiro and D. Dimitriadis, "Speech recognition modeling advances for mobile voice search," in Proc. ICASSP, 2011, pp 4888-4891
- (2011) Proc. ICASSP , pp. 4888-4891
- Bocchieri, E.¹ Caseiro, D.² Dimitriadis, D.³

24
- 80051650297
- An alternative front-end for the AT&T WATSON LV-CSR system
- D.Dimitriadis, E.Bocchieri and D.Caseiro, "An alternative front-end for the AT&T WATSON LV-CSR system", in Proc. ICASSP, 2011
- (2011) Proc. ICASSP
- Dimitriadis, D.¹ Bocchieri, E.² Caseiro, D.³

25
- 33744998457
- Continuous energy demodulation methods and application to speech analysis
- July
- D. Dimitriadis and P. Maragos, "Continuous Energy Demodulation Methods and Application to Speech Analysis", Speech Communication, vol.48, no.7, pp.819-837, July 2006.
- (2006) Speech Communication , vol.48 , Issue.7 , pp. 819-837
- Dimitriadis, D.¹ Maragos, P.²

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.