



Volume , Issue , 2013, Pages 6709-6713

Investigating deep neural network based transforms of robust audio features for LVCSR

Author keywords

feature extraction; Neural networks; robustness; speech recognition

Indexed keywords

AUDIO FEATURES; CEPSTRAL FEATURES; DEEP NEURAL NETWORKS; FEATURE TRANSFORM; FORMANT FREQUENCY; FREQUENCY MEASURES; LARGE VOCABULARY; RELATIVE ERROR RATES;

EID: 84890543873     PISSN: 15206149     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1109/ICASSP.2013.6638960     Document Type: Conference Paper
Times cited : (8)

References (25)
  • 1
    • 84892178050
    • Spectral subband centroid features for speech recognition
    • Seattle, WA, May
    • K. K. Paliwal, "Spectral subband centroid features for speech recognition", in Proc. ICASSP, Seattle, WA, May 1998, pp. 617-620
    • (1998) Proc. ICASSP , pp. 617-620
    • Paliwal, K.K.1
  • 4
  • 5
    • 0032289099
    • Heteroscedastic discriminant analysis and reduced rank HMMs for improved speech recognition
    • N. Kumar and G. Andreou, "Heteroscedastic discriminant analysis and reduced rank HMMs for improved speech recognition," Speech Communication, vol. 26, pp. 283-297, 1998
    • (1998) Speech Communication , vol.26 , pp. 283-297
    • Kumar, N.1    Andreou, G.2
  • 6
    • 84892187452
    • Maximum likelihood modeling with Gaussian distributions for classifications
    • R.A. Gopinath, "Maximum likelihood modeling with Gaussian distributions for classifications," in Proc. ICASSP, 1998, pp. 661-664
    • (1998) Proc. ICASSP , pp. 661-664
    • Gopinath, R.A.1
  • 9
    • 0019053271
    • Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences
    • S.B. Davis, P. Mermelstein, "Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences", IEEE Trans. on Acoustics, Speech and Signal Processing, 28(4):357-366, 1980
    • (1980) IEEE Trans. on Acoustics, Speech and Signal Processing , vol.28 , Issue.4 , pp. 357-366
    • Davis, S.B.1    Mermelstein, P.2
  • 10
    • 0025041264
    • Perceptual linear predictive (PLP) analysis of speech
    • H. Hermansky, "Perceptual linear predictive (PLP) analysis of speech," J. of Acoust. Soc. of America, vol. 87, no. 4, pp. 1738-1752, 1990
    • (1990) J. of Acoust. Soc. of America , vol.87 , Issue.4 , pp. 1738-1752
    • Hermansky, H.1
  • 11
    • 0007802346
    • Tandem connectionist feature stream extraction for conventional HMM systems
    • Istanbul, June
    • H.Hermansky, D. Ellis and S.Sharma, "Tandem connectionist feature stream extraction for conventional HMM systems.", in Proc. of ICASSP-2000, Istanbul, June 2000
    • (2000) Proc. of ICASSP-2000
    • Hermansky, H.1    Ellis, D.2    Sharma, S.3
  • 12
    • 84969139455
    • Probabilistic and bottle-neck features for ASR of meetings
    • F. Grézl, M. Karafiát, S. Kontár and J. Černocký, "Probabilistic and bottle-neck features for ASR of meetings", in Proc. of ICASSP-2007, pp 757-760
    • (2007) Proc. of ICASSP , pp. 757-760
    • Grézl, F.1    Karafiát, M.2    Kontár, S.3    Černocký, J.4
  • 13
    • 84867209138
    • Transcribing broadcast data using MLP features
    • P.Fousek, L.Lamel and J.Gauvain, "Transcribing broadcast data using MLP features", in Proc. Interspeech 2008, pp 1433-1436
    • (2008) Proc. Interspeech , pp. 1433-1436
    • Fousek, P.1    Lamel, L.2    Gauvain, J.3
  • 14
    • 84865785753
    • Improved bottleneck features using pretrained deep neural networks
    • D. Yu and M.L. Seltzer, "Improved Bottleneck Features Using Pretrained Deep Neural Networks", in Proc. Interspeech 2011, pp 237-240
    • (2011) Proc. Interspeech , pp. 237-240
    • Yu, D.1    Seltzer, M.L.2
  • 18
    • 84904136037
    • Large-scale machine learning with stochastic gradient descent
    • Edited by Yves Lechevallier and Gilbert Saporta Paris, France, August, Springer
    • L.Bottou, "Large-Scale Machine Learning with Stochastic Gradient Descent", Proceedings of the 19th International Conference on Computational Statistics (COMPSTAT'2010), 177-187, Edited by Yves Lechevallier and Gilbert Saporta, Paris, France, August 2010, Springer
    • (2010) Proceedings of the 19th International Conference on Computational Statistics (COMPSTAT'2010) , pp. 177-187
    • Bottou, L.1
  • 22
    • 0030008906
    • Speech formant frequency and bandwidth tracking using multiband energy demodulation
    • Jun
    • A. Potamianos and P. Maragos, "Speech formant frequency and bandwidth tracking using multiband energy demodulation," J. Acoust. Soc. Amer., vol. 99, no. 6, pp. 3795-3806, Jun. 1996
    • (1996) J. Acoust. Soc. Amer. , vol.99 , Issue.6 , pp. 3795-3806
    • Potamianos, A.1    Maragos, P.2
  • 23
    • 80051615494
    • Speech recognition modeling advances for mobile voice search
    • E. Bocchieri, D. Caseiro and D. Dimitriadis, "Speech recognition modeling advances for mobile voice search," in Proc. ICASSP, 2011, pp 4888-4891
    • (2011) Proc. ICASSP , pp. 4888-4891
    • Bocchieri, E.1    Caseiro, D.2    Dimitriadis, D.3
  • 24
  • 25
    • 33744998457
    • Continuous energy demodulation methods and application to speech analysis
    • July
    • D. Dimitriadis and P. Maragos, "Continuous Energy Demodulation Methods and Application to Speech Analysis", Speech Communication, vol.48, no.7, pp.819-837, July 2006.
    • (2006) Speech Communication , vol.48 , Issue.7 , pp. 819-837
    • Dimitriadis, D.1    Maragos, P.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.