2012, Pages 131-136

Improving wideband speech recognition using mixed-bandwidth training data in CD-DNN-HMM

Author keywords

CD DNN HMM; deep neural network; log filter bank; mixed bandwidth; narrowband; wideband

Indexed keywords

CD-DNN-HMM; DEEP NEURAL NETWORKS; MIXED-BANDWIDTH; NARROW BANDS; WIDE-BAND

EID: 84874282188     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1109/SLT.2012.6424210     Document Type: Conference Paper
Times cited: 153

References (22)
  • 2
    • 84055222005
    • Context-dependent pre-trained deep neural networks for large vocabulary speech recognition
    • G. Dahl, D. Yu, L. Deng, and A. Acero, "Context-dependent pre-trained deep neural networks for large vocabulary speech recognition," IEEE Trans. Speech and Audio Proc., vol. 20, no. 1, pp. 30-42, 2012.
    • (2012) IEEE Trans. Speech and Audio Proc , vol.20 , Issue.1 , pp. 30-42
    • Dahl, G.1    Yu, D.2    Deng, L.3    Acero, A.4
  • 3
    • 84865801985
    • Conversational speech transcription using context-dependent deep neural networks
    • F. Seide, G. Li, and D. Yu, "Conversational speech transcription using context-dependent deep neural networks," in Proc. Interspeech, 2011.
    • (2011) Proc. Interspeech
    • Seide, F.1    Li, G.2    Yu, D.3
  • 5
    • 84999742323
    • An application of pretrained deep neural networks to large vocabulary conversational speech recognition
    • Department of Computer Science, University of Toronto
    • N. Jaitly, P. Nguyen, A. Senior, and V. Vanhoucke, "An application of pretrained deep neural networks to large vocabulary conversational speech recognition," Tech. Rep. 001, Department of Computer Science, University of Toronto, 2012.
    • (2012) Tech. Rep 001
    • Jaitly, N.1    Nguyen, P.2    Senior, A.3    Vanhoucke, V.4
  • 6
    • 84867754964
    • Improvements in using deep belief networks for large vocabulary continuous speech recognition
    • Speech and Language Algorithm Group, IBM, February 2011
    • T. N. Sainath, B. Kingsbury, and B. Ramabhadran, "Improvements in using deep belief networks for large vocabulary continuous speech recognition," Tech. Rep. UTML TR 2010-003, Speech and Language Algorithm Group, IBM, February 2011
    • Tech. Rep. UTML TR 2010-003
    • Sainath, T.N.1    Kingsbury, B.2    Ramabhadran, B.3
  • 8
    • 44049108531
    • Automated directory assistance system-from theory to practice
    • D. Yu, Y. C. Ju, Y. Y. Wang, G. Zweig, and A. Acero, "Automated directory assistance system-from theory to practice," in Proc. Interspeech, 2007, pp. 2709-2711.
    • (2007) Proc. Interspeech , pp. 2709-2711
    • Yu, D.1    Ju, Y.C.2    Wang, Y.Y.3    Zweig, G.4    Acero, A.5
  • 9
    • 85079086476
    • Sources of degradation of speech recognition in the telephone network
    • P. Moreno and R. M. Stern, "Sources of degradation of speech recognition in the telephone network," in Proc. ICASSP, Adelaide, Australia, vol. I, pp.109-112, Apr. 1994.
    • (1994) Proc. ICASSP , vol.1 , pp. 109-112
    • Moreno, P.1    Stern, R.M.2
  • 11
    • 64149084747
    • Training wideband acoustic models using mixed-bandwidth training data for speech recognition
    • M. L. Seltzer and A. Acero, "Training wideband acoustic models using mixed-bandwidth training data for speech recognition", IEEE Transactions on Audio, Speech, and Language Processing, vol. 15, no. 1, pp. 235-245, 2007.
    • (2007) IEEE Transactions on Audio, Speech, and Language Processing , vol.15 , Issue.1 , pp. 235-245
    • Seltzer, M.L.1    Acero, A.2
  • 12
    • 33745199156
    • Robust bandwidth extension of noise-corrupted narrowband speech
    • M. L. Seltzer, A. Acero, and J. Droppo, "Robust bandwidth extension of noise-corrupted narrowband speech," in Proc. Interspeech, pp. 1509-1512, 2005.
    • (2005) Proc. Interspeech , pp. 1509-1512
    • Seltzer, M.L.1    Acero, A.2    Droppo, J.3
  • 13
    • 0028517647
    • Statistical recovery of wideband speech from narrowband speech
    • Y. M. Cheng, D. O'Shaughnessy, and P. Mermelstein, "Statistical recovery of wideband speech from narrowband speech," IEEE Trans. Speech Audio Process., vol. 2, no. 4, pp. 544-548, Oct. 1994.
    • (1994) IEEE Trans. Speech Audio Process , vol.2 , Issue.4 , pp. 544-548
    • Cheng, Y.M.1    O'Shaughnessy, D.2    Mermelstein, P.3
  • 14
    • 0033692729
    • Narrowband to wideband conversion of speech using GMM based transformation
    • K.-Y. Park and H. S. Kim, "Narrowband to wideband conversion of speech using GMM based transformation," in Proc. ICASSP, Istanbul, Turkey, Jun. 2000, vol. 3, pp. 1843-1846.
    • (2000) Proc. ICASSP , vol.3 , pp. 1843-1846
    • Park, K.-Y.1    Kim, H.S.2
  • 15
    • 84951992170
    • Wideband extension of telephone speech using a hidden Markov model
    • P. Jax and P. Vary, "Wideband extension of telephone speech using a hidden Markov model," in IEEE Workshop on Speech Coding, Delavan, WI, Sep. 2000, pp. 133-135.
    • (2000) IEEE Workshop on Speech Coding , pp. 133-135
    • Jax, P.1    Vary, P.2
  • 17
    • 84867585919
    • Understanding how deep belief networks perform acoustic modelling
    • A. Mohamed, G. Hinton, and G. Penn, "Understanding how deep belief networks perform acoustic modelling", in Proc. ICASSP, pp. 4273-4276, 2012.
    • (2012) Proc. ICASSP , pp. 4273-4276
    • Mohamed, A.1    Hinton, G.2    Penn, G.3
  • 20
    • 79959831132
    • Investigation of full-sequence training of deep belief networks for speech recognition
    • A. Mohamed, D. Yu, and L. Deng, "Investigation of full-sequence training of deep belief networks for speech recognition", in Proc. Interspeech 2010, pp. 1692-1695.
    • (2010) Proc. Interspeech , pp. 1692-1695
    • Mohamed, A.1    Yu, D.2    Deng, L.3
  • 21
    • 70349213445
    • Lattice-based optimization of sequence classification criteria for neural-network acoustic modeling
    • B. Kingsbury, "Lattice-based optimization of sequence classification criteria for neural-network acoustic modeling," in Proc. ICASSP 2009, pp. 3761-3764.
    • (2009) Proc. ICASSP , pp. 3761-3764
    • Kingsbury, B.1
  • 22
    • 80051623709
    • Joint encoding of the waveform and speech recognition features using a transform codec
    • X. Fan, M. Seltzer, J. Droppo, H. Malvar, and A. Acero, "Joint encoding of the waveform and speech recognition features using a transform codec," in Proc. ICASSP, pp.5148-5151, May 2011.
    • (2011) Proc. ICASSP , pp. 5148-5151
    • Fan, X.1    Seltzer, M.2    Droppo, J.3    Malvar, H.4    Acero, A.5


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.