SCOPUS Information Search Platform

2012 IEEE Workshop on Spoken Language Technology, SLT 2012 - Proceedings

2012, Pages 131-136

Improving wideband speech recognition using mixed-bandwidth training data in CD-DNN-HMM

(4) Li, Jinyu ᵃ; Yu, Dong ᵃ; Huang, Jui Ting ᵃ; Gong, Yifan ᵃ

ᵃ Microsoft (United States)

Author keywords

CD DNN HMM; deep neural network; log filter bank; mixed bandwidth; narrowband; wideband

Indexed keywords

CD-DNN-HMM; DEEP NEURAL NETWORKS; MIXED-BANDWIDTH; NARROW BANDS; WIDE-BAND; FILTER BANKS; HIDDEN MARKOV MODELS; NEURAL NETWORKS; SPEECH RECOGNITION; BANDWIDTH

EID: 84874282188
PISSN: None
EISSN: None
Source Type: Conference Proceeding
DOI: 10.1109/SLT.2012.6424210
Document Type: Conference Paper

Times cited: 153

References (22)

1
- 84055163920
- Roles of pretraining and finetuning in context-dependent DNN-HMMs for real-world speech recognition
- D. Yu, L. Deng, and G. Dahl, "Roles of pretraining and finetuning in context-dependent DNN-HMMs for real-world speech recognition," in Proc. NIPS Workshop on Deep Learning and Unsupervised Feature Learning, Dec. 2010.
- (2010) Proc. NIPS Workshop on Deep Learning and Unsupervised Feature Learning
- Yu, D.¹ Deng, L.² Dahl, G.³

2
- 84055222005
- Context-dependent pre-trained deep neural networks for large vocabulary speech recognition
- G. Dahl, D. Yu, L. Deng, and A. Acero, "Context-dependent pre-trained deep neural networks for large vocabulary speech recognition," IEEE Trans. Speech and Audio Proc., vol. 20, no. 1, pp. 30-42, 2012.
- (2012) IEEE Trans. Speech and Audio Proc., vol. 20, no. 1, pp. 30-42
- Dahl, G.¹ Yu, D.² Deng, L.³ Acero, A.⁴

3
- 84865801985
- Conversational speech transcription using context-dependent deep neural networks
- F. Seide, G. Li, and D. Yu, "Conversational speech transcription using context-dependent deep neural networks," in Proc. Interspeech, 2011.
- (2011) Proc. Interspeech
- Seide, F.¹ Li, G.² Yu, D.³

4
- 84874235393
- Why deep neural networks are promising for large vocabulary speech recognition
- D. Yu, F. Seide, G. Li, J. Li, and M. Seltzer, "Why deep neural networks are promising for large vocabulary speech recognition," submitted to IEEE Trans. on Audio, Speech, and Language Processing, 2012.
- (2012) IEEE Trans. on Audio, Speech, and Language Processing
- Yu, D.¹ Seide, F.² Li, G.³ Li, J.⁴ Seltzer, M.⁵

5
- 84999742323
- An application of pretrained deep neural networks to large vocabulary conversational speech recognition
- N. Jaitly, P. Nguyen, A. Senior, and V. Vanhoucke, "An application of pretrained deep neural networks to large vocabulary conversational speech recognition," Tech. Rep. 001, Department of Computer Science, University of Toronto, 2012.
- (2012) Tech. Rep. 001
- Jaitly, N.¹ Nguyen, P.² Senior, A.³ Vanhoucke, V.⁴

6
- 84867754964
- Improvements in using deep belief networks for large vocabulary continuous speech recognition
- T. N. Sainath, B. Kingsbury, and B. Ramabhadran, "Improvements in using deep belief networks for large vocabulary continuous speech recognition," Tech. Rep. UTML TR 2010-003, Speech and Language Algorithm Group, IBM, February 2011.
- (2011) Tech. Rep. UTML TR 2010-003
- Sainath, T.N.¹ Kingsbury, B.² Ramabhadran, B.³

7
- 84858972572
- Making deep belief networks effective for large vocabulary continuous speech recognition
- T. N. Sainath, B. Kingsbury, B. Ramabhadran, P. Fousek, P. Novak, A.-r. Mohamed, "Making deep belief networks effective for large vocabulary continuous speech recognition", in Proc. ASRU 2011, pp. 30-35.
- (2011) Proc. ASRU, pp. 30-35
- Sainath, T.N.¹ Kingsbury, B.² Ramabhadran, B.³ Fousek, P.⁴ Novak, P.⁵ Mohamed, A.-R.⁶

8
- 44049108531
- Automated directory assistance system-from theory to practice
- D. Yu, Y. C. Ju, Y. Y. Wang, G. Zweig, and A. Acero, "Automated directory assistance system-from theory to practice," in Proc. Interspeech, 2007, pp. 2709-2711.
- (2007) Proc. Interspeech, pp. 2709-2711
- Yu, D.¹ Ju, Y.C.² Wang, Y.Y.³ Zweig, G.⁴ Acero, A.⁵

9
- 85079086476
- Sources of degradation of speech recognition in the telephone network
- P. Moreno and R. M. Stern, "Sources of degradation of speech recognition in the telephone network," in Proc. ICASSP, Adelaide, Australia, vol. I, pp. 109-112, Apr. 1994.
- (1994) Proc. ICASSP, vol. 1, pp. 109-112
- Moreno, P.¹ Stern, R.M.²

10
- 0004056285
- Spoken Language Processing
- X. Huang, A. Acero, and H.-W. Hon, Spoken Language Processing, Prentice-Hall, May 2001.
- (2001) Spoken Language Processing
- Huang, X.¹ Acero, A.² Hon, H.-W.³

11
- 64149084747
- Training wideband acoustic models using mixed-bandwidth training data for speech recognition
- M. L. Seltzer and A. Acero, "Training wideband acoustic models using mixed-bandwidth training data for speech recognition", IEEE Transactions on Audio, Speech, and Language Processing, vol. 15, no. 1, pp. 235-245, 2007.
- (2007) IEEE Transactions on Audio, Speech, and Language Processing, vol. 15, no. 1, pp. 235-245
- Seltzer, M.L.¹ Acero, A.²

12
- 33745199156
- Robust bandwidth extension of noise-corrupted narrowband speech
- M. L. Seltzer, A. Acero, and J. Droppo, "Robust bandwidth extension of noise-corrupted narrowband speech," in Proc. Interspeech, pp. 1509-1512, 2005.
- (2005) Proc. Interspeech, pp. 1509-1512
- Seltzer, M.L.¹ Acero, A.² Droppo, J.³

13
- 0028517647
- Statistical recovery of wideband speech from narrowband speech
- Y. M. Cheng, D. O'Shaughnessy, and P. Mermelstein, "Statistical recovery of wideband speech from narrowband speech," IEEE Trans. Speech Audio Process., vol. 2, no. 4, pp. 544-548, Oct. 1994.
- (1994) IEEE Trans. Speech Audio Process., vol. 2, no. 4, pp. 544-548
- Cheng, Y.M.¹ O'Shaughnessy, D.² Mermelstein, P.³

14
- 0033692729
- Narrowband to wideband conversion of speech using GMM based transformation
- K.-Y. Park and H. S. Kim, "Narrowband to wideband conversion of speech using GMM based transformation," in Proc. ICASSP, Istanbul, Turkey, Jun. 2000, vol. 3, pp. 1843-1846.
- (2000) Proc. ICASSP, vol. 3, pp. 1843-1846
- Park, K.-Y.¹ Kim, H.S.²

15
- 84951992170
- Wideband extension of telephone speech using a hidden Markov model
- P. Jax and P. Vary, "Wideband extension of telephone speech using a hidden Markov model," in IEEE Workshop on Speech Coding, Delavan, WI, Sep. 2000, pp. 133-135.
- (2000) IEEE Workshop on Speech Coding, pp. 133-135
- Jax, P.¹ Vary, P.²

16
- 84867754966
- Improving the speed of neural networks on CPUs
- A. Senior, V. Vanhoucke, and M. Z. Mao, "Improving the speed of neural networks on CPUs," in Proc. Deep Learning and Unsupervised Feature Learning Workshop, NIPS, 2011.
- (2011) Proc. Deep Learning and Unsupervised Feature Learning Workshop, NIPS
- Senior, A.¹ Vanhoucke, V.² Mao, M.Z.³

17
- 84867585919
- Understanding how deep belief networks perform acoustic modelling
- A. Mohamed, G. Hinton, and G. Penn, "Understanding how deep belief networks perform acoustic modelling", in Proc. ICASSP, pp. 4273-4276, 2012.
- (2012) Proc. ICASSP, pp. 4273-4276
- Mohamed, A.¹ Hinton, G.² Penn, G.³

18
- 33646788786
- fMPE: Discriminatively trained features for speech recognition
- D. Povey, B. Kingsbury, L. Mangu, G. Saon, H. Soltau, and G. Zweig, "fMPE: discriminatively trained features for speech recognition," in Proc. ICASSP, 2005.
- (2005) Proc. ICASSP
- Povey, D.¹ Kingsbury, B.² Mangu, L.³ Saon, G.⁴ Soltau, H.⁵ Zweig, G.⁶

19
- 51449120120
- Boosted MMI for model and feature space discriminative training
- D. Povey, D. Kanevsky, B. Kingsbury, B. Ramabhadran, G. Saon and K. Visweswariah, "Boosted MMI for model and feature space discriminative training", in Proc. ICASSP, 2008
- (2008) Proc. ICASSP
- Povey, D.¹ Kanevsky, D.² Kingsbury, B.³ Ramabhadran, B.⁴ Saon, G.⁵ Visweswariah, K.⁶

20
- 79959831132
- Investigation of full-sequence training of deep belief networks for speech recognition
- A. Mohamed, D. Yu, and L. Deng, "Investigation of full-sequence training of deep belief networks for speech recognition," in Proc. Interspeech 2010, pp. 1692-1695.
- (2010) Proc. Interspeech, pp. 1692-1695
- Mohamed, A.¹ Yu, D.² Deng, L.³

21
- 70349213445
- Lattice-based optimization of sequence classification criteria for neural-network acoustic modeling
- B. Kingsbury, "Lattice-based optimization of sequence classification criteria for neural-network acoustic modeling," in Proc. ICASSP 2009, pp. 3761-3764.
- (2009) Proc. ICASSP, pp. 3761-3764
- Kingsbury, B.¹

22
- 80051623709
- Joint encoding of the waveform and speech recognition features using a transform codec
- X. Fan, M. Seltzer, J. Droppo, H. Malvar, and A. Acero, "Joint encoding of the waveform and speech recognition features using a transform codec," in Proc. ICASSP, pp. 5148-5151, May 2011.
- (2011) Proc. ICASSP, pp. 5148-5151
- Fan, X.¹ Seltzer, M.² Droppo, J.³ Malvar, H.⁴ Acero, A.⁵

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.