SCOPUS Information Search Platform

ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings

Volume , Issue , 2014, Pages 3829-3833

On the training aspects of Deep Neural Network (DNN) for parametric TTS synthesis

(4) Qian, Yaoᵃ Fan, Yuchenᵃ Hu, Wenpingᵃ Soong, Frank K.ᵃ

ᵃ MICROSOFT RESEARCH (United States)

Author keywords

DNN; HMM; Speech Synthesis; TTS

Indexed keywords

SPEECH SYNTHESIS;

ACTIVATION FUNCTIONS; DEEP NEURAL NETWORKS; DNN; GAUSSIAN PROBABILITY; HMM; KEY CHARACTERISTICS; OBJECTIVE AND SUBJECTIVE MEASURES; TTS;

SIGNAL PROCESSING;

EID: 84905251808 PISSN: 15206149 EISSN: None Source Type: Conference Proceeding
DOI: 10.1109/ICASSP.2014.6854318 Document Type: Conference Paper

Times cited : (203)

References (22)

1
- 84055222005
- Context-dependent pre-trained deep neural networks for large-vocabulary speech recognition
- G. E. Dahl, D. Yu, L. Deng, and A. Acero, "Context-dependent pre-trained deep neural networks for large-vocabulary speech recognition," IEEE Trans. on Audio, Speech, and Language Processing, vol. 20, no. 1, pp. 30-42, 2012.
- (2012) IEEE Trans. on Audio, Speech, and Language Processing , vol.20 , Issue.1 , pp. 30-42
- Dahl, G.E.¹ Yu, D.² Deng, L.³ Acero, A.⁴

2
- 84865801985
- Conversational speech transcription using context-dependent deep neural networks
- F. Seide, G. Li, and D. Yu, "Conversational speech transcription using context-dependent deep neural networks," in Proc. InterSpeech, pp. 437-440, 2011.
- (2011) Proc. InterSpeech , pp. 437-440
- Seide, F.¹ Li, G.² Yu, D.³

3
- 85032751458
- Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups
- G. Hinton, L. Deng, D. Yu, G. Dahl, A. Mohamed, N. Jaitly, A. Senior, V. Vanhoucke, P. Nguyen, T. Sainath, and B. Kingsbury, "Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups," Signal Processing Magazine, IEEE, vol. 29, no. 6, pp. 82-97, 2012.
- (2012) Signal Processing Magazine, IEEE , vol.29 , Issue.6 , pp. 82-97
- Hinton, G.¹ Deng, L.² Yu, D.³ Dahl, G.⁴ Mohamed, A.⁵ Jaitly, N.⁶ Senior, A.⁷ Vanhoucke, V.⁸ Nguyen, P.⁹ Sainath, T.¹⁰ Kingsbury, B.¹¹

4
- 0033708106
- Speech Parameter generation algorithms for HMM-based speech synthesis
- K. Tokuda, T. Yoshimura, T. Masuko, T. Kobayashi, and T. Kitamura, "Speech Parameter generation algorithms for HMM-based speech synthesis", in Proc. ICASSP, pp. 1315-1318, 2000.
- (2000) Proc. ICASSP , pp. 1315-1318
- Tokuda, K.¹ Yoshimura, T.² Masuko, T.³ Kobayashi, T.⁴ Kitamura, T.⁵

5
- 67651002140
- Statistical parametric speech synthesis
- H. Zen, K. Tokuda, and A. W. Black, "Statistical parametric speech synthesis", Speech Communication, vol. 51, no. 11, pp. 1039-1064, 2009.
- (2009) Speech Communication , vol.51 , Issue.11 , pp. 1039-1064
- Zen, H.¹ Tokuda, K.² Black, A.W.³

6
- 84890490547
- Statistical parametric speech synthesis using deep neural networks
- H. Zen, A. Senior, and M. Schuster, "Statistical Parametric Speech Synthesis Using Deep Neural Networks", in Proc. ICASSP, pp. 8012-8016, 2013.
- (2013) Proc. ICASSP , pp. 8012-8016
- Zen, H.¹ Senior, A.² Schuster, M.³

7
- 84929157442
- Combining a vector space representation of linguistic context with a deep neural network for text-to-speech synthesis
- H. Lu, S. King, and O. Watts, "Combining a vector space representation of linguistic context with a deep neural network for text-to-speech synthesis", in 8th ISCA Workshop on Speech Synthesis, pp. 281-285, 2013.
- (2013) 8th ISCA Workshop on Speech Synthesis , pp. 281-285
- Lu, H.¹ King, S.² Watts, O.³

8
- 84890527090
- Multi-distribution deep belief network for speech synthesis
- S. Kang, X. Qian, and H. Meng, "Multi-distribution deep belief network for speech synthesis", in Proc. ICASSP, pp. 7962-7966, 2013.
- (2013) Proc. ICASSP , pp. 7962-7966
- Kang, S.¹ Qian, X.² Meng, H.³

9
- 84890447002
- Modeling spectral envelopes using restricted Boltzmann machines for statistical parametric speech synthesis
- Z.-H. Ling, L. Deng, and D. Yu, "Modeling spectral envelopes using restricted Boltzmann machines for statistical parametric speech synthesis", in Proc. ICASSP, pp. 7825-7829, 2013.
- (2013) Proc. ICASSP , pp. 7825-7829
- Ling, Z.-H.¹ Deng, L.² Yu, D.³

10
- 33745805403
- A fast learning algorithm for deep belief nets
- G. E. Hinton, S. Osindero, and Y. W. Teh, "A Fast Learning Algorithm for Deep Belief Nets," Neural Computation, vol. 18, no. 7, pp. 1527-1554, 2006.
- (2006) Neural Computation , vol.18 , Issue.7 , pp. 1527-1554
- Hinton, G.E.¹ Osindero, S.² Teh, Y.W.³

11
- 0022471098
- Learning representations by back-propagating errors
- D. E. Rumelhart, G. E. Hinton, and R. J. Williams, "Learning representations by back-propagating errors," Nature, vol. 323, no. 9, pp. 533-536, 1986.
- (1986) Nature , vol.323 , Issue.9 , pp. 533-536
- Rumelhart, D.E.¹ Hinton, G.E.² Williams, R.J.³

12
- 0000029122
- A simple weight decay can improve generalization
- J. E. Moody, S. J. Hanson, and R. P. Lippmann, eds., Morgan Kaufmann Publishers, San Mateo, CA
- A. Krogh and J. A. Hertz, "A Simple Weight Decay Can Improve Generalization", in Advances in Neural Information Processing Systems 4, J. E. Moody, S. J. Hanson, and R. P. Lippmann, eds., Morgan Kaufmann Publishers, San Mateo, CA, pp. 950-957, 1992.
- (1992) Advances in Neural Information Processing Systems , vol.4 , pp. 950-957
- Krogh, A.¹ Hertz, J.A.²

13
- 0003573244
- Kluwer Academic Publishers, Norwell, MA, USA
- H. Bourlard and N. Morgan, Connectionist Speech Recognition: A Hybrid Approach, Kluwer Academic Publishers, Norwell, MA, USA, 1993.
- (1993) Connectionist Speech Recognition: A Hybrid Approach
- Bourlard, H.¹ Morgan, N.²

14
- 77949522811
- Why does unsupervised pre-training help deep learning?
- D. Erhan, Y. Bengio, A. Courville, P. A. Manzagol, P. Vincent, and S. Bengio, "Why does unsupervised pre-training help deep learning?," JMLR, 2010.
- (2010) JMLR
- Erhan, D.¹ Bengio, Y.² Courville, A.³ Manzagol, P.A.⁴ Vincent, P.⁵ Bengio, S.⁶

15
- 84055163920
- Roles of pre-training and fine-tuning in context-dependent DBN-HMMs for real-world speech recognition
- D. Yu, L. Deng, and G. Dahl, "Roles of pre-training and fine-tuning in context-dependent DBN-HMMs for real-world speech recognition," in NIPS Workshop, 2010.
- (2010) NIPS Workshop
- Yu, D.¹ Deng, L.² Dahl, G.³

16
- 84890453097
- Feature engineering in context-dependent deep neural networks for conversational speech transcription
- F. Seide, G. Li, X. Chen, and D. Yu, "Feature engineering in context-dependent deep neural networks for conversational speech transcription," in IEEE ASRU, 2011.
- (2011) IEEE ASRU
- Seide, F.¹ Li, G.² Chen, X.³ Yu, D.⁴

17
- 84890455972
- Making deep belief networks effective for large vocabulary continuous speech recognition
- T. N. Sainath, B. Kingsbury, B. Ramabhadran, P. Fousek, P. Novak, and A. R. Mohamed, "Making deep belief networks effective for large vocabulary continuous speech recognition," in IEEE ASRU, 2011.
- (2011) IEEE ASRU
- Sainath, T.N.¹ Kingsbury, B.² Ramabhadran, B.³ Fousek, P.⁴ Novak, P.⁵ Mohamed, A.R.⁶

18
- 69349090197
- Learning deep architectures for AI
- Y. Bengio, "Learning deep architectures for AI," Foundations and Trends in Machine Learning, vol. 2, no. 1, pp. 1-127, 2009.
- (2009) Foundations and Trends in Machine Learning , vol.2 , Issue.1 , pp. 1-127
- Bengio, Y.¹

19
- 33846429403
- Minimum generation error training for HMM-based speech synthesis
- Y.-J. Wu and R.-H. Wang, "Minimum generation error training for HMM-based speech synthesis", in Proc. ICASSP, 2006.
- (2006) Proc. ICASSP
- Wu, Y.-J.¹ Wang, R.H.²

20
- 84905283451
- New methods in continuous Mandarin speech recognition
- ISCA
- C. Julian Chen, Ramesh A. Gopinath, Michael D. Monkowski, Michael A. Picheny, and Katherine Shen, "New methods in continuous Mandarin speech recognition," in EUROSPEECH, ISCA, 1997.
- (1997) EUROSPEECH
- Chen, C.J.¹ Gopinath, R.A.² Monkowski, M.D.³ Picheny, M.A.⁴ Shen, K.⁵

21
- 67650851754
- USTC system for Blizzard Challenge 2006: an improved HMM-based speech synthesis method
- Z.-H. Ling, Y.-J. Wu, Y.-P. Wang, L. Qin, and R.-H. Wang, "USTC System for Blizzard Challenge 2006: an Improved HMM-based Speech Synthesis Method," in Proc. Blizzard Challenge 2006 Workshop, 2006.
- (2006) Proc. Blizzard Challenge 2006 Workshop
- Ling, Z.-H.¹ Wu, Y.-J.² Wang, Y.-P.³ Qin, L.⁴ Wang, R.-H.⁵

22
- 0033906251
- MDL-based Context-Dependent sub-word modeling for speech recognition
- K. Shinoda and T. Watanabe, "MDL-based Context-Dependent Sub-word Modeling for Speech Recognition", J. Acoust. Soc. Jpn. (E), vol. 21, no. 2, pp. 79-86, 2000.
- (2000) J. Acoust. Soc. Jpn. (E) , vol.21 , Issue.2 , pp. 79-86
- Shinoda, K.¹ Watanabe, T.²

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.