2014, Pages 3829-3833

On the training aspects of Deep Neural Network (DNN) for parametric TTS synthesis

Author keywords

DNN; HMM; Speech Synthesis; TTS

Indexed keywords

SPEECH SYNTHESIS;

EID: 84905251808     PISSN: 15206149     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1109/ICASSP.2014.6854318     Document Type: Conference Paper
Times cited: 203

References (22)
  • 1
    • 84055222005
    • Context-dependent pre-trained deep neural networks for large-vocabulary speech recognition
    • G. E. Dahl, D. Yu, L. Deng, and A. Acero, "Context-dependent pre-trained deep neural networks for large-vocabulary speech recognition," IEEE Trans. on Audio, Speech, and Language Processing, vol. 20, no. 1, pp. 30-42, 2012.
    • (2012) IEEE Trans. on Audio, Speech, and Language Processing, vol. 20, no. 1, pp. 30-42
    • Dahl, G.E.; Yu, D.; Deng, L.; Acero, A.
  • 2
    • 84865801985
    • Conversational speech transcription using context-dependent deep neural networks
    • F. Seide, G. Li, and D. Yu, "Conversational speech transcription using context-dependent deep neural networks," in Proc. InterSpeech, pp. 437-440, 2011.
    • (2011) Proc. InterSpeech, pp. 437-440
    • Seide, F.; Li, G.; Yu, D.
  • 4
    • 0033708106
    • Speech parameter generation algorithms for HMM-based speech synthesis
    • K. Tokuda, T. Yoshimura, T. Masuko, T. Kobayashi, and T. Kitamura, "Speech parameter generation algorithms for HMM-based speech synthesis," in Proc. ICASSP, pp. 1315-1318, 2000.
    • (2000) Proc. ICASSP, pp. 1315-1318
    • Tokuda, K.; Yoshimura, T.; Masuko, T.; Kobayashi, T.; Kitamura, T.
  • 5
    • 67651002140
    • Statistical parametric speech synthesis
    • H. Zen, K. Tokuda, and A. W. Black, "Statistical parametric speech synthesis," Speech Communication, vol. 51, no. 11, pp. 1039-1064, 2009.
    • (2009) Speech Communication, vol. 51, no. 11, pp. 1039-1064
    • Zen, H.; Tokuda, K.; Black, A.W.
  • 6
    • 84890490547
    • Statistical parametric speech synthesis using deep neural networks
    • H. Zen, A. Senior, and M. Schuster, "Statistical parametric speech synthesis using deep neural networks," in Proc. ICASSP, pp. 8012-8016, 2013.
    • (2013) Proc. ICASSP, pp. 8012-8016
    • Zen, H.; Senior, A.; Schuster, M.
  • 7
    • 84929157442
    • Combining a vector space representation of linguistic context with a deep neural network for text-to-speech synthesis
    • H. Lu, S. King, and O. Watts, "Combining a vector space representation of linguistic context with a deep neural network for text-to-speech synthesis," in 8th ISCA Workshop on Speech Synthesis, pp. 281-285, 2013.
    • (2013) 8th ISCA Workshop on Speech Synthesis, pp. 281-285
    • Lu, H.; King, S.; Watts, O.
  • 8
    • 84890527090
    • Multi-distribution deep belief network for speech synthesis
    • S. Kang, X. Qian, and H. Meng, "Multi-distribution deep belief network for speech synthesis," in Proc. ICASSP, pp. 7962-7966, 2013.
    • (2013) Proc. ICASSP, pp. 7962-7966
    • Kang, S.; Qian, X.; Meng, H.
  • 9
    • 84890447002
    • Modeling spectral envelopes using restricted Boltzmann machines for statistical parametric speech synthesis
    • Z.-H. Ling, L. Deng, and D. Yu, "Modeling spectral envelopes using restricted Boltzmann machines for statistical parametric speech synthesis," in Proc. ICASSP, pp. 7825-7829, 2013.
    • (2013) Proc. ICASSP, pp. 7825-7829
    • Ling, Z.-H.; Deng, L.; Yu, D.
  • 10
    • 33745805403
    • A fast learning algorithm for deep belief nets
    • G. E. Hinton, S. Osindero, and Y. W. Teh, "A fast learning algorithm for deep belief nets," Neural Computation, vol. 18, no. 7, pp. 1527-1554, 2006.
    • (2006) Neural Computation, vol. 18, no. 7, pp. 1527-1554
    • Hinton, G.E.; Osindero, S.; Teh, Y.W.
  • 11
    • 0022471098
    • Learning representations by back-propagating errors
    • D. E. Rumelhart, G. E. Hinton, and R. J. Williams, "Learning representations by back-propagating errors," Nature, vol. 323, no. 9, pp. 533-536, 1986.
    • (1986) Nature, vol. 323, no. 9, pp. 533-536
    • Rumelhart, D.E.; Hinton, G.E.; Williams, R.J.
  • 12
    • 0000029122
    • A simple weight decay can improve generalization
    • J. E. Moody, S. J. Hanson, and R. P. Lippmann, eds. Morgan Kaufmann Publishers, San Mateo, CA
    • A. Krogh and J. A. Hertz, "A simple weight decay can improve generalization," in Advances in Neural Information Processing Systems 4, J. E. Moody, S. J. Hanson, and R. P. Lippmann, eds. Morgan Kaufmann Publishers, San Mateo, CA, pp. 950-957, 1992.
    • (1992) Advances in Neural Information Processing Systems, vol. 4, pp. 950-957
    • Krogh, A.; Hertz, J.A.
  • 15
    • 84055163920
    • Roles of pre-training and fine-tuning in context-dependent DBN-HMMs for real-world speech recognition
    • D. Yu, L. Deng, and G. Dahl, "Roles of pre-training and fine-tuning in context-dependent DBN-HMMs for real-world speech recognition," in NIPS Workshop, 2010.
    • (2010) NIPS Workshop
    • Yu, D.; Deng, L.; Dahl, G.
  • 16
    • 84890453097
    • Feature engineering in context-dependent deep neural networks for conversational speech transcription
    • F. Seide, G. Li, X. Chen, and D. Yu, "Feature engineering in context-dependent deep neural networks for conversational speech transcription," in IEEE ASRU, 2011.
    • (2011) IEEE ASRU
    • Seide, F.; Li, G.; Chen, X.; Yu, D.
  • 19
    • 33846429403
    • Minimum generation error training for HMM-based speech synthesis
    • Y.-J. Wu and R. H. Wang, "Minimum generation error training for HMM-based speech synthesis," in Proc. ICASSP, 2006.
    • (2006) Proc. ICASSP
    • Wu, Y.-J.; Wang, R.H.
  • 22
    • 0033906251
    • MDL-based context-dependent subword modeling for speech recognition
    • K. Shinoda and T. Watanabe, "MDL-based context-dependent subword modeling for speech recognition," J. Acoust. Soc. Jpn. (E), vol. 21, no. 2, pp. 79-86, 2000.
    • (2000) J. Acoust. Soc. Jpn. (E), vol. 21, no. 2, pp. 79-86
    • Shinoda, K.; Watanabe, T.


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.