Volume 2016-May, 2016, Pages 5125-5129

Modeling spectral envelopes using deep conditional restricted Boltzmann machines for statistical parametric speech synthesis

Author keywords

deep neural network; hidden Markov model; restricted Boltzmann machine; spectral envelope; Speech synthesis

EID: 84973365185     PISSN: 15206149     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1109/ICASSP.2016.7472654     Document Type: Conference Paper
Times cited: 5

References (17)
  • 1
    • 84876687945
    • Speech synthesis based on hidden Markov models
    • K. Tokuda, Y. Nankaku, T. Toda, H. Zen, J. Yamagishi, and K. Oura, "Speech synthesis based on hidden Markov models," Proc. IEEE, vol. 101, no. 5, pp. 1234-1252, 2013.
    • (2013) Proc. IEEE , vol.101 , Issue.5 , pp. 1234-1252
    • Tokuda, K.1    Nankaku, Y.2    Toda, T.3    Zen, H.4    Yamagishi, J.5    Oura, K.6
  • 3
    • 0033708106
    • Speech parameter generation algorithms for HMM-based speech synthesis
    • K. Tokuda, T. Yoshimura, T. Masuko, T. Kobayashi, and T. Kitamura, "Speech parameter generation algorithms for HMM-based speech synthesis," in Proc. of ICASSP, 2000, vol. 3, pp. 1315-1318.
    • (2000) Proc. of ICASSP , vol.3 , pp. 1315-1318
    • Tokuda, K.1    Yoshimura, T.2    Masuko, T.3    Kobayashi, T.4    Kitamura, T.5
  • 5
    • 67651002140
    • Statistical parametric speech synthesis
    • H. Zen, K. Tokuda, and A. Black, "Statistical parametric speech synthesis," Speech Commun., vol. 51, pp. 1039-1064, 2009.
    • (2009) Speech Commun. , vol.51 , pp. 1039-1064
    • Zen, H.1    Tokuda, K.2    Black, A.3
  • 6
    • 84890447002
    • Modeling spectral envelopes using restricted Boltzmann machines for statistical parametric speech synthesis
    • Z.-H. Ling, L. Deng, and D. Yu, "Modeling spectral envelopes using restricted Boltzmann machines for statistical parametric speech synthesis," in Proc. of ICASSP, 2013, pp. 7825-7829.
    • (2013) Proc. of ICASSP , pp. 7825-7829
    • Ling, Z.-H.1    Deng, L.2    Yu, D.3
  • 7
    • 84901237776
    • Modeling spectral envelopes using restricted Boltzmann machines and deep belief networks for statistical parametric speech synthesis
    • Z.-H. Ling, L. Deng, and D. Yu, "Modeling spectral envelopes using restricted Boltzmann machines and deep belief networks for statistical parametric speech synthesis," IEEE Trans. Audio, Speech, Lang. Process., vol. 21, no. 10, pp. 2129-2139, 2013.
    • (2013) IEEE Trans. Audio, Speech, Lang. Process. , vol.21 , Issue.10 , pp. 2129-2139
    • Ling, Z.-H.1    Deng, L.2    Yu, D.3
  • 8
    • 84890490547
    • Statistical parametric speech synthesis using deep neural networks
    • H. Zen, A. Senior, and M. Schuster, "Statistical parametric speech synthesis using deep neural networks," in Proc. of ICASSP, 2013, pp. 7962-7966.
    • (2013) Proc. of ICASSP , pp. 7962-7966
    • Zen, H.1    Senior, A.2    Schuster, M.3
  • 9
    • 84905262874
    • Deep mixture density network for acoustic modeling in statistical parametric speech synthesis
    • H. Zen and A. Senior, "Deep mixture density network for acoustic modeling in statistical parametric speech synthesis," in Proc. of ICASSP, 2014, pp. 3872-3876.
    • (2014) Proc. of ICASSP , pp. 3872-3876
    • Zen, H.1    Senior, A.2
  • 10
    • 84864026688
    • Modeling human motion using binary latent variables
    • G.-W. Taylor, G.-E. Hinton, and S. Roweis, "Modeling human motion using binary latent variables," in Proc. of NIPS, 2007, pp. 1345-1352.
    • (2007) Proc. of NIPS , pp. 1345-1352
    • Taylor, G.-W.1    Hinton, G.-E.2    Roweis, S.3
  • 11
    • 84889579519
    • Conditional restricted Boltzmann machine for voice conversion
    • Z.-Z. Wu, E.-S. Chng, and H.-Z. Li, "Conditional restricted Boltzmann machine for voice conversion," in Proc. of ChinaSIP, 2013, pp. 104-108.
    • (2013) Proc. of ChinaSIP , pp. 104-108
    • Wu, Z.-Z.1    Chng, E.-S.2    Li, H.-Z.3
  • 12
    • 33745805403
    • A fast learning algorithm for deep belief nets
    • G.-E. Hinton, S. Osindero, and Y. Teh, "A fast learning algorithm for deep belief nets," Neural Computation, vol. 18, no. 7, pp. 1527-1554, 2006.
    • (2006) Neural Computation , vol.18 , Issue.7 , pp. 1527-1554
    • Hinton, G.-E.1    Osindero, S.2    Teh, Y.3
  • 13
    • 69349090197
    • Learning deep architectures for AI
    • Y. Bengio, "Learning deep architectures for AI," Foundations and Trends in Machine Learning, vol. 2, no. 1, pp. 1-127, Jan. 2009.
    • (2009) Foundations and Trends in Machine Learning , vol.2 , Issue.1 , pp. 1-127
    • Bengio, Y.1
  • 14
    • 0022471098
    • Learning representations by back-propagating errors
    • D. Rumelhart, G.-E. Hinton, and R. Williams, "Learning representations by back-propagating errors," Nature, vol. 323, no. 6088, pp. 533-536, 1986.
    • (1986) Nature , vol.323 , Issue.6088 , pp. 533-536
    • Rumelhart, D.1    Hinton, G.-E.2    Williams, R.3
  • 15
    • 0032673049
    • Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds
    • H. Kawahara, I. Masuda-Katsuse, and A. de Cheveigné, "Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds," Speech Communication, vol. 27, no. 3, pp. 187-208, 1999.
    • (1999) Speech Communication , vol.27 , Issue.3 , pp. 187-208
    • Kawahara, H.1    Masuda-Katsuse, I.2    De Cheveigné, A.3
  • 16
    • 0542366491
    • Efficient vector quantization of LPC parameters at 24 bits/frame
    • K.-K. Paliwal and B.-S. Atal, "Efficient vector quantization of LPC parameters at 24 bits/frame," IEEE Trans. Speech Audio Process., vol. 1, no. 1, pp. 3-14, 1993.
    • (1993) IEEE Trans. Speech Audio Process. , vol.1 , Issue.1 , pp. 3-14
    • Paliwal, K.-K.1    Atal, B.-S.2
  • 17
    • 84901793334
    • Minimum Kullback-Leibler divergence parameter generation for HMM-based speech synthesis
    • Z.-H. Ling and L.-R. Dai, "Minimum Kullback-Leibler divergence parameter generation for HMM-based speech synthesis," IEEE Trans. Audio, Speech, Lang. Process., vol. 20, no. 5, pp. 1492-1502, 2012.
    • (2012) IEEE Trans. Audio, Speech, Lang. Process. , vol.20 , Issue.5 , pp. 1492-1502
    • Ling, Z.-H.1    Dai, L.-R.2


* This information was extracted and analyzed by KISTI from Elsevier's SCOPUS database.