2013, Pages 7825-7829

Modeling spectral envelopes using restricted Boltzmann machines for statistical parametric speech synthesis

Author keywords

hidden Markov model; restricted Boltzmann machine; spectral envelope; speech synthesis

Indexed keywords

CONVENTIONAL METHODS; GAUSSIAN MIXTURE MODEL; GENERALIZATION ABILITY; HMM-BASED SPEECH SYNTHESIS; JOINT DISTRIBUTIONS; RESTRICTED BOLTZMANN MACHINE; SPECTRAL ENVELOPES; STATISTICAL PARAMETRIC SPEECH SYNTHESIS;

EID: 84890447002     PISSN: 15206149     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1109/ICASSP.2013.6639187     Document Type: Conference Paper
Times cited: 43

References (15)
  • 1
    • 85009139544
    • Simultaneous modeling of spectrum, pitch and duration in HMM-based speech synthesis
    • T. Yoshimura, K. Tokuda, T. Masuko, T. Kobayashi, and T. Kitamura, "Simultaneous modeling of spectrum, pitch and duration in HMM-based speech synthesis," in Eurospeech, 1999, pp. 2347-2350.
    • (1999) Eurospeech, pp. 2347-2350
    • Yoshimura, T.; Tokuda, K.; Masuko, T.; Kobayashi, T.; Kitamura, T.
  • 2
    • 0033708106
    • Speech parameter generation algorithms for HMM-based speech synthesis
    • K. Tokuda, T. Yoshimura, T. Masuko, T. Kobayashi, and T. Kitamura, "Speech parameter generation algorithms for HMM-based speech synthesis," in ICASSP, 2000, vol. 3, pp. 1315-1318.
    • (2000) ICASSP, vol. 3, pp. 1315-1318
    • Tokuda, K.; Yoshimura, T.; Masuko, T.; Kobayashi, T.; Kitamura, T.
  • 3
    • 0032673049
    • Restructuring speech representations using pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds
    • H. Kawahara, I. Masuda-Katsuse, and A. de Cheveigne, "Restructuring speech representations using pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: possible role of a repetitive structure in sounds," Speech Communication, vol. 27, pp. 187-207, 1999.
    • (1999) Speech Communication, vol. 27, pp. 187-207
    • Kawahara, H.; Masuda-Katsuse, I.; De Cheveigne, A.
  • 4
    • 33846405723
    • Details of Nitech HMM-based speech synthesis system for the Blizzard Challenge 2005
    • H. Zen, T. Toda, M. Nakamura, and K. Tokuda, "Details of Nitech HMM-based speech synthesis system for the Blizzard Challenge 2005," IEICE Trans. Inf. & Syst., vol. E90-D, no. 1, pp. 325-333, 2007.
    • (2007) IEICE Trans. Inf. & Syst., vol. E90-D, no. 1, pp. 325-333
    • Zen, H.; Toda, T.; Nakamura, M.; Tokuda, K.
  • 6
    • 0000329993
    • Information processing in dynamical systems: Foundations of harmony theory
    • D.E. Rumelhart and J.L. McClelland, Eds., chapter 6, MIT Press
    • P. Smolensky, "Information processing in dynamical systems: Foundations of harmony theory," in Parallel Distributed Processing, D.E. Rumelhart and J.L. McClelland, Eds., vol. 1, chapter 6, pp. 194-281. MIT Press, 1986.
    • (1986) Parallel Distributed Processing, vol. 1, pp. 194-281
    • Smolensky, P.
  • 8
    • 0013344078
    • Training products of experts by minimizing contrastive divergence
    • G.E. Hinton, "Training products of experts by minimizing contrastive divergence," Neural Computation, vol. 14, no. 8, pp. 1771-1800, 2002.
    • (2002) Neural Computation, vol. 14, no. 8, pp. 1771-1800
    • Hinton, G.E.
  • 9
    • 33746600649
    • Reducing the dimensionality of data with neural networks
    • G.E. Hinton and R. Salakhutdinov, "Reducing the dimensionality of data with neural networks," Science, vol. 313, no. 5786, pp. 504-507, 2006.
    • (2006) Science, vol. 313, no. 5786, pp. 504-507
    • Hinton, G.E.; Salakhutdinov, R.
  • 11
    • 84055222005
    • Context-dependent pre-trained deep neural networks for large-vocabulary speech recognition
    • G.E. Dahl, D. Yu, L. Deng, and A. Acero, "Context-dependent pre-trained deep neural networks for large-vocabulary speech recognition," IEEE Trans. Speech Audio Process., vol. 20, no. 1, pp. 30-42, 2012.
    • (2012) IEEE Trans. Speech Audio Process., vol. 20, no. 1, pp. 30-42
    • Dahl, G.E.; Yu, D.; Deng, L.; Acero, A.
  • 13
    • 79959842828
    • Binary coding of speech spectrograms using a deep auto-encoder
    • L. Deng, M. Seltzer, D. Yu, A. Acero, A. Mohamed, and G.E. Hinton, "Binary coding of speech spectrograms using a deep auto-encoder," in Interspeech, 2010, pp. 1692-1695.
    • (2010) Interspeech, pp. 1692-1695
    • Deng, L.; Seltzer, M.; Yu, D.; Acero, A.; Mohamed, A.; Hinton, G.E.
  • 15
    • 33745805403
    • A fast learning algorithm for deep belief nets
    • G.E. Hinton, S. Osindero, and Y.W. Teh, "A fast learning algorithm for deep belief nets," Neural Computation, vol. 18, no. 7, pp. 1527-1554, 2006.
    • (2006) Neural Computation, vol. 18, no. 7, pp. 1527-1554
    • Hinton, G.E.; Osindero, S.; Teh, Y.W.


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.