2014, Pages 1954-1958

DNN-based stochastic postfilter for HMM-based speech synthesis

Author keywords

DNN; HMM; Modulation spectrum; Postfilter; Segmental quality; Speech synthesis

Indexed keywords

MODULATION; PLASMA DIAGNOSTICS; SPEECH SYNTHESIS; STOCHASTIC SYSTEMS

EID: 84910100893; PISSN: 2308457X; EISSN: 19909772; Source Type: Conference Proceeding
DOI: None; Document Type: Conference Paper
Times cited: 30

References (20)
  • 1
    • EID: 67651002140
    • Statistical parametric speech synthesis
    • H. Zen, K. Tokuda, and A. W. Black, "Statistical parametric speech synthesis," Speech Comm., vol. 51, no. 11, pp. 1039-1064, 2009.
    • (2009) Speech Comm., vol. 51, no. 11, pp. 1039-1064
    • Zen, H.; Tokuda, K.; Black, A.W.
  • 2
    • EID: 67650851754
    • USTC system for Blizzard Challenge 2006 an improved HMM-based speech synthesis method
    • Z. Ling, Y. Wu, Y. Wang, L. Qin, and R. Wang, "USTC system for Blizzard Challenge 2006 an improved HMM-based speech synthesis method," in Proc. Blizzard Challenge Workshop, Pittsburgh, USA, September 2006.
    • (2006) Proc. Blizzard Challenge Workshop
    • Ling, Z.; Wu, Y.; Wang, Y.; Qin, L.; Wang, R.
  • 3
    • EID: 38549096029
    • A speech parameter generation algorithm considering global variance for HMM-based speech synthesis
    • T. Toda and K. Tokuda, "A speech parameter generation algorithm considering global variance for HMM-based speech synthesis," IEICE Trans. Inf. Syst., vol. E90-D, no. 5, pp. 816-824, 2007.
    • (2007) IEICE Trans. Inf. Syst., vol. E90-D, no. 5, pp. 816-824
    • Toda, T.; Tokuda, K.
  • 4
    • EID: 84905234422
    • A post filter to modify the modulation spectrum in HMM-based speech synthesis
    • S. Takamichi, T. Toda, G. Neubig, S. Sakti, and S. Nakamura, "A post filter to modify the modulation spectrum in HMM-based speech synthesis," in Proc. ICASSP, May 2014.
    • (2014) Proc. ICASSP
    • Takamichi, S.; Toda, T.; Neubig, G.; Sakti, S.; Nakamura, S.
  • 5
    • EID: 57749193836
    • Voice conversion based on maximum-likelihood estimation of spectral parameter trajectory
    • T. Toda, A. Black, and K. Tokuda, "Voice conversion based on maximum-likelihood estimation of spectral parameter trajectory," IEEE Trans. on Audio, Speech and Language Processing, vol. 15, no. 8, pp. 2222-2235, Nov. 2007.
    • (2007) IEEE Trans. on Audio, Speech and Language Processing, vol. 15, no. 8, pp. 2222-2235
    • Toda, T.; Black, A.; Tokuda, K.
  • 6
    • EID: 84910104946
    • Voice conversion using deep neural networks with multiple frame spectral envelopes
    • L.-H. Chen, Z.-H. Ling, and L.-R. Dai, "Voice conversion using deep neural networks with multiple frame spectral envelopes," submitted to Interspeech, 2014.
    • (2014) Interspeech
    • Chen, L.-H.; Ling, Z.-H.; Dai, L.-R.
  • 7
    • EID: 0031623661
    • Spectral voice conversion for text-to-speech synthesis
    • A. Kain and M. Macon, "Spectral voice conversion for text-to-speech synthesis," in Proc. ICASSP, 1998, pp. 285-288.
    • (1998) Proc. ICASSP, pp. 285-288
    • Kain, A.; Macon, M.
  • 8
    • EID: 0000329993
    • Information processing in dynamical systems: Foundations of harmony theory
    • P. Smolensky, "Information processing in dynamical systems: foundations of harmony theory," in Parallel distributed processing: explorations in the microstructure of cognition, D. E. Rumelhart and J. L. McClelland, Eds. Cambridge, MA, USA: MIT Press, 1986, vol. 1, ch. 6, pp. 194-281.
    • (1986) Parallel Distributed Processing: Explorations in the Microstructure of Cognition, vol. 1, pp. 194-281
    • Smolensky, P.
  • 10
    • EID: 84905223323
    • Using bidirectional associative memories for joint spectral envelope modeling in voice conversion
    • L.-J. Liu, L.-H. Chen, Z.-H. Ling, and L.-R. Dai, "Using bidirectional associative memories for joint spectral envelope modeling in voice conversion," in Proc. ICASSP, May 2014.
    • (2014) Proc. ICASSP
    • Liu, L.-J.; Chen, L.-H.; Ling, Z.-H.; Dai, L.-R.
  • 11
    • EID: 33746600649
    • Reducing the dimensionality of data with neural networks
    • G. E. Hinton and R. R. Salakhutdinov, "Reducing the dimensionality of data with neural networks," Science, vol. 313, no. 5786, pp. 504-507, 2006.
    • (2006) Science, vol. 313, no. 5786, pp. 504-507
    • Hinton, G.E.; Salakhutdinov, R.R.
  • 12
    • EID: 84901237776
    • Modeling spectral envelopes using restricted Boltzmann machines and deep belief networks for statistical parametric speech synthesis
    • Z.-H. Ling, L. Deng, and D. Yu, "Modeling spectral envelopes using restricted Boltzmann machines and deep belief networks for statistical parametric speech synthesis," IEEE Trans. on Audio, Speech and Language Processing, vol. 21, no. 10, pp. 2129-2139, 2013.
    • (2013) IEEE Trans. on Audio, Speech and Language Processing, vol. 21, no. 10, pp. 2129-2139
    • Ling, Z.-H.; Deng, L.; Yu, D.
  • 14
    • EID: 85016140477
    • An adaptive algorithm for mel-cepstral analysis of speech
    • T. Fukada, K. Tokuda, T. Kobayashi, and S. Imai, "An adaptive algorithm for mel-cepstral analysis of speech," in Proc. ICASSP, vol. 1, San Francisco, USA, March 1992, pp. 137-140.
    • (1992) Proc. ICASSP, vol. 1, pp. 137-140
    • Fukada, T.; Tokuda, K.; Kobayashi, T.; Imai, S.
  • 15
    • EID: 0032673049
    • Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds
    • H. Kawahara, I. Masuda-Katsuse, and A. de Cheveigné, "Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: possible role of a repetitive structure in sounds," Speech Comm., vol. 27, pp. 187-207, 1999.
    • (1999) Speech Comm., vol. 27, pp. 187-207
    • Kawahara, H.; Masuda-Katsuse, I.; de Cheveigné, A.
  • 16
    • EID: 0013344078
    • Training products of experts by minimizing contrastive divergence
    • G. Hinton, "Training products of experts by minimizing contrastive divergence," Neural Computation, vol. 14, no. 8, pp. 1771-1800, 2002.
    • (2002) Neural Computation, vol. 14, no. 8, pp. 1771-1800
    • Hinton, G.
  • 17
    • EID: 84872506495
    • A practical guide to training restricted Boltzmann machines
    • G. E. Hinton, "A practical guide to training restricted Boltzmann machines," in Neural Networks: Tricks of the Trade. Springer, 2012, pp. 599-619.
    • (2012) Neural Networks: Tricks of the Trade, pp. 599-619
    • Hinton, G.E.
  • 18
    • EID: 0014568991
    • IEEE recommended practice for speech quality measurement
    • IEEE, "IEEE recommended practice for speech quality measurement," IEEE Trans. on Audio and Electroacoustics, vol. 17, no. 3, pp. 225-246, 1969.
    • (1969) IEEE Trans. on Audio and Electroacoustics, vol. 17, no. 3, pp. 225-246
    • IEEE
  • 19
    • EID: 79959847301
    • Global variance modeling on the log power spectrum of LSPs for HMM-based speech synthesis
    • Z.-H. Ling, Y. Hu, and L.-R. Dai, "Global variance modeling on the log power spectrum of LSPs for HMM-based speech synthesis," in Proc. Interspeech, 2010, pp. 825-828.
    • (2010) Proc. Interspeech, pp. 825-828
    • Ling, Z.-H.; Hu, Y.; Dai, L.-R.


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.