2009, Pages 1787-1790

An improved minimum generation error based model adaptation for HMM-based speech synthesis

Author keywords

HMM; Linear regression; Minimum generation error; Speaker adaptation; Speech synthesis

Indexed keywords

EUCLIDEAN DISTANCE; HMM; HMM-BASED SPEECH SYNTHESIS; LISTENING TESTS; LOG SPECTRAL DISTORTIONS; MINIMUM GENERATION ERROR; MODEL ADAPTATION; MODEL TRAINING; REGRESSION MATRICES; SOURCE MODELS; SPEAKER ADAPTATION; SPECTRAL DISTORTIONS; SYNTHESIZED SPEECH;

EID: 70450183499     PISSN: None     EISSN: 19909772     Source Type: Conference Proceeding    
DOI: None     Document Type: Conference Paper
Times cited: 6

References (17)
  • 1
    • 0029725605
    • Speech synthesis from HMMs using dynamic features
    • T. Masuko, K. Tokuda, T. Kobayashi, and S. Imai, "Speech synthesis from HMMs using dynamic features," in Proc. of ICASSP, pp. 389-392, 1996.
    • (1996) Proc. of ICASSP , pp. 389-392
    • Masuko, T.; Tokuda, K.; Kobayashi, T.; Imai, S.
  • 2
    • 85009139544
    • Simultaneous modeling of spectrum, pitch and duration in HMM-based speech synthesis
    • T. Yoshimura, K. Tokuda, T. Masuko, T. Kobayashi, and T. Kitamura, "Simultaneous modeling of spectrum, pitch and duration in HMM-based speech synthesis," in Proc. of ICASSP, vol. 5, pp. 2347-2350, 1999.
    • (1999) Proc. of ICASSP , vol.5 , pp. 2347-2350
    • Yoshimura, T.; Tokuda, K.; Masuko, T.; Kobayashi, T.; Kitamura, T.
  • 3
    • 53049106512
    • Speaker-independent HMM-based speech synthesis system - HTS-2007 system for the Blizzard Challenge 2007
    • J. Yamagishi, H. Zen, T. Toda, and K. Tokuda, "Speaker-independent HMM-based speech synthesis system - HTS-2007 system for the Blizzard Challenge 2007," in Blizzard Challenge 2007.
    • Blizzard Challenge 2007
    • Yamagishi, J.; Zen, H.; Toda, T.; Tokuda, K.
  • 4
    • 0029288633
    • Maximum likelihood linear regression for speaker adaptation of continuous density hidden Markov models
    • C.J. Leggetter and P.C. Woodland, "Maximum likelihood linear regression for speaker adaptation of continuous density hidden Markov models," in Computer Speech and Language, vol. 9, no. 2, pp. 171-185, 1995.
    • (1995) Computer Speech and Language , vol.9 , Issue.2 , pp. 171-185
    • Leggetter, C.J.; Woodland, P.C.
  • 5
    • 0032050110
    • Maximum likelihood linear transformations for HMM-based speech recognition
    • M.J.F. Gales, "Maximum likelihood linear transformations for HMM-based speech recognition," in Computer Speech and Language, vol. 12, no. 2, pp. 75-98, 1998.
    • (1998) Computer Speech and Language , vol.12 , Issue.2 , pp. 75-98
    • Gales, M.J.F.
  • 7
    • 33947669452
    • HSMM-based model adaptation algorithms for average-voice-based speech synthesis
    • J. Yamagishi, K. Ogata, Y. Nakano, J. Isogai, and T. Kobayashi, "HSMM-based model adaptation algorithms for average-voice-based speech synthesis," in Proc. of ICASSP, pp. 77-80, May 2006.
    • (2006) Proc. of ICASSP , pp. 77-80
    • Yamagishi, J.; Ogata, K.; Nakano, Y.; Isogai, J.; Kobayashi, T.
  • 8
    • 0142007308
    • A training method of average voice model for HMM-based speech synthesis
    • J. Yamagishi, M. Tamura, T. Masuko, K. Tokuda, and T. Kobayashi, "A training method of average voice model for HMM-based speech synthesis," in IEICE Trans. of Fundamentals, vol. E86-A, no. 8, pp. 1956-1963, 2003.
    • (2003) IEICE Trans. of Fundamentals , vol.E86-A , Issue.8 , pp. 1956-1963
    • Yamagishi, J.; Tamura, M.; Masuko, T.; Tokuda, K.; Kobayashi, T.
  • 9
    • 33846429403
    • Minimum generation error training for HMM-based speech synthesis
    • Y.-J. Wu and R.H. Wang, "Minimum generation error training for HMM-based speech synthesis," in Proc. of ICASSP, vol. 1, pp. 889-892, 2006.
    • (2006) Proc. of ICASSP , vol.1 , pp. 889-892
    • Wu, Y.J.; Wang, R.H.
  • 10
    • 84867214032
    • Minimum generation error training with direct log spectral distortion on LSPs for HMM-based speech synthesis
    • Y.-J. Wu and K. Tokuda, "Minimum generation error training with direct log spectral distortion on LSPs for HMM-based speech synthesis," in Proc. of Interspeech, pp. 577-580, 2008.
    • (2008) Proc. of Interspeech , pp. 577-580
    • Wu, Y.-J.; Tokuda, K.
  • 11
    • 0001810975
    • Line spectrum representation of linear predictive coefficients of speech signals
    • F. Itakura, "Line spectrum representation of linear predictive coefficients of speech signals," in J. Acoust. Soc. Amer., vol. 57, p. S35(A), 1975.
    • (1975) J. Acoust. Soc. Amer , vol.57
    • Itakura, F.
  • 12
    • 51449098031
    • Minimum generation error linear regression based model adaptation for HMM-based speech synthesis
    • L. Qin, Y.-J. Wu, Z.-H. Ling, R.-H. Wang, and L.-R. Dai, "Minimum generation error linear regression based model adaptation for HMM-based speech synthesis," in Proc. of ICASSP, pp. 3953-3956, Mar. 2008.
    • (2008) Proc. of ICASSP , pp. 3953-3956
    • Qin, L.; Wu, Y.-J.; Ling, Z.-H.; Wang, R.-H.; Dai, L.-R.
  • 13
    • 0000920843
    • A theory of adaptive pattern classifiers
    • S. Amari, "A theory of adaptive pattern classifiers," IEEE Trans. Electron. Comput., vol. EC-16, no. 3, pp. 299-307, 1967.
    • (1967) IEEE Trans. Electron. Comput , vol.EC-16 , Issue.3 , pp. 299-307
    • Amari, S.
  • 14
    • 70450163188
    • The CMU ARCTIC speech databases for speech synthesis research
    • J. Kominek and A. Black, "The CMU ARCTIC speech databases for speech synthesis research," Language Technologies Institute, Carnegie Mellon University, Pittsburgh, PA, Tech. Rep. CMU-LTI-03-177, http://festvox.org/cmu_arctic/, 2003.
  • 15
    • 0032673049
    • Restructuring speech representations using pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds
    • H. Kawahara, I. Masuda-Katsuse, and A. de Cheveigné, "Restructuring speech representations using pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds," in Speech Communication, vol. 27, pp. 187-207, 1999.
    • (1999) Speech Communication , vol.27 , pp. 187-207
    • Kawahara, H.; Masuda-Katsuse, I.; de Cheveigné, A.
  • 16
    • 0032678076
    • Hidden Markov models based on multi-space probability distribution for pitch pattern modeling
    • K. Tokuda, T. Masuko, N. Miyazaki, and T. Kobayashi, "Hidden Markov models based on multi-space probability distribution for pitch pattern modeling," in Proc. of ICASSP, pp. 229-232, 1999.
    • (1999) Proc. of ICASSP , pp. 229-232
    • Tokuda, K.; Masuko, T.; Miyazaki, N.; Kobayashi, T.
  • 17
    • 44949163704
    • Improving the performance of HMM-Based voice conversion using context clustering decision tree and appropriate regression matrix format
    • L. Qin, Y.-J. Wu, Z.H. Ling, and R.H. Wang, "Improving the performance of HMM-Based voice conversion using context clustering decision tree and appropriate regression matrix format," in Proc. of Interspeech, pp. 2250-2253, 2006.
    • (2006) Proc. of Interspeech , pp. 2250-2253
    • Qin, L.; Wu, Y.-J.; Ling, Z.H.; Wang, R.H.


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.