Volume 8, Issue 2, 2014, Pages 239-250

Parameter generation methods with rich context models for high-quality and flexible text-to-speech synthesis

Author keywords

GMM; HMM based speech synthesis; over smoothing; parameter generation; rich context model

Indexed keywords

CONTEXT MODELING; GMM; HMM-BASED SPEECH SYNTHESIS; OVER-SMOOTHING; PARAMETER GENERATION;

EID: 84897862522     PISSN: 19324553     EISSN: None     Source Type: Journal    
DOI: 10.1109/JSTSP.2013.2288599     Document Type: Article
Times cited: 15

References (28)
  • 2
    • 0027699809
    • Speech segment selection for concatenative synthesis based on spectral distortion minimization
    • N. Iwahashi, N. Kaiki, and Y. Sagisaka, "Speech segment selection for concatenative synthesis based on spectral distortion minimization," IEICE Trans., Fundamentals, vol. E76-A, no. 11, pp. 1942-1948, 1993
    • (1993) IEICE Trans., Fundamentals , vol.76 , Issue.11 , pp. 1942-1948
    • Iwahashi, N.1    Kaiki, N.2    Sagisaka, Y.3
  • 3
    • 0029765811
    • Unit selection in a concatenative speech synthesis system using a large speech database
    • A. J. Hunt and A. Black, "Unit selection in a concatenative speech synthesis system using a large speech database," in Proc. ICASSP, Atlanta, GA, USA, May 1996, pp. 373-376
    • (1996) Proc. ICASSP, Atlanta, GA, USA , pp. 373-376
    • Hunt, A.J.1    Black, A.2
  • 5
    • 67651002140
    • Statistical parametric speech synthesis
    • H. Zen, K. Tokuda, and A. Black, "Statistical parametric speech synthesis," Speech Commun., vol. 51, no. 11, pp. 1039-1064, 2009
    • (2009) Speech Commun , vol.51 , Issue.11 , pp. 1039-1064
    • Zen, H.1    Tokuda, K.2    Black, A.3
  • 7
    • 33847129573
    • Average-voice-based speech synthesis using HSMM-based speaker adaptation and adaptive training
    • DOI 10.1093/ietisy/e90-d.2.533
    • J. Yamagishi and T. Kobayashi, "Average-voice-based speech synthesis using HSMM-based speaker adaptation and adaptive training," IEICE Trans., Inf. Syst., vol. E90-D, no. 2, pp. 533-543, 2007 (Pubitemid 46279829)
    • (2007) IEICE Transactions on Information and Systems , vol.E90-D , Issue.2 , pp. 533-543
    • Yamagishi, J.1    Kobayashi, T.2
  • 8
    • 51449114529
    • A style control technique for HMM-based expressive speech synthesis
    • T. Nose, J. Yamagishi, T. Masuko, and T. Kobayashi, "A style control technique for HMM-based expressive speech synthesis," IEICE Trans., Inf. Syst., vol. E90-D, no. 9, pp. 1406-1413, 2007
    • (2007) IEICE Trans., Inf. Syst , vol.90 , Issue.9 , pp. 1406-1413
    • Nose, T.1    Yamagishi, J.2    Masuko, T.3    Kobayashi, T.4
  • 9
    • 38549096029
    • A speech parameter generation algorithm considering global variance for HMM-based speech synthesis
    • T. Toda and K. Tokuda, "A speech parameter generation algorithm considering global variance for HMM-based speech synthesis," IEICE Trans., vol. E90-D, no. 5, pp. 816-824, 2007
    • (2007) IEICE Trans , vol.90 , Issue.5 , pp. 816-824
    • Toda, T.1    Tokuda, K.2
  • 12
    • 70450161678
    • Rich context modeling for high quality HMM-based TTS
    • Z. Yan, Q. Yao, and S. K. Frank, "Rich context modeling for high quality HMM-based TTS," in Proc. INTERSPEECH, Brighton, U.K., Sep. 2009, pp. 1755-1758
    • (2009) Proc. INTERSPEECH, Brighton, U.K , pp. 1755-1758
    • Yan, Z.1    Yao, Q.2    Frank, S.K.3
  • 14
    • 4544270859
    • Optimizing sub-cost functions for segment selection based on perceptual evaluations in concatenative speech synthesis
    • T. Toda, H. Kawai, and M. Tsuzaki, "Optimizing sub-cost functions for segment selection based on perceptual evaluations in concatenative speech synthesis," in Proc. ICASSP, Montreal, QC, Canada, May 2004, pp. 657-660
    • (2004) Proc. ICASSP, Montreal, QC, Canada , pp. 657-660
    • Toda, T.1    Kawai, H.2    Tsuzaki, M.3
  • 20
  • 21
    • 29144484191
    • Concatenative speech synthesis based on the plural unit selection and fusion method
    • DOI 10.1093/ietisy/e88-d.11.2565
    • T. Mizutani and T. Kagoshima, "Concatenative speech synthesis based on the plural unit selection and fusion method," IEICE Trans. Inf. Syst., vol. E88-D, no. 11, pp. 2565-2572, 2005 (Pubitemid 41816802)
    • (2005) IEICE Transactions on Information and Systems , vol.E88-D , Issue.11 , pp. 2565-2572
    • Mizutani, T.1    Kagoshima, T.2
  • 22
    • 0029288633
    • Maximum likelihood linear regression for speaker adaptation of continuous density hidden Markov models
    • C. J. Leggetter and P. C. Woodland, "Maximum likelihood linear regression for speaker adaptation of continuous density hidden Markov models," Comput. Speech Lang., vol. 9, pp. 171-185, 1995
    • (1995) Comput. Speech Lang , vol.9 , pp. 171-185
    • Leggetter, C.J.1    Woodland, P.C.2
  • 23
    • 35549000218
    • Cross-validation and aggregated EM training for robust parameter estimation
    • DOI 10.1016/j.csl.2007.07.005, PII S0885230807000472
    • T. Shinozaki and M. Ostendorf, "Cross-validation and aggregated EM training for robust parameter estimation," Comput. Speech Lang., vol. 22, pp. 185-195, 2008 (Pubitemid 350016715)
    • (2008) Computer Speech and Language , vol.22 , Issue.2 , pp. 185-195
    • Shinozaki, T.1    Ostendorf, M.2
  • 24
    • 44449177634
    • Hidden semi-Markov model based speech synthesis system
    • H. Zen, K. Tokuda, T. K. T. Masuko, and T. Kitamura, "Hidden semi-Markov model based speech synthesis system," IEICE Trans., Inf. Syst., vol. E90-D, no. 5, pp. 825-834, 2007
    • (2007) IEICE Trans., Inf. Syst , vol.90 , Issue.5 , pp. 825-834
    • Zen, H.1    Tokuda, K.2    Masuko, T.K.T.3    Kitamura, T.4
  • 26
    • 84874199000
    • Aperiodicity extraction and control using mixed mode excitation and group delay manipulation for a high quality speech analysis, modification and synthesis system STRAIGHT
    • H. Kawahara, J. Estill, and O. Fujimura, "Aperiodicity extraction and control using mixed mode excitation and group delay manipulation for a high quality speech analysis, modification and synthesis system STRAIGHT," in Proc. MAVEBA '01, Florence, Italy, Sep. 2001, pp. 1-6
    • (2001) Proc. MAVEBA '01, Florence, Italy , pp. 1-6
    • Kawahara, H.1    Estill, J.2    Fujimura, O.3
  • 27
  • 28
    • 0032673049
    • Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds
    • H. Kawahara, I. Masuda-Katsuse, and A. D. Cheveigne, "Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds," Speech Commun., vol. 27, no. 3-4, pp. 187-207, 1999.
    • (1999) Speech Commun , vol.27 , Issue.3-4 , pp. 187-207
    • Kawahara, H.1    Masuda-Katsuse, I.2    Cheveigne, A.D.3


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.