Volume 19, Issue 5, 2011, Pages 1071-1079

Continuous F0 Modeling for HMM Based Statistical Parametric Speech Synthesis

Author keywords

F0 modeling; hidden Markov model (HMM) based synthesis; statistical parametric speech synthesis; voicing classification

EID: 85008023596     PISSN: 15587916     EISSN: 15587924     Source Type: Journal    
DOI: 10.1109/TASL.2010.2076805     Document Type: Article
Times cited: 123

References (27)
  • 1
    • 85009139544
    • Simultaneous modeling of spectrum, pitch and duration in HMM-based speech synthesis
    • T. Yoshimura, K. Tokuda, T. Masuko, T. Kobayashi, and T. Kitamura, “Simultaneous modeling of spectrum, pitch and duration in HMM-based speech synthesis,” in Proc. Eurospeech, 1999, pp. 2347–2350.
    • (1999) Proc. Eurospeech , pp. 2347-2350
    • Yoshimura, T.1    Tokuda, K.2    Masuko, T.3    Kobayashi, T.4    Kitamura, T.5
  • 2
    • 0032673049
    • Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds
    • H. Kawahara, I. M. Katsuse, and A. D. Cheveigne, “Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds,” Speech Commun., vol. 27, no. 3–4, pp. 187–207, 1999.
    • (1999) Speech Commun. , vol.27 , Issue.3-4 , pp. 187-207
    • Kawahara, H.1    Katsuse, I.M.2    Cheveigne, A.D.3
  • 3
    • 84874199000
    • Aperiodicity extraction and control using mixed mode excitation and group delay manipulation for a high quality speech analysis, modification and synthesis system STRAIGHT
    • H. Kawahara, J. Estill, and O. Fujimura, “Aperiodicity extraction and control using mixed mode excitation and group delay manipulation for a high quality speech analysis, modification and synthesis system STRAIGHT,” in Proc. MAVEBA, 2001.
    • (2001) Proc. MAVEBA
    • Kawahara, H.1    Estill, J.2    Fujimura, O.3
  • 4
    • 0033708106
    • Speech parameter generation algorithms for HMM-based speech synthesis
    • K. Tokuda, T. Yoshimura, T. Masuko, T. Kobayashi, and T. Kitamura, “Speech parameter generation algorithms for HMM-based speech synthesis,” in Proc. ICASSP, 2000, pp. 1315–1318.
    • (2000) Proc. ICASSP , pp. 1315-1318
    • Tokuda, K.1    Yoshimura, T.2    Masuko, T.3    Kobayashi, T.4    Kitamura, T.5
  • 5
    • 0020596154
    • Cepstral analysis synthesis on the mel frequency scale
    • S. Imai, “Cepstral analysis synthesis on the mel frequency scale,” in Proc. ICASSP, 1983, pp. 93–96.
    • (1983) Proc. ICASSP , pp. 93-96
    • Imai, S.1
  • 6
    • 84928118106
    • Fixed point analysis of frequency to instantaneous frequency mapping for accurate estimation of F0 and periodicity
    • H. Kawahara, H. Katayose, A. D. Cheveigne, and R. D. Patterson, “Fixed point analysis of frequency to instantaneous frequency mapping for accurate estimation of F0 and periodicity,” in Proc. Eurospeech, 1999, pp. 2781–2784.
    • (1999) Proc. Eurospeech , pp. 2781-2784
    • Kawahara, H.1    Katayose, H.2    Cheveigne, A.D.3    Patterson, R.D.4
  • 8
    • 21244491419
    • A robust algorithm for pitch tracking (RAPT)
    • Amsterdam, The Netherlands: Elsevier
    • D. Talkin, “A robust algorithm for pitch tracking (RAPT),” in Speech Coding Synth., Amsterdam, The Netherlands: Elsevier, 1995, pp. 497–516.
    • (1995) Speech Coding Synth. , pp. 497-516
    • Talkin, D.1
  • 11
    • 0037567970
    • Pitch pattern generation using multi-space probability distribution HMM
    • T. Masuko, K. Tokuda, N. Miyazaki, and T. Kobayashi, “Pitch pattern generation using multi-space probability distribution HMM,” IEICE Trans., vol. J83-D-II, no. 7, pp. 1600–1609, 2000.
    • (2000) IEICE Trans. , vol.J83-D-II , Issue.7 , pp. 1600-1609
    • Masuko, T.1    Tokuda, K.2    Miyazaki, N.3    Kobayashi, T.4
  • 12
    • 0023869369
    • Lexical stress recognition using hidden Markov models
    • G. J. Freij and F. Fallside, “Lexical stress recognition using hidden Markov models,” in Proc. ICASSP, 1988, pp. 135–138.
    • (1988) Proc. ICASSP , pp. 135-138
    • Freij, G.J.1    Fallside, F.2
  • 13
    • 0028466266
    • Modelling intonation contours at the phrase level using continuous density hidden Markov models
    • U. Jensen, R. K. Moore, P. Dalsgaard, and B. Lindberg, “Modelling intonation contours at the phrase level using continuous density hidden Markov models,” Comput. Speech Lang., vol. 8, pp. 247–260, 1994.
    • (1994) Comput. Speech Lang. , vol.8 , pp. 247-260
    • Jensen, U.1    Moore, R.K.2    Dalsgaard, P.3    Lindberg, B.4
  • 14
    • 0032665603
    • A dynamical system model for generating fundamental frequency for speech synthesis
    • K. N. Ross and M. Ostendorf, “A dynamical system model for generating fundamental frequency for speech synthesis,” IEEE Trans. Speech Audio Process., vol. 7, no. 3, pp. 295–309, May 1999.
    • (1999) IEEE Trans. Speech Audio Process. , vol.7 , Issue.3 , pp. 295-309
    • Ross, K.N.1    Ostendorf, M.2
  • 17
    • 80051646062
    • A pitch pattern modeling technique using dynamic features on the border of voiced and unvoiced segments
    • H. Zen, K. Tokuda, T. Masuko, T. Kobayashi, and T. Kitamura, “A pitch pattern modeling technique using dynamic features on the border of voiced and unvoiced segments,” Tech. Rep. IEICE, vol. 101, no. 325, pp. 53–58, 2001.
    • (2001) Tech. Rep. IEICE , vol.101 , Issue.325 , pp. 53-58
    • Zen, H.1    Tokuda, K.2    Masuko, T.3    Kobayashi, T.4    Kitamura, T.5
  • 18
    • 0028996993
    • Speech parameter generation from HMM using dynamic features
    • K. Tokuda, T. Kobayashi, and S. Imai, “Speech parameter generation from HMM using dynamic features,” in Proc. ICASSP, 1995, pp. 660–663.
    • (1995) Proc. ICASSP , pp. 660-663
    • Tokuda, K.1    Kobayashi, T.2    Imai, S.3
  • 19
    • 85135145174
    • Acoustic modeling based on the MDL principle for speech recognition
    • K. Shinoda and T. Watanabe, “Acoustic modeling based on the MDL principle for speech recognition,” in Proc. Eurospeech, 1997, pp. 99–102.
    • (1997) Proc. Eurospeech , pp. 99-102
    • Shinoda, K.1    Watanabe, T.2
  • 20
    • 0011450865
    • A study on pitch pattern generation using HMMs based on multi-space probability distributions
    • N. Miyazaki, K. Tokuda, T. Masuko, and T. Kobayashi, “A study on pitch pattern generation using HMMs based on multi-space probability distributions,” Tech. Rep. IEICE, vol. SP98–12, 1998.
    • (1998) Tech. Rep. IEICE , vol.SP98-12
    • Miyazaki, N.1    Tokuda, K.2    Masuko, T.3    Kobayashi, T.4
  • 21
    • 51749120945
    • On the convergence of cubic interpolating splines
    • New York: Birkhauser
    • T. Lyche and L. L. Schumaker, “On the convergence of cubic interpolating splines,” in Spline Functions and Approximation Theory. New York: Birkhauser, 1973, pp. 169–189.
    • (1973) Spline Functions and Approximation Theory , pp. 169-189
    • Lyche, T.1    Schumaker, L.L.2
  • 22
    • 33646773080
    • CMU ARCTIC Databases for Speech Synthesis
    • Carnegie Mellon Univ., Pittsburgh, PA, Tech. Rep. CMU-LTI-03-177
    • J. Kominek and A. Black, CMU ARCTIC Databases for Speech Synthesis, Lang. Technol. Inst., School of Comput. Sci., Carnegie Mellon Univ., Pittsburgh, PA, 2003, Tech. Rep. CMU-LTI-03-177.
    • (2003) Lang. Technol. Inst., School of Comput. Sci.
    • Kominek, J.1    Black, A.2
  • 23
    • 85008004838
    • HMM-Based Speech Synthesis System (HTS)
    • HMM-Based Speech Synthesis System (HTS). [Online]. Available: http://hts.sp.nitech.ac.jp
  • 24
    • 33846405723
    • Details of the Nitech HMM-based speech synthesis system for the Blizzard Challenge 2005
    • H. Zen, T. Toda, M. Nakamura, and K. Tokuda, “Details of the Nitech HMM-based speech synthesis system for the Blizzard Challenge 2005,” IEICE Trans. Inf. Syst., vol. E90-D, no. 1, pp. 325–333, 2007.
    • (2007) IEICE Trans. Inf. Syst. , vol.E90-D , Issue.1 , pp. 325-333
    • Zen, H.1    Toda, T.2    Nakamura, M.3    Tokuda, K.4
  • 27
    • 38549096029
    • A speech parameter generation algorithm considering global variance for HMM-based speech synthesis
    • T. Toda and K. Tokuda, “A speech parameter generation algorithm considering global variance for HMM-based speech synthesis,” IEICE Trans. Inf. Syst., vol. E90-D, no. 5, pp. 816–824, 2007.
    • (2007) IEICE Trans. Inf. Syst. , vol.E90-D , Issue.5 , pp. 816-824
    • Toda, T.1    Tokuda, K.2


* This information was extracted by KISTI through analysis of Elsevier's SCOPUS database.