Volume E90-D, Issue 5, 2007, Pages 825-834

A hidden semi-Markov model-based speech synthesis system

Author keywords

Hidden Markov model; Hidden semi-Markov model; HMM-based speech synthesis

Indexed keywords

MARKOV PROCESSES; MAXIMUM LIKELIHOOD; PROBABILITY DENSITY FUNCTION; SPEECH; SPEECH SYNTHESIS

EID: 44449177634     PISSN: 09168532     EISSN: 17451361     Source Type: Journal    
DOI: 10.1093/ietisy/e90-d.5.825     Document Type: Article
Times cited: 204

References (36)
  • 1
    • 0029725605
    • Speech synthesis from HMMs using dynamic features
    • T. Masuko, K. Tokuda, T. Kobayashi, and S. Imai, "Speech synthesis from HMMs using dynamic features," Proc. ICASSP, pp. 389-392, 1996.
    • (1996) Proc. ICASSP , pp. 389-392
    • Masuko, T.1    Tokuda, K.2    Kobayashi, T.3    Imai, S.4
  • 2
    • 85009139544
    • Simultaneous modeling of spectrum, pitch and duration in HMM-based speech synthesis
    • T. Yoshimura, K. Tokuda, T. Masuko, T. Kobayashi, and T. Kitamura, "Simultaneous modeling of spectrum, pitch and duration in HMM-based speech synthesis," Proc. Eurospeech, pp. 2347-2350, 1999.
    • (1999) Proc. Eurospeech , pp. 2347-2350
    • Yoshimura, T.1    Tokuda, K.2    Masuko, T.3    Kobayashi, T.4    Kitamura, T.5
  • 3
    • 0030696416
    • Voice characteristics conversion for HMM-based speech synthesis system
    • T. Masuko, K. Tokuda, T. Kobayashi, and S. Imai, "Voice characteristics conversion for HMM-based speech synthesis system," Proc. ICASSP, pp. 1611-1614, 1997.
    • (1997) Proc. ICASSP , pp. 1611-1614
    • Masuko, T.1    Tokuda, K.2    Kobayashi, T.3    Imai, S.4
  • 4
    • 0034842740
    • Adaptation of pitch and spectrum for HMM-based speech synthesis using MLLR
    • M. Tamura, T. Masuko, K. Tokuda, and T. Kobayashi, "Adaptation of pitch and spectrum for HMM-based speech synthesis using MLLR," Proc. ICASSP, pp. 805-808, 2001.
    • (2001) Proc. ICASSP , pp. 805-808
    • Tamura, M.1    Masuko, T.2    Tokuda, K.3    Kobayashi, T.4
  • 7
    • 0005897608
    • Linguistic properties in the control of segmental duration for speech synthesis
    • N. Kaiki, K. Takeda, and Y. Sagisaka, "Linguistic properties in the control of segmental duration for speech synthesis," in Talking Machines: Theories, Models, and Designs, ed. G. Bailly and C. Benoit, pp. 255-263, Elsevier Science Publishers, 1992.
    • (1992) Talking Machines: Theories, Models, and Designs , pp. 255-263
    • Kaiki, N.1    Takeda, K.2    Sagisaka, Y.3
  • 8
    • 0002069313
    • Tree-based modelling of segmental duration
    • M. Riley, "Tree-based modelling of segmental duration," in Talking Machines: Theories, Models, and Designs, ed. G. Bailly and C. Benoit, pp. 265-273, Elsevier Science Publishers, 1992.
    • (1992) Talking Machines: Theories, Models, and Designs , pp. 265-273
    • Riley, M.1
  • 9
    • 0034226722
    • Statistical modelling of speech segment duration by constrained tree regression
    • N. Iwahashi and Y. Sagisaka, "Statistical modelling of speech segment duration by constrained tree regression," IEICE Trans. Inf. & Syst., vol. E83-D, no. 7, pp. 1550-1559, July 2000.
    • (2000) IEICE Trans. Inf. & Syst , vol.E83-D , Issue.7 , pp. 1550-1559
    • Iwahashi, N.1    Sagisaka, Y.2
  • 12
    • 33846442604
    • Investigation of state duration model based on gamma distribution for HMM-based speech synthesis
    • Y. Ishimatsu, K. Tokuda, T. Masuko, T. Kobayashi, and T. Kitamura, "Investigation of state duration model based on gamma distribution for HMM-based speech synthesis," IEICE Technical Report, SP2001-81, 2001.
    • IEICE Technical Report
    • Ishimatsu, Y.1    Tokuda, K.2    Masuko, T.3    Kobayashi, T.4    Kitamura, T.5
  • 13
    • 68249143901
    • A study on state duration modeling using lognormal distribution for HMM-based speech synthesis
    • J. Yamagishi, T. Masuko, and T. Kobayashi, "A study on state duration modeling using lognormal distribution for HMM-based speech synthesis," Proc. ASJ, pp. 225-226, March 2004.
    • (2004) Proc. ASJ , pp. 225-226
    • Yamagishi, J.1    Masuko, T.2    Kobayashi, T.3
  • 14
    • 0002629270
    • Maximum likelihood from incomplete data via the EM algorithm
    • A. Dempster, N. Laird, and D. Rubin, "Maximum likelihood from incomplete data via the EM algorithm," Journal of the Royal Statistical Society, Series B, vol. 39, pp. 1-38, 1977.
    • (1977) Journal of the Royal Statistical Society, Series B , vol.39 , pp. 1-38
    • Dempster, A.1    Laird, N.2    Rubin, D.3
  • 16
    • 0033708106
    • Speech parameter generation algorithms for HMM-based speech synthesis
    • K. Tokuda, T. Yoshimura, T. Masuko, T. Kobayashi, and T. Kitamura, "Speech parameter generation algorithms for HMM-based speech synthesis," Proc. ICASSP, pp. 1315-1318, 2000.
    • (2000) Proc. ICASSP , pp. 1315-1318
    • Tokuda, K.1    Yoshimura, T.2    Masuko, T.3    Kobayashi, T.4    Kitamura, T.5
  • 17
    • 0024610919
    • A tutorial on hidden Markov models and selected applications in speech recognition
    • L. Rabiner, "A tutorial on hidden Markov models and selected applications in speech recognition," Proc. IEEE, vol. 77, no. 2, pp. 257-285, 1989.
    • (1989) Proc. IEEE , vol.77 , Issue.2 , pp. 257-285
    • Rabiner, L.1
  • 20
    • 0022234383
    • Explicit modeling of state occupancy in hidden Markov models for automatic speech recognition
    • M. Russell and R. Moore, "Explicit modeling of state occupancy in hidden Markov models for automatic speech recognition," Proc. ICASSP, pp. 5-8, 1985.
    • (1985) Proc. ICASSP , pp. 5-8
    • Russell, M.1    Moore, R.2
  • 21
    • 0022685753
    • Continuously variable duration hidden Markov models for automatic speech recognition
    • S. Levinson, "Continuously variable duration hidden Markov models for automatic speech recognition," Comput. Speech Lang., vol. 1, pp. 29-45, 1986.
    • (1986) Comput. Speech Lang , vol.1 , pp. 29-45
    • Levinson, S.1
  • 24
    • 0025419316
    • Context-dependent phonetic hidden Markov models for speaker-independent continuous speech recognition
    • K. F. Lee, "Context-dependent phonetic hidden Markov models for speaker-independent continuous speech recognition," IEEE Trans. Acoust. Speech Signal Process., vol. 38, no. 4, pp. 599-609, 1990.
    • (1990) IEEE Trans. Acoust. Speech Signal Process , vol.38 , Issue.4 , pp. 599-609
    • Lee, K.F.1
  • 25
    • 0030715097
    • HMM topology design using maximum likelihood successive state splitting
    • M. Ostendorf and H. Singer, "HMM topology design using maximum likelihood successive state splitting," Comput. Speech Lang., vol. 11, no. 1, pp. 17-41, 1997.
    • (1997) Comput. Speech Lang , vol.11 , Issue.1 , pp. 17-41
    • Ostendorf, M.1    Singer, H.2
  • 26
    • 0027153655
    • Predicting unseen triphones with senones
    • M. Y. Hwang, X. Huang, and F. Alleva, "Predicting unseen triphones with senones," Proc. ICASSP, pp. 311-314, 1993.
    • (1993) Proc. ICASSP , pp. 311-314
    • Hwang, M.Y.1    Huang, X.2    Alleva, F.3
  • 28
    • 85027177249
    • Pitch pattern generation using multi-space probability distribution HMM
    • T. Masuko, K. Tokuda, N. Miyazaki, and T. Kobayashi, "Pitch pattern generation using multi-space probability distribution HMM," IEICE Trans. Inf. & Syst. (Japanese Edition), vol. J85-D-II, no. 7, pp. 1600-1609, July 2000.
  • 31
    • 84874199000
    • Aperiodicity extraction and control using mixed mode excitation and group delay manipulation for a high quality speech analysis, modification and synthesis system STRAIGHT
    • H. Kawahara, J. Estill, and O. Fujimura, "Aperiodicity extraction and control using mixed mode excitation and group delay manipulation for a high quality speech analysis, modification and synthesis system STRAIGHT," Proc. MAVEBA, pp. 13-15, 2001.
    • (2001) Proc. MAVEBA , pp. 13-15
    • Kawahara, H.1    Estill, J.2    Fujimura, O.3
  • 32
    • 33745200051
    • Speech parameter generation algorithm considering global variance for HMM-based speech synthesis
    • T. Toda and K. Tokuda, "Speech parameter generation algorithm considering global variance for HMM-based speech synthesis," Proc. Interspeech (Eurospeech), pp. 2801-2804, 2005.
    • (2005) Proc. Interspeech , pp. 2801-2804
    • Toda, T.1    Tokuda, K.2
  • 33
    • 33745215669
    • An overview of Nitech HMM-based speech synthesis system for Blizzard Challenge 2005
    • H. Zen and T. Toda, "An overview of Nitech HMM-based speech synthesis system for Blizzard Challenge 2005," Proc. Interspeech, pp. 93-96, 2005.
    • (2005) Proc. Interspeech , pp. 93-96
    • Zen, H.1    Toda, T.2
  • 35
    • 85135145174
    • Acoustic modeling based on the MDL criterion for speech recognition
    • K. Shinoda and T. Watanabe, "Acoustic modeling based on the MDL criterion for speech recognition," Proc. Eurospeech, pp. 99-102, 1997.
    • (1997) Proc. Eurospeech , pp. 99-102
    • Shinoda, K.1    Watanabe, T.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.