



Volume 52, Issue 5, 2010, Pages 394-404

Analysis of statistical parametric and unit selection speech synthesis systems applied to emotional speech

Author keywords

Emotional speech synthesis; HMM based synthesis; Unit selection

Indexed keywords

CONTEXT DEPENDENT; EMOTIONAL SPEECH; EMOTIONAL SPEECH SYNTHESIS; IDENTIFICATION RATES; PERCEPTUAL TEST; PROSODIC MODELING; SPECTRAL MODELING; SPEECH QUALITY; SYNTHETIC SPEECH; TWO-STATE; UNIT SELECTION; UNIT-SELECTION SPEECH SYNTHESIS;

EID: 77949913458     PISSN: 01676393     EISSN: None     Source Type: Journal    
DOI: 10.1016/j.specom.2009.12.007     Document Type: Article
Times cited: 63

References (44)
  • 1
    • 77949915957
    • Generacion de una voz sintetica en Castellano basada en HSMM para la Evaluacion Albayzin 2008: conversion texto a voz
    • (in Spanish)
    • Barra-Chicote R., Yamagishi J., Montero J., King S., Lutfi S., and Macias-Guarasa J. Generacion de una voz sintetica en Castellano basada en HSMM para la Evaluacion Albayzin 2008: conversion texto a voz. In V J. Tecnol. Habla (2008) 115-118 (in Spanish)
    • (2008) In V J. Tecnol. Habla, pp. 115-118
    • Barra-Chicote, R.; Yamagishi, J.; Montero, J.; King, S.; Lutfi, S.; Macias-Guarasa, J.
  • 6
    • 85006631929
    • Unit selection and emotional speech
    • Black, A.W., 2003. Unit selection and emotional speech. In: Proc. EUROSPEECH 2003, pp. 1649-1652.
    • (2003) Proc. EUROSPEECH 2003, pp. 1649-1652
    • Black, A.W.
  • 7
    • 84966398940
    • Optimising selection of units from speech database for concatenative synthesis
    • Black, A.W., Campbell, N., 1995. Optimising selection of units from speech database for concatenative synthesis. In: Proc. EUROSPEECH-95, pp. 581-584.
    • (1995) Proc. EUROSPEECH-95, pp. 581-584
    • Black, A.W.; Campbell, N.
  • 8
    • 33745216749
    • The Blizzard Challenge - 2005: Evaluating corpus-based speech synthesis on common datasets
    • Black, A.W., Tokuda, K., 2005. The Blizzard Challenge - 2005: evaluating corpus-based speech synthesis on common datasets. In: Proc. EUROSPEECH 2005, pp. 77-80.
    • (2005) Proc. EUROSPEECH 2005, pp. 77-80
    • Black, A.W.; Tokuda, K.
  • 9
    • 85009247888
    • Expressive speech synthesis using a concatenative synthesizer
    • Bulut, M., Narayanan, S., Syrdal, A., 2002. Expressive speech synthesis using a concatenative synthesizer. In: Proc. ICSLP 2002, pp. 1265-1268.
    • (2002) Proc. ICSLP 2002, pp. 1265-1268
    • Bulut, M.; Narayanan, S.; Syrdal, A.
  • 14
    • 34047123652
    • Multisyn: Open-domain unit selection for the Festival speech synthesis system
    • Clark R.A., Richmond K., and King S. Multisyn: Open-domain unit selection for the Festival speech synthesis system. Speech Comm. 49 4 (2007) 317-330
    • (2007) Speech Comm., vol. 49, Issue 4, pp. 317-330
    • Clark, R.A.; Richmond, K.; King, S.
  • 15
    • 0032651722
    • A hidden Markov-model-based trainable speech synthesizer
    • Donovan R., and Woodland P. A hidden Markov-model-based trainable speech synthesizer. Comput. Speech Lang. 13 3 (1999) 223-241
    • (1999) Comput. Speech Lang., vol. 13, Issue 3, pp. 223-241
    • Donovan, R.; Woodland, P.
  • 16
    • 84966356293
    • Preservation, identification, and use of emotion in a text-to-speech system
    • Eide, E., 2002. Preservation, identification, and use of emotion in a text-to-speech system. In: Proc. IEEE Workshop on Speech Synthesis, pp. 127-130.
    • (2002) Proc. IEEE Workshop on Speech Synthesis, pp. 127-130
    • Eide, E.
  • 19
    • 35048829796
    • The challenge of generating spatially balanced scientific experiment designs
    • Gomes, C., Sellmann, M., Es, C.V., Es, H.V., 2004. The challenge of generating spatially balanced scientific experiment designs. In: CP-AI-OR'04, pp. 387-394.
  • 21
    • 33745202619
    • Informed blending of databases for emotional speech synthesis
    • Hofer, G., Richmond, K., Clark, R., 2005. Informed blending of databases for emotional speech synthesis. In: Proc. Interspeech 2005, pp. 501-504.
    • (2005) Proc. Interspeech, pp. 501-504
    • Hofer, G.; Richmond, K.; Clark, R.
  • 22
    • 0029765811
    • Unit selection in a concatenative speech synthesis system using a large speech database
    • Hunt, A., Black, A.W., 1996. Unit selection in a concatenative speech synthesis system using a large speech database. In: Proc. ICASSP-96, pp. 373-376.
    • (1996) Proc. ICASSP-96, pp. 373-376
    • Hunt, A.; Black, A.W.
  • 24
    • 0032673049
    • Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: possible role of a repetitive structure in sounds
    • Kawahara H., Masuda-Katsuse I., and Cheveigné A. Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: possible role of a repetitive structure in sounds. Speech Comm. 27 (1999) 187-207
    • (1999) Speech Comm., vol. 27, pp. 187-207
    • Kawahara, H.; Masuda-Katsuse, I.; Cheveigné, A.
  • 26
    • 0025543906
    • Pitch-synchronous waveform processing techniques for text-to-speech synthesis using diphones
    • Moulines E., and Charpentier F. Pitch-synchronous waveform processing techniques for text-to-speech synthesis using diphones. Speech Comm. 9 5-6 (1990) 453-468
    • (1990) Speech Comm., vol. 9, Issue 5-6, pp. 453-468
    • Moulines, E.; Charpentier, F.
  • 27
    • 51449114529
    • A style control technique for HMM-based expressive speech synthesis
    • Nose T., Yamagishi J., and Kobayashi T. A style control technique for HMM-based expressive speech synthesis. IEICE Trans. Inf. Systems E 90-D 9 (2007) 1406-1413
    • (2007) IEICE Trans. Inf. Systems E, vol. 90-D, Issue 9, pp. 1406-1413
    • Nose, T.; Yamagishi, J.; Kobayashi, T.
  • 29
    • 0037384712
    • Vocal communication of emotion: a review of research paradigms
    • Scherer K.R. Vocal communication of emotion: a review of research paradigms. Speech Comm. 40 1-2 (2003) 227-256
    • (2003) Speech Comm., vol. 40, Issue 1-2, pp. 227-256
    • Scherer, K.R.
  • 30
    • 84971539709
    • Emotional speech synthesis: A review
    • Schröder, M., 2001. Emotional speech synthesis: a review. In: Proc. EUROSPEECH 2001, pp. 561-564.
    • (2001) Proc. EUROSPEECH 2001, pp. 561-564
    • Schröder, M.
  • 32
    • 84867199052
    • Investigating Festival's target cost function using perceptual experiments
    • Strom, V., King, S., 2008. Investigating Festival's target cost function using perceptual experiments. In: Proc. Interspeech 2008, pp. 1873-1876.
    • (2008) Proc. Interspeech, pp. 1873-1876
    • Strom, V.; King, S.
  • 34
    • 29144475179
    • Speech synthesis with various emotional expressions and speaking styles by style interpolation and morphing
    • Tachibana M., Yamagishi J., Masuko T., and Kobayashi T. Speech synthesis with various emotional expressions and speaking styles by style interpolation and morphing. IEICE Trans. Inf. Systems E 88-D 11 (2005) 2484-2491
    • (2005) IEICE Trans. Inf. Systems E, vol. 88-D, Issue 11, pp. 2484-2491
    • Tachibana, M.; Yamagishi, J.; Masuko, T.; Kobayashi, T.
  • 35
    • 33645768204
    • A style adaptation technique for speech synthesis using HSMM and suprasegmental features
    • Tachibana M., Yamagishi J., Masuko T., and Kobayashi T. A style adaptation technique for speech synthesis using HSMM and suprasegmental features. IEICE Trans. Inf. Systems E 89-D 3 (2006) 1092-1099
    • (2006) IEICE Trans. Inf. Systems E, vol. 89-D, Issue 3, pp. 1092-1099
    • Tachibana, M.; Yamagishi, J.; Masuko, T.; Kobayashi, T.
  • 36
    • 38549096029
    • A speech parameter generation algorithm considering global variance for HMM-based speech synthesis
    • Toda T., and Tokuda K. A speech parameter generation algorithm considering global variance for HMM-based speech synthesis. IEICE Trans. Inf. Systems E 90-D 5 (2007) 816-824
    • (2007) IEICE Trans. Inf. Systems E, vol. 90-D, Issue 5, pp. 816-824
    • Toda, T.; Tokuda, K.
  • 37
    • 77949917834
    • The HMM-based speech synthesis system (HTS) Version 2.1
    • Tokuda, K., Zen, H., Yamagishi, J., Masuko, T., Sako, S., Black, A., Nose, T., 2008. The HMM-based speech synthesis system (HTS) Version 2.1.
  • 38
    • 70449126171
    • The HTS-2008 system: Yet another evaluation of the speaker-adaptive HMM-based speech synthesis system in the 2008 Blizzard Challenge
    • Yamagishi, J., Zen, H., Wu, Y.-J., Toda, T., Tokuda, K., 2008. The HTS-2008 system: yet another evaluation of the speaker-adaptive HMM-based speech synthesis system in the 2008 Blizzard Challenge. In: Proc. Blizzard Challenge 2008.
    • (2008) Proc. Blizzard Challenge
    • Yamagishi, J.; Zen, H.; Wu, Y.-J.; Toda, T.; Tokuda, K.
  • 40
    • 85009139544
    • Simultaneous modeling of spectrum, pitch and duration in HMM-based speech synthesis
    • Yoshimura, T., Tokuda, K., Masuko, T., Kobayashi, T., Kitamura, T., 1999. Simultaneous modeling of spectrum, pitch and duration in HMM-based speech synthesis. In: Proc. EUROSPEECH-99, pp. 2347-2350.
    • (1999) Proc. EUROSPEECH-99, pp. 2347-2350
    • Yoshimura, T.; Tokuda, K.; Masuko, T.; Kobayashi, T.; Kitamura, T.
  • 41
    • 7044242284
    • Simultaneous modeling of spectrum, pitch and duration in HMM-based speech synthesis
    • (in Japanese)
    • Yoshimura T., Tokuda K., Masuko T., Kobayashi T., and Kitamura T. Simultaneous modeling of spectrum, pitch and duration in HMM-based speech synthesis. IEICE Trans. J. 83-D-II 11 (2000) 2099-2107 (in Japanese)
    • (2000) IEICE Trans. J., vol. 83-D-II, Issue 11, pp. 2099-2107
    • Yoshimura, T.; Tokuda, K.; Masuko, T.; Kobayashi, T.; Kitamura, T.
  • 42
    • 33846405723
    • Details of Nitech HMM based speech synthesis system for the Blizzard Challenge 2005
    • Zen H., Toda T., Nakamura M., and Tokuda K. Details of Nitech HMM based speech synthesis system for the Blizzard Challenge 2005. IEICE Trans. Inf. Systems E 90-D 1 (2007) 325-333
    • (2007) IEICE Trans. Inf. Systems E, vol. 90-D, Issue 1, pp. 325-333
    • Zen, H.; Toda, T.; Nakamura, M.; Tokuda, K.
  • 44
    • 67651002140
    • Statistical parametric speech synthesis
    • Zen H., Tokuda K., and Black A.W. Statistical parametric speech synthesis. Speech Comm. 51 11 (2009) 1039-1064
    • (2009) Speech Comm., vol. 51, Issue 11, pp. 1039-1064
    • Zen, H.; Tokuda, K.; Black, A.W.


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.