



Volume 28, Issue 5, 2014, Pages 1209-1232

On the impact of excitation and spectral parameters for expressive statistical parametric speech synthesis

Author keywords

Expressive speech synthesis; Speech parameterization; Speech synthesis; Statistical parametric speech synthesis

Indexed keywords

SPEECH PROCESSING; SPEECH SYNTHESIS

EID: 84902548006; PISSN: 08852308; EISSN: 10958363; Source Type: Journal
DOI: 10.1016/j.csl.2013.10.001; Document Type: Article
Times cited: 4

References (45)
  • 1
    • 33644694381
    • Emotions in vowel segments of continuous speech: Analysis of the glottal flow using the normalised amplitude quotient
    • DOI 10.1159/000091405
    • M. Airas, and P. Alku Emotions in vowel segments of continuous speech: analysis of the glottal flow using the normalized amplitude quotient Phonetica 63 1 2006 26 46 (Pubitemid 43333524)
    • (2006) Phonetica , vol.63 , Issue.1 , pp. 26-46
    • Airas, M.1    Alku, P.2
  • 2
    • 70450163450
    • Comparison of multiple voice source parameters in different phonation types
    • M. Airas, and P. Alku Comparison of multiple voice source parameters in different phonation types Proc. of Interspeech 2007 1410 1413
    • (2007) Proc. of Interspeech , pp. 1410-1413
    • Airas, M.1    Alku, P.2
  • 3
    • 0036339929
    • Normalized amplitude quotient for parametrization of the glottal flow
    • DOI 10.1121/1.1490365
    • P. Alku, T. Backstrom, and E. Vilkman Normalized amplitude quotient for parametrization of the glottal flow Journal of the Acoustical Society of America 112 August (2) 2002 701 710 (Pubitemid 34855925)
    • (2002) Journal of the Acoustical Society of America , vol.112 , Issue.2 , pp. 701-710
    • Alku, P.1    Backstrom, T.2    Vilkman, E.3
  • 8
    • 79955528226
    • Causal-anticausal decomposition of speech using complex cepstrum for glottal source estimation
    • T. Drugman, B. Bozkurt, and T. Dutoit Causal-anticausal decomposition of speech using complex cepstrum for glottal source estimation Speech Communication 53 2011 855 866
    • (2011) Speech Communication , vol.53 , pp. 855-866
    • Drugman, T.1    Bozkurt, B.2    Dutoit, T.3
  • 9
    • 70450204573
    • A deterministic plus stochastic model of residual signal for improved parametric speech synthesis
    • T. Drugman, G. Wilfart, and T. Dutoit A deterministic plus stochastic model of residual signal for improved parametric speech synthesis Proc. of Interspeech 2009 1779 1782
    • (2009) Proc. of Interspeech , pp. 1779-1782
    • Drugman, T.1    Wilfart, G.2    Dutoit, T.3
  • 11
    • 33947684811
    • A four-parameter model of the glottal flow
    • G. Fant, J. Liljencrants, and Q. Lin A four-parameter model of the glottal flow STL-QPSR 26 4 1985 001 013
    • (1985) STL-QPSR , vol.26 , Issue.4 , pp. 001-013
    • Fant, G.1    Liljencrants, J.2    Lin, Q.3
  • 13
    • 85016140477
    • An adaptive algorithm for mel-cepstral analysis of speech
    • T. Fukada, K. Tokuda, T. Kobayashi, and S. Imai An adaptive algorithm for mel-cepstral analysis of speech Proc. of ICASSP 1992 137 140
    • (1992) Proc. of ICASSP , pp. 137-140
    • Fukada, T.1    Tokuda, K.2    Kobayashi, T.3    Imai, S.4
  • 14
    • 0032050110
    • Maximum likelihood linear transformations for HMM-based speech recognition
    • M.J.F. Gales Maximum likelihood linear transformations for HMM-based speech recognition Computer Speech and Language 12 April (2) 1998 75 98 (Pubitemid 128383747)
    • (1998) Computer Speech and Language , vol.12 , Issue.2 , pp. 75-98
    • Gales, M.J.F.1
  • 16
    • 0035472456
    • Pitch-scaled estimation of simultaneous voiced and turbulence-noise components in speech
    • DOI 10.1109/89.952489, PII S1063667601082335
    • P.J.B. Jackson, and C.H. Shadle Pitch-scaled estimation of simultaneous voiced and turbulence-noise components in speech IEEE Transactions on Speech and Audio Processing 9 October (7) 2001 713 726 (Pubitemid 32992835)
    • (2001) IEEE Transactions on Speech and Audio Processing , vol.9 , Issue.7 , pp. 713-726
    • Jackson, P.J.B.1    Shadle, C.H.2
  • 18
    • 84874199000
    • Aperiodicity extraction and control using mixed mode excitation and group delay manipulation for a high quality speech analysis, modification and synthesis system STRAIGHT
    • H. Kawahara, J. Estill, and O. Fujimura Aperiodicity extraction and control using mixed mode excitation and group delay manipulation for a high quality speech analysis, modification and synthesis system STRAIGHT Proc. of MAVEBA 2001 13 18
    • (2001) Proc. of MAVEBA , pp. 13-18
    • Kawahara, H.1    Estill, J.2    Fujimura, O.3
  • 21
    • 84878387086
    • Analysis on the importance of short-term speech parameterizations for emotional statistical parametric speech synthesis
    • R. Maia, and M. Akamine Analysis on the importance of short-term speech parameterizations for emotional statistical parametric speech synthesis Proc. of Interspeech 2012
    • (2012) Proc. of Interspeech
    • Maia, R.1    Akamine, M.2
  • 22
    • 84876205258
    • Complex cepstrum for statistical parametric speech synthesis
    • June 5
    • R. Maia, M. Akamine, and M. Gales Complex cepstrum for statistical parametric speech synthesis Speech Communication 55 June (5) 2013 606 618
    • (2013) Speech Communication , vol.55 , pp. 606-618
    • Maia, R.1    Akamine, M.2    Gales, M.3
  • 23
    • 84867616957
    • Complex cepstrum as phase information for statistical parametric speech synthesis
    • R. Maia, M. Akamine, and M.J.F. Gales Complex cepstrum as phase information for statistical parametric speech synthesis Proc. of ICASSP 2012 4581 4584
    • (2012) Proc. of ICASSP , pp. 4581-4584
    • Maia, R.1    Akamine, M.2    Gales, M.J.F.3
  • 24
    • 84906246236
    • Minimum mean squared error based warped complex cepstrum analysis for statistical parametric speech synthesis
    • (in press)
    • R. Maia, M. Gales, Y. Stylianou, and M. Akamine Minimum mean squared error based warped complex cepstrum analysis for statistical parametric speech synthesis Proc. of Interspeech 2013 (in press)
    • (2013) Proc. of Interspeech
    • Maia, R.1    Gales, M.2    Stylianou, Y.3    Akamine, M.4
  • 28
    • 0001052406
    • Discrete representation of signals
    • June 6
    • A.V. Oppenheim, and D.H. Johnson Discrete representation of signals Proceedings of IEEE 60 June (6) 1972 681 691
    • (1972) Proceedings of IEEE , vol.60 , pp. 681-691
    • Oppenheim, A.V.1    Johnson, D.H.2
  • 30
    • 0032595183
    • Modeling of the glottal flow derivative waveform with application to speaker identification
    • September 5
    • M.D. Plumpe, T.F. Quatieri, and D.A. Reynolds Modeling of the glottal flow derivative waveform with application to speaker identification IEEE Transactions on Speech and Audio Processing 7 September (5) 1999 569 586
    • (1999) IEEE Transactions on Speech and Audio Processing , vol.7 , pp. 569-586
    • Plumpe, M.D.1    Quatieri, T.F.2    Reynolds, D.A.3
  • 35
    • 0029209272
    • Robust text-independent speaker identification using Gaussian mixture speaker models
    • January 1
    • D.A. Reynolds, and R.C. Rose Robust text-independent speaker identification using Gaussian mixture speaker models IEEE Transactions on Speech and Audio Processing 3 January (1) 1995 72 83
    • (1995) IEEE Transactions on Speech and Audio Processing , vol.3 , pp. 72-83
    • Reynolds, D.A.1    Rose, R.C.2
  • 36
    • 79959855615
    • Cluster analysis of differential spectral envelopes on emotional speech
    • G. Salvi, F. Tesser, E. Zovato, and P. Cosi Cluster analysis of differential spectral envelopes on emotional speech Proc. of Interspeech 2010 322 325
    • (2010) Proc. of Interspeech , pp. 322-325
    • Salvi, G.1    Tesser, F.2    Zovato, E.3    Cosi, P.4
  • 37
    • 84865709194
    • Clustering expressive speech styles in audiobooks using glottal source parameters
    • E. Székely, J.P. Cabral, P. Cahill, and J. Carson-Berndsen Clustering expressive speech styles in audiobooks using glottal source parameters Proc. of Interspeech 2011 2409 2412
    • (2011) Proc. of Interspeech , pp. 2409-2412
    • Székely, E.1    Cabral, J.P.2    Cahill, P.3    Carson-Berndsen, J.4
  • 39
    • 85131821539
    • Mel-generalized cepstral analysis - A unified approach to speech spectral estimation
    • K. Tokuda, T. Kobayashi, T. Masuko, and S. Imai Mel-generalized cepstral analysis - a unified approach to speech spectral estimation Proc. of ICSLP 1994 1043 1046
    • (1994) Proc. of ICSLP , pp. 1043-1046
    • Tokuda, K.1    Kobayashi, T.2    Masuko, T.3    Imai, S.4
  • 42
    • 79955538498
    • Context adaptive training with factorized decision trees for HMM-based statistical parametric speech synthesis
    • July 6
    • K. Yu, H. Zen, F. Mairesse, and S. Young Context adaptive training with factorized decision trees for HMM-based statistical parametric speech synthesis Speech Communication 53 July (6) 2011 914 923
    • (2011) Speech Communication , vol.53 , pp. 914-923
    • Yu, K.1    Zen, H.2    Mairesse, F.3    Young, S.4
  • 44
    • 33846405723
    • Details of the Nitech HMM-based speech synthesis for Blizzard Challenge 2005
    • January 1
    • H. Zen, T. Toda, M. Nakamura, and K. Tokuda Details of the Nitech HMM-based speech synthesis for Blizzard Challenge 2005 IEICE Transactions on Information and Systems E90-D January (1) 2005 325 333
    • (2005) IEICE Transactions on Information and Systems , vol.E90-D , pp. 325-333
    • Zen, H.1    Toda, T.2    Nakamura, M.3    Tokuda, K.4
  • 45
    • 67651002140
    • Statistical parametric speech synthesis
    • November 11
    • H. Zen, K. Tokuda, and A. Black Statistical parametric speech synthesis Speech Communication 51 November (11) 2009 1039 1064
    • (2009) Speech Communication , vol.51 , pp. 1039-1064
    • Zen, H.1    Tokuda, K.2    Black, A.3


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.