2013, Pages 7830-7834

Comparing glottal-flow-excited statistical parametric speech synthesis methods

Author keywords

excitation glottal flow; principal component analysis; pulse library; Statistical parametric speech synthesis

Indexed keywords

EXCITATION METHODS; GLOTTAL FLOW; PRINCIPAL COMPONENTS; STATE OF THE ART; STATISTICAL PARAMETRIC SPEECH SYNTHESIS; SUBJECTIVE LISTENING TEST; SYNTHETIC SPEECH;

EID: 84890462419     PISSN: 15206149     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1109/ICASSP.2013.6639188     Document Type: Conference Paper
Times cited : (21)

References (34)
  • 1
    • 67651002140
    • Statistical parametric speech synthesis
    • H. Zen, K. Tokuda, and A. Black, "Statistical parametric speech synthesis," Speech Communication, vol. 51, no. 11, pp. 1039-1064, 2009.
    • (2009) Speech Communication , vol.51 , Issue.11 , pp. 1039-1064
    • Zen, H.1    Tokuda, K.2    Black, A.3
  • 2
    • 0002884330
    • The government standard linear predictive coding algorithm: LPC-10
    • T. E. Tremain, "The government standard linear predictive coding algorithm: LPC-10," Speech Technology, vol. 1, pp. 40-49, Apr. 1982.
    • (1982) Speech Technology , vol.1 , pp. 40-49
    • Tremain, T.E.1
  • 4
    • 33846406459
    • Two-band excitation for HMM-based speech synthesis
    • S. J. Kim and M. Hahn, "Two-band excitation for HMM-based speech synthesis," IEICE Trans. Inf. & Syst., vol. E90-D, 2007.
    • (2007) IEICE Trans. Inf. & Syst. , vol.E90-D
    • Kim, S.J.1    Hahn, M.2
  • 5
    • 0032673049
    • Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds
    • H. Kawahara, I. Masuda-Katsuse, and A. de Cheveigne, "Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: possible role of a repetitive structure in sounds," Speech Commun., vol. 27, Apr. 1999.
    • (1999) Speech Commun. , vol.27
    • Kawahara, H.1    Masuda-Katsuse, I.2    De Cheveigne, A.3
  • 7
    • 78649297510
    • An excitation model for HMM-based speech synthesis based on residual modeling
    • R. Maia, T. Toda, H. Zen, Y. Nankaku, and K. Tokuda, "An excitation model for HMM-based speech synthesis based on residual modeling," in SSW6, Aug. 2007.
    • (2007) SSW6
    • Maia, R.1    Toda, T.2    Zen, H.3    Nankaku, Y.4    Tokuda, K.5
  • 8
    • 84865797109
    • Statistical parametric speech synthesis with joint estimation of acoustic and excitation model parameters
    • R. Maia, H. Zen, and M. J. F. Gales, "Statistical parametric speech synthesis with joint estimation of acoustic and excitation model parameters," in SSW7, Sep. 2010, pp. 88-93.
    • (2010) SSW7 , pp. 88-93
    • Maia, R.1    Zen, H.2    Gales, M.J.F.3
  • 9
    • 84878525228
    • An overview of Nitech HMM-based speech synthesis system for Blizzard Challenge 2005
    • H. Zen and T. Toda, "An overview of Nitech HMM-based speech synthesis system for Blizzard Challenge 2005," in The Blizzard Challenge 2005 workshop, 2005, http://festvox.org/blizzard.
    • (2005) The Blizzard Challenge 2005 Workshop
    • Zen, H.1    Toda, T.2
  • 10
    • 82155160991
    • Towards an improved modeling of the glottal source in statistical parametric speech synthesis
    • J. Cabral, S. Renals, K. Richmond, and J. Yamagishi, "Towards an improved modeling of the glottal source in statistical parametric speech synthesis," in SSW6, 2007, pp. 113-118.
    • (2007) SSW6 , pp. 113-118
    • Cabral, J.1    Renals, S.2    Richmond, K.3    Yamagishi, J.4
  • 11
    • 84867224654
    • Glottal spectral separation for parametric speech synthesis
    • J. Cabral, S. Renals, K. Richmond, and J. Yamagishi, "Glottal spectral separation for parametric speech synthesis," in Proc. Interspeech, 2008, pp. 1829-1832.
    • (2008) Proc. Interspeech , pp. 1829-1832
    • Cabral, J.1    Renals, S.2    Richmond, K.3    Yamagishi, J.4
  • 12
    • 0015699693
    • The influence of glottal waveform on the naturalness of speech from a parallel formant synthesizer
    • J. Holmes, "The influence of glottal waveform on the naturalness of speech from a parallel formant synthesizer," IEEE Trans. Audio and Electroacoustics, vol. 21, no. 3, pp. 298-305, Jun. 1973.
    • (1973) IEEE Trans. Audio and Electroacoustics , vol.21 , Issue.3 , pp. 298-305
    • Holmes, J.1
  • 13
    • 0026387469
    • Improving naturalness in text-to-speech synthesis using natural glottal source
    • K. Matsui, S. D. Pearson, K. Hata, and T. Kamai, "Improving naturalness in text-to-speech synthesis using natural glottal source," in Proc. ICASSP, Apr. 1991, vol. 2, pp. 769-772.
    • (1991) Proc. ICASSP , vol.2 , pp. 769-772
    • Matsui, K.1    Pearson, S.D.2    Hata, K.3    Kamai, T.4
  • 14
    • 84867209230
    • HMM-based Finnish text-to-speech system utilizing glottal inverse filtering
    • T. Raitio, A. Suni, H. Pulakka, M. Vainio, and P. Alku, "HMM-based Finnish text-to-speech system utilizing glottal inverse filtering," in Proc. Interspeech, 2008, pp. 1881-1884.
    • (2008) Proc. Interspeech , pp. 1881-1884
    • Raitio, T.1    Suni, A.2    Pulakka, H.3    Vainio, M.4    Alku, P.5
  • 17
    • 70450204573
    • A deterministic plus stochastic model of the residual signal for improved parametric speech synthesis
    • T. Drugman, G. Wilfart, and T. Dutoit, "A deterministic plus stochastic model of the residual signal for improved parametric speech synthesis," in Proc. Interspeech, 2009, pp. 1779-1782.
    • (2009) Proc. Interspeech , pp. 1779-1782
    • Drugman, T.1    Wilfart, G.2    Dutoit, T.3
  • 18
    • 79959855183
    • Excitation modeling based on waveform interpolation for HMM-based speech synthesis
    • J. Sung, D. Hong, K. Oh, and N. Kim, "Excitation modeling based on waveform interpolation for HMM-based speech synthesis," in Proc. Interspeech, 2010, pp. 813-816.
    • (2010) Proc. Interspeech , pp. 813-816
    • Sung, J.1    Hong, D.2    Oh, K.3    Kim, N.4
  • 19
    • 84856248602
    • The deterministic plus stochastic model of the residual signal and its applications
    • T. Drugman and T. Dutoit, "The deterministic plus stochastic model of the residual signal and its applications," IEEE Trans. on Audio, Speech, and Lang. Proc., vol. 20, no. 3, pp. 968-981, Mar. 2012.
    • (2012) IEEE Trans. on Audio, Speech, and Lang. Proc. , vol.20 , Issue.3 , pp. 968-981
    • Drugman, T.1    Dutoit, T.2
  • 22
    • 84890448428
    • The GlottHMM entry for Blizzard Challenge 2011: Utilizing source unit selection in HMM-based speech synthesis for improved excitation generation
    • A. Suni, T. Raitio, M. Vainio, and P. Alku, "The GlottHMM entry for Blizzard Challenge 2011: Utilizing source unit selection in HMM-based speech synthesis for improved excitation generation," in The Blizzard Challenge 2011 workshop, 2011, http://festvox.org/blizzard.
    • (2011) The Blizzard Challenge 2011 Workshop
    • Suni, A.1    Raitio, T.2    Vainio, M.3    Alku, P.4
  • 24
    • 0026881384
    • Glottal wave analysis with pitch synchronous iterative adaptive inverse filtering
    • P. Alku, "Glottal wave analysis with pitch synchronous iterative adaptive inverse filtering," Speech Commun., vol. 11, no. 2-3, pp. 109-118, 1992.
    • (1992) Speech Commun. , vol.11 , Issue.2-3 , pp. 109-118
    • Alku, P.1
  • 25
    • 0032875050
    • A method for generating natural-sounding speech stimuli for cognitive brain research
    • P. Alku, H. Tiitinen, and R. Naatanen, "A method for generating natural-sounding speech stimuli for cognitive brain research," Clinical Neurophysiology, vol. 110, pp. 1329-1333, 1999.
    • (1999) Clinical Neurophysiology , vol.110 , pp. 1329-1333
    • Alku, P.1    Tiitinen, H.2    Naatanen, R.3
  • 29
    • 70450162429
    • Voice source waveform analysis and synthesis using principal component analysis and Gaussian mixture modelling
    • J. Gudnason, M. Thomas, P. Naylor, and D. Ellis, "Voice source waveform analysis and synthesis using principal component analysis and Gaussian mixture modelling," in Proc. Interspeech, 2009, pp. 108-111.
    • (2009) Proc. Interspeech , pp. 108-111
    • Gudnason, J.1    Thomas, M.2    Naylor, P.3    Ellis, D.4
  • 30
    • 80055082229
    • Data-driven voice source waveform analysis and synthesis
    • J. Gudnason, M. Thomas, D.P.W. Ellis, and P.A. Naylor, "Data-driven voice source waveform analysis and synthesis," Speech Commun., vol. 54, no. 2, pp. 199-211, Feb. 2012.
    • (2012) Speech Commun. , vol.54 , Issue.2 , pp. 199-211
    • Gudnason, J.1    Thomas, M.2    Ellis, D.P.W.3    Naylor, P.A.4


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.