Volume , Issue , 2014, Pages 2290-2294

Voice source modelling using deep neural networks for statistical parametric speech synthesis

Author keywords

Deep neural network; DNN; glottal flow; statistical parametric speech synthesis; voice source modelling

Indexed keywords

DEEP NEURAL NETWORKS; SIGNAL PROCESSING; SPEECH SYNTHESIS; TIME DOMAIN ANALYSIS

EID: 84911869827     PISSN: 22195491     EISSN: None     Source Type: Conference Proceeding    
DOI: None     Document Type: Conference Paper
Times cited : (36)

References (27)
  • 1
    • 85009139544
    • Simultaneous modeling of spectrum, pitch and duration in HMM-based speech synthesis
    • T. Yoshimura, K. Tokuda, T. Masuko, T. Kobayashi, and T. Kitamura, "Simultaneous modeling of spectrum, pitch and duration in HMM-based speech synthesis," in Proc. Eurospeech, 1999, pp. 2347-2350.
    • (1999) Proc. Eurospeech , pp. 2347-2350
    • Yoshimura, T.1    Tokuda, K.2    Masuko, T.3    Kobayashi, T.4    Kitamura, T.5
  • 2
    • 67651002140
    • Statistical parametric speech synthesis
    • Heiga Zen, Keiichi Tokuda, and Alan W. Black, "Statistical parametric speech synthesis," Speech Commun., vol. 51, no. 11, pp. 1039-1064, 2009.
    • (2009) Speech Commun , vol.51 , Issue.11 , pp. 1039-1064
    • Zen, H.1    Tokuda, K.2    Black, A.W.3
  • 3
    • 0029765811
    • Unit selection in a concatenative speech synthesis system using a large speech database
    • A. Hunt and A. Black, "Unit selection in a concatenative speech synthesis system using a large speech database," in Proc. IEEE Int. Conf. Acoust. Speech Signal Proc., 1996, pp. 373-376.
    • (1996) Proc. IEEE Int. Conf. Acoust. Speech Signal Proc , pp. 373-376
    • Hunt, A.1    Black, A.2
  • 4
    • 0016495091
    • Linear prediction: A tutorial review
    • J. Makhoul, "Linear prediction: A tutorial review," Proceedings of the IEEE, vol. 63, no. 4, pp. 561-580, 1975.
    • (1975) Proceedings of the IEEE , vol.63 , Issue.4 , pp. 561-580
    • Makhoul, J.1
  • 6
    • 33846406459
    • Two-band excitation for HMM-based speech synthesis
    • S.-J. Kim and M. Hahn, "Two-band excitation for HMM-based speech synthesis," IEICE Trans. Inf. Syst., vol. E90-D, 2007.
    • (2007) IEICE Trans. Inf. Syst , vol.E90D
    • Kim, S.-J.1    Hahn, M.2
  • 7
    • 0032673049
    • Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds
    • H. Kawahara, I. Masuda-Katsuse, and A. de Cheveigné, "Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: possible role of a repetitive structure in sounds," Speech Commun., vol. 27, no. 3-4, pp. 187-207, 1999.
    • (1999) Speech Commun , vol.27 , Issue.3-4 , pp. 187-207
    • Kawahara, H.1    Masuda-Katsuse, I.2    De Cheveigné, A.3
  • 10
    • 84865797109
    • Statistical parametric speech synthesis with joint estimation of acoustic and excitation model parameters
    • R. Maia, H. Zen, and M. J. F. Gales, "Statistical parametric speech synthesis with joint estimation of acoustic and excitation model parameters," in 7th ISCA Speech Synthesis Workshop, 2010, pp. 88-93.
    • (2010) 7th ISCA Speech Synthesis Workshop , pp. 88-93
    • Maia, R.1    Zen, H.2    Gales, M.J.F.3
  • 11
    • 82155160991
    • Towards an improved modeling of the glottal source in statistical parametric speech synthesis
    • J. Cabral, S. Renals, K. Richmond, and J. Yamagishi, "Towards an improved modeling of the glottal source in statistical parametric speech synthesis," in 6th ISCA Workshop on Speech Synthesis, 2007, pp. 113-118.
    • (2007) 6th ISCA Workshop on Speech Synthesis , pp. 113-118
    • Cabral, J.1    Renals, S.2    Richmond, K.3    Yamagishi, J.4
  • 12
    • 84867224654
    • Glottal spectral separation for parametric speech synthesis
    • J. Cabral, S. Renals, K. Richmond, and J. Yamagishi, "Glottal spectral separation for parametric speech synthesis," in Proc. Interspeech, 2008, pp. 1829-1832.
    • (2008) Proc. Interspeech , pp. 1829-1832
    • Cabral, J.1    Renals, S.2    Richmond, K.3    Yamagishi, J.4
  • 13
    • 0015699693
    • The influence of glottal waveform on the naturalness of speech from a parallel formant synthesizer
    • J. Holmes, "The influence of glottal waveform on the naturalness of speech from a parallel formant synthesizer," IEEE Trans. Audio and Electroac., vol. 21, no. 3, pp. 298-305, 1973.
    • (1973) IEEE Trans. Audio and Electroac , vol.21 , Issue.3 , pp. 298-305
    • Holmes, J.1
  • 15
    • 77957796737
    • Hybrid time- and frequency-domain speech synthesis with extended glottal source generation
    • G. Fries, "Hybrid time- and frequency-domain speech synthesis with extended glottal source generation," in Proc. IEEE Int. Conf. Acoust. Speech Signal Proc., 1994, vol. 1, pp. 581-584.
    • (1994) Proc. IEEE Int. Conf. Acoust. Speech Signal Proc , vol.1 , pp. 581-584
    • Fries, G.1
  • 16
    • 84867209230
    • HMM-based Finnish text-to-speech system utilizing glottal inverse filtering
    • T. Raitio, A. Suni, H. Pulakka, M. Vainio, and P. Alku, "HMM-based Finnish text-to-speech system utilizing glottal inverse filtering," in Proc. Interspeech, 2008, pp. 1881-1884.
    • (2008) Proc. Interspeech , pp. 1881-1884
    • Raitio, T.1    Suni, A.2    Pulakka, H.3    Vainio, M.4    Alku, P.5
  • 18
    • 70450204573
    • A deterministic plus stochastic model of the residual signal for improved parametric speech synthesis
    • T. Drugman, G. Wilfart, and T. Dutoit, "A deterministic plus stochastic model of the residual signal for improved parametric speech synthesis," in Proc. Interspeech, 2009, pp. 1779-1782.
    • (2009) Proc. Interspeech , pp. 1779-1782
    • Drugman, T.1    Wilfart, G.2    Dutoit, T.3
  • 19
    • 79959855183
    • Excitation modeling based on waveform interpolation for HMM-based speech synthesis
    • J. Sung, D. Hong, K. Oh, and N. Kim, "Excitation modeling based on waveform interpolation for HMM-based speech synthesis," in Proc. Interspeech, 2010, pp. 813-816.
    • (2010) Proc. Interspeech , pp. 813-816
    • Sung, J.1    Hong, D.2    Oh, K.3    Kim, N.4
  • 20
    • 84856248602
    • The deterministic plus stochastic model of the residual signal and its applications
    • T. Drugman and T. Dutoit, "The deterministic plus stochastic model of the residual signal and its applications," IEEE Trans. Audio Speech Lang. Proc., vol. 20, no. 3, pp. 968-981, 2012.
    • (2012) IEEE Trans. Audio Speech Lang. Proc , vol.20 , Issue.3 , pp. 968-981
    • Drugman, T.1    Dutoit, T.2
  • 23
    • 80051650578
    • Utilizing glottal source pulse library for generating improved excitation signal for HMM-based speech synthesis
    • T. Raitio, A. Suni, H. Pulakka, M. Vainio, and P. Alku, "Utilizing glottal source pulse library for generating improved excitation signal for HMM-based speech synthesis," in Proc. IEEE Int. Conf. Acoust. Speech Signal Proc., 2011, pp. 4564-4567.
    • (2011) Proc. IEEE Int. Conf. Acoust. Speech Signal Proc , pp. 4564-4567
    • Raitio, T.1    Suni, A.2    Pulakka, H.3    Vainio, M.4    Alku, P.5
  • 26
    • 0026881384
    • Glottal wave analysis with pitch synchronous iterative adaptive inverse filtering
    • P. Alku, "Glottal wave analysis with pitch synchronous iterative adaptive inverse filtering," Speech Commun., vol. 11, no. 2-3, pp. 109-118, 1992.
    • (1992) Speech Commun , vol.11 , Issue.2-3 , pp. 109-118
    • Alku, P.1


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.