2014, Pages 1969-1973

Deep neural network based trainable voice source model for synthesis of speech with varying vocal effort

Author keywords

Deep neural network; DNN; Glottal flow; Speech synthesis; Vocal effort; Voice source modelling

Indexed keywords

SPEECH SYNTHESIS; TIME DOMAIN ANALYSIS;

EID: 84910068090     PISSN: 2308457X     EISSN: 19909772     Source Type: Conference Proceeding    
DOI: None     Document Type: Conference Paper
Times cited: 27

References (40)
  • 1
    • 85009139544
    • Simultaneous modeling of spectrum, pitch and duration in HMM-based speech synthesis
    • T. Yoshimura, K. Tokuda, T. Masuko, T. Kobayashi, and T. Kitamura, "Simultaneous modeling of spectrum, pitch and duration in HMM-based speech synthesis," in Proc. Eurospeech, 1999, pp. 2347-2350.
    • (1999) Proc. Eurospeech, pp. 2347-2350
    • Yoshimura, T.1    Tokuda, K.2    Masuko, T.3    Kobayashi, T.4    Kitamura, T.5
  • 2
    • 67651002140
    • Statistical parametric speech synthesis
    • H. Zen, K. Tokuda, and A. W. Black, "Statistical parametric speech synthesis," Speech Commun., vol. 51, no. 11, pp. 1039-1064, 2009.
    • (2009) Speech Commun., vol. 51, no. 11, pp. 1039-1064
    • Zen, H.1    Tokuda, K.2    Black, A.W.3
  • 7
    • 51449114529
    • A style control technique for HMM-based expressive speech synthesis
    • T. Nose, J. Yamagishi, T. Masuko, and T. Kobayashi, "A style control technique for HMM-based expressive speech synthesis," IEICE Trans. Inf. Syst., vol. E90-D, no. 9, pp. 1406-1413, 2007.
    • (2007) IEICE Trans. Inf. Syst., vol. E90-D, no. 9, pp. 1406-1413
    • Nose, T.1    Yamagishi, J.2    Masuko, T.3    Kobayashi, T.4
  • 8
    • 67650854725
    • Analysis of speaker adaptation algorithms for HMM-based speech synthesis and a constrained SMAPLR adaptation algorithm
    • J. Yamagishi, T. Kobayashi, Y. Nakano, K. Ogata, and J. Isogai, "Analysis of speaker adaptation algorithms for HMM-based speech synthesis and a constrained SMAPLR adaptation algorithm," IEEE Trans. Audio, Speech and Lang. Proc., vol. 17, no. 1, pp. 66-83, 2009.
    • (2009) IEEE Trans. Audio, Speech and Lang. Proc., vol. 17, no. 1, pp. 66-83
    • Yamagishi, J.1    Kobayashi, T.2    Nakano, Y.3    Ogata, K.4    Isogai, J.5
  • 9
    • 33846935000
    • HMM-based Korean speech synthesis system for hand-held devices
    • S.-J. Kim, J.-J. Kim, and M.-S. Hahn, "HMM-based Korean speech synthesis system for hand-held devices," IEEE Trans. Consum. Electron., vol. 52, no. 4, pp. 1384-1390, 2006.
    • (2006) IEEE Trans. Consum. Electron., vol. 52, no. 4, pp. 1384-1390
    • Kim, S.-J.1    Kim, J.-J.2    Hahn, M.-S.3
  • 10
    • 79959839868
    • Quantized HMMs for low footprint text-to-speech synthesis
    • A. Gutkin, X. Gonzalvo, S. Breuer, and P. Taylor, "Quantized HMMs for low footprint text-to-speech synthesis," in Proc. Interspeech, 2010, pp. 837-840.
    • (2010) Proc. Interspeech, pp. 837-840
    • Gutkin, A.1    Gonzalvo, X.2    Breuer, S.3    Taylor, P.4
  • 12
    • 0016495091
    • Linear prediction: A tutorial review
    • J. Makhoul, "Linear prediction: A tutorial review," Proceedings of the IEEE, vol. 63, no. 4, pp. 561-580, 1975.
    • (1975) Proceedings of the IEEE, vol. 63, no. 4, pp. 561-580
    • Makhoul, J.1
  • 14
    • 33846406459
    • Two-band excitation for HMM-based speech synthesis
    • S. J. Kim and M. Hahn, "Two-band excitation for HMM-based speech synthesis," IEICE Trans. Inf. & Syst., vol. E90-D, 2007.
    • (2007) IEICE Trans. Inf. & Syst., vol. E90-D
    • Kim, S.J.1    Hahn, M.2
  • 15
    • 0032673049
    • Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds
    • H. Kawahara, I. Masuda-Katsuse, and A. de Cheveigné, "Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds," Speech Commun., vol. 27, no. 3-4, pp. 187-207, 1999.
    • (1999) Speech Commun., vol. 27, no. 3-4, pp. 187-207
    • Kawahara, H.1    Masuda-Katsuse, I.2    De Cheveigné, A.3
  • 18
    • 84865797109
    • Statistical parametric speech synthesis with joint estimation of acoustic and excitation model parameters
    • R. Maia, H. Zen, and M. J. F. Gales, "Statistical parametric speech synthesis with joint estimation of acoustic and excitation model parameters," in 7th ISCA Speech Synthesis Workshop, 2010, pp. 88-93.
    • (2010) 7th ISCA Speech Synthesis Workshop, pp. 88-93
    • Maia, R.1    Zen, H.2    Gales, M.J.F.3
  • 19
    • 82155160991
    • Towards an improved modeling of the glottal source in statistical parametric speech synthesis
    • J. Cabral, S. Renals, K. Richmond, and J. Yamagishi, "Towards an improved modeling of the glottal source in statistical parametric speech synthesis," in Sixth ISCA Workshop on Speech Synthesis, 2007, pp. 113-118.
    • (2007) Sixth ISCA Workshop on Speech Synthesis, pp. 113-118
    • Cabral, J.1    Renals, S.2    Richmond, K.3    Yamagishi, J.4
  • 20
    • 84867224654
    • Glottal spectral separation for parametric speech synthesis
    • J. Cabral, S. Renals, K. Richmond, and J. Yamagishi, "Glottal spectral separation for parametric speech synthesis," in Proc. Interspeech, 2008, pp. 1829-1832.
    • (2008) Proc. Interspeech, pp. 1829-1832
    • Cabral, J.1    Renals, S.2    Richmond, K.3    Yamagishi, J.4
  • 21
    • 0015699693
    • The influence of glottal waveform on the naturalness of speech from a parallel formant synthesizer
    • J. Holmes, "The influence of glottal waveform on the naturalness of speech from a parallel formant synthesizer," IEEE Trans. Audio and Electroacoustics, vol. 21, no. 3, pp. 298-305, 1973.
    • (1973) IEEE Trans. Audio and Electroacoustics, vol. 21, no. 3, pp. 298-305
    • Holmes, J.1
  • 23
    • 77957796737
    • Hybrid time- and frequency-domain speech synthesis with extended glottal source generation
    • G. Fries, "Hybrid time- and frequency-domain speech synthesis with extended glottal source generation," in Proc. IEEE Int. Conf. on Acoust. Speech and Signal Proc. (ICASSP), vol. 1, 1994, pp. 581-584.
    • (1994) Proc. IEEE Int. Conf. on Acoust. Speech and Signal Proc. (ICASSP), vol. 1, pp. 581-584
    • Fries, G.1
  • 24
    • 84867209230
    • HMM based Finnish text-to-speech system utilizing glottal inverse filtering
    • T. Raitio, A. Suni, H. Pulakka, M. Vainio, and P. Alku, "HMM based Finnish text-to-speech system utilizing glottal inverse filtering," in Proc. Interspeech, 2008, pp. 1881-1884.
    • (2008) Proc. Interspeech, pp. 1881-1884
    • Raitio, T.1    Suni, A.2    Pulakka, H.3    Vainio, M.4    Alku, P.5
  • 27
    • 79959855183
    • Excitation modeling based on waveform interpolation for HMM-based speech synthesis
    • J. Sung, D. Hong, K. Oh, and N. Kim, "Excitation modeling based on waveform interpolation for HMM-based speech synthesis," in Proc. Interspeech, 2010, pp. 813-816.
    • (2010) Proc. Interspeech, pp. 813-816
    • Sung, J.1    Hong, D.2    Oh, K.3    Kim, N.4
  • 28
    • 84856248602
    • The deterministic plus stochastic model of the residual signal and its applications
    • T. Drugman and T. Dutoit, "The deterministic plus stochastic model of the residual signal and its applications," IEEE Trans. Audio, Speech and Lang. Proc., vol. 20, no. 3, pp. 968-981, 2012.
    • (2012) IEEE Trans. Audio, Speech and Lang. Proc., vol. 20, no. 3, pp. 968-981
    • Drugman, T.1    Dutoit, T.2
  • 30
    • 70450204573
    • A deterministic plus stochastic model of the residual signal for improved parametric speech synthesis
    • T. Drugman, G. Wilfart, and T. Dutoit, "A deterministic plus stochastic model of the residual signal for improved parametric speech synthesis," in Proc. Interspeech, 2009, pp. 1779-1782.
    • (2009) Proc. Interspeech, pp. 1779-1782
    • Drugman, T.1    Wilfart, G.2    Dutoit, T.3
  • 32
    • 84890555694
    • Analysis and HMM-based synthesis of hypo and hyperarticulated speech
    • B. Picart, T. Drugman, and T. Dutoit, "Analysis and HMM-based synthesis of hypo and hyperarticulated speech," Computer Speech & Language, vol. 28, no. 2, pp. 687-707, 2014.
    • (2014) Computer Speech & Language, vol. 28, no. 2, pp. 687-707
    • Picart, B.1    Drugman, T.2    Dutoit, T.3
  • 33
    • 84890547237
    • Synthesis and perception of breathy, normal, and Lombard speech in the presence of noise
    • T. Raitio, A. Suni, M. Vainio, and P. Alku, "Synthesis and perception of breathy, normal, and Lombard speech in the presence of noise," Computer Speech & Language, vol. 28, no. 2, pp. 648-664, 2014.
    • (2014) Computer Speech & Language, vol. 28, no. 2, pp. 648-664
    • Raitio, T.1    Suni, A.2    Vainio, M.3    Alku, P.4
  • 36
    • 0026881384
    • Glottal wave analysis with pitch synchronous iterative adaptive inverse filtering
    • P. Alku, "Glottal wave analysis with pitch synchronous iterative adaptive inverse filtering," Speech Commun., vol. 11, no. 2-3, pp. 109-118, 1992.
    • (1992) Speech Commun., vol. 11, no. 2-3, pp. 109-118
    • Alku, P.1
  • 40
    • 43549113834
    • Nonlinear source-filter coupling in phonation: Theory
    • I. R. Titze, "Nonlinear source-filter coupling in phonation: Theory," J. Acoust. Soc. Am., vol. 123, no. 5, pp. 2733-2749, 2008.
    • (2008) J. Acoust. Soc. Am., vol. 123, no. 5, pp. 2733-2749
    • Titze, I.R.1


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.