SCOPUS Information Retrieval Platform

ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings

Volume , Issue , 2013, Pages 7830-7834

Comparing glottal-flow-excited statistical parametric speech synthesis methods

(4) Raitio, Tuomo (a); Suni, Antti (b); Vainio, Martti (b); Alku, Paavo (a)

a AALTO UNIVERSITY (Finland)

b UNIVERSITY OF HELSINKI (Finland)

Author keywords

excitation glottal flow; principal component analysis; pulse library; Statistical parametric speech synthesis

Indexed keywords

EXCITATION METHODS; GLOTTAL FLOW; PRINCIPAL COMPONENTS; STATE OF THE ART; STATISTICAL PARAMETRIC SPEECH SYNTHESIS; SUBJECTIVE LISTENING TEST; SYNTHETIC SPEECH; PRINCIPAL COMPONENT ANALYSIS; SIGNAL PROCESSING; SPEECH SYNTHESIS

EID: 84890462419 PISSN: 15206149 EISSN: None Source Type: Conference Proceeding
DOI: 10.1109/ICASSP.2013.6639188 Document Type: Conference Paper

Times cited : (21)

References (34)

1
- 67651002140
- Statistical parametric speech synthesis
- H. Zen, K. Tokuda, and A. Black, "Statistical parametric speech synthesis," Speech Communication, vol. 51, no. 11, pp. 1039-1064, 2009.
- (2009) Speech Communication , vol.51 , Issue.11 , pp. 1039-1064
- Zen, H.¹ Tokuda, K.² Black, A.³

2
- 0002884330
- The government standard linear predictive coding algorithm: LPC-10
- T. E. Tremain, "The government standard linear predictive coding algorithm: LPC-10," Speech Technology, vol. 1, pp. 40-49, Apr. 1982.
- (1982) Speech Technology , vol.1 , pp. 40-49
- Tremain, T.E.¹

3
- 85009097254
- Mixed excitation for HMM-based speech synthesis
- T. Yoshimura, K. Tokuda, T. Masuko, T. Kobayashi, and T. Kitamura, "Mixed excitation for HMM-based speech synthesis," in Proc. Eurospeech, 2001, pp. 2259-2262.
- (2001) Proc. Eurospeech , pp. 2259-2262
- Yoshimura, T.¹ Tokuda, K.² Masuko, T.³ Kobayashi, T.⁴ Kitamura, T.⁵

4
- 33846406459
- Two-band excitation for HMM-based speech synthesis
- S. J. Kim and M. Hahn, "Two-band excitation for HMM-based speech synthesis," IEICE Trans. Inf. & Syst., vol. E90-D, 2007.
- (2007) IEICE Trans. Inf. & Syst. , vol.E90-D
- Kim, S.J.¹ Hahn, M.²

5
- 0032673049
- Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds
- H. Kawahara, I. Masuda-Katsuse, and A. de Cheveigne, "Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: possible role of a repetitive structure in sounds," Speech Commun., vol. 27, Apr. 1999.
- (1999) Speech Commun. , vol.27
- Kawahara, H.¹ Masuda-Katsuse, I.² De Cheveigne, A.³

6
- 84874199000
- Aperiodicity extraction and control using mixed mode excitation and group delay manipulation for a high quality speech analysis, modification and synthesis system STRAIGHT
- H. Kawahara, J. Estill, and O. Fujimura, "Aperiodicity extraction and control using mixed mode excitation and group delay manipulation for a high quality speech analysis, modification and synthesis system STRAIGHT," in 2nd International Workshop on Models and Analysis of Vocal Emissions for Biomedical Applications (MAVEBA), Sep. 2001.
- (2001) 2nd International Workshop on Models and Analysis of Vocal Emissions for Biomedical Applications (MAVEBA)
- Kawahara, H.¹ Estill, J.² Fujimura, O.³

7
- 78649297510
- An excitation model for HMM-based speech synthesis based on residual modeling
- R. Maia, T. Toda, H. Zen, Y. Nankaku, and K. Tokuda, "An excitation model for HMM-based speech synthesis based on residual modeling," in SSW6, Aug. 2007.
- (2007) SSW6
- Maia, R.¹ Toda, T.² Zen, H.³ Nankaku, Y.⁴ Tokuda, K.⁵

8
- 84865797109
- Statistical parametric speech synthesis with joint estimation of acoustic and excitation model parameters
- R. Maia, H. Zen, and M. J. F. Gales, "Statistical parametric speech synthesis with joint estimation of acoustic and excitation model parameters," in SSW7, Sep. 2010, pp. 88-93.
- (2010) SSW7 , pp. 88-93
- Maia, R.¹ Zen, H.² Gales, M.J.F.³

9
- 84878525228
- An overview of Nitech HMM-based speech synthesis system for Blizzard Challenge 2005
- H. Zen and T. Toda, "An overview of Nitech HMM-based speech synthesis system for Blizzard Challenge 2005," in The Blizzard Challenge 2005 workshop, 2005, http://festvox.org/blizzard.
- (2005) The Blizzard Challenge 2005 Workshop
- Zen, H.¹ Toda, T.²

10
- 82155160991
- Towards an improved modeling of the glottal source in statistical parametric speech synthesis
- J. Cabral, S. Renals, K. Richmond, and J. Yamagishi, "Towards an improved modeling of the glottal source in statistical parametric speech synthesis," in SSW6, 2007, pp. 113-118.
- (2007) SSW6 , pp. 113-118
- Cabral, J.¹ Renals, S.² Richmond, K.³ Yamagishi, J.⁴

11
- 84867224654
- Glottal spectral separation for parametric speech synthesis
- J. Cabral, S. Renals, K. Richmond, and J. Yamagishi, "Glottal spectral separation for parametric speech synthesis," in Proc. Interspeech, 2008, pp. 1829-1832.
- (2008) Proc. Interspeech , pp. 1829-1832
- Cabral, J.¹ Renals, S.² Richmond, K.³ Yamagishi, J.⁴

12
- 0015699693
- The influence of glottal waveform on the naturalness of speech from a parallel formant synthesizer
- J. Holmes, "The influence of glottal waveform on the naturalness of speech from a parallel formant synthesizer," IEEE Trans. Audio and Electroacoustics, vol. 21, no. 3, pp. 298-305, Jun. 1973.
- (1973) IEEE Trans. Audio and Electroacoustics , vol.21 , Issue.3 , pp. 298-305
- Holmes, J.¹

13
- 0026387469
- Improving naturalness in text-to-speech synthesis using natural glottal source
- K. Matsui, S. D. Pearson, K. Hata, and T. Kamai, "Improving naturalness in text-to-speech synthesis using natural glottal source," in Proc. ICASSP, Apr. 1991, vol. 2, pp. 769-772.
- (1991) Proc. ICASSP , vol.2 , pp. 769-772
- Matsui, K.¹ Pearson, S.D.² Hata, K.³ Kamai, T.⁴

14
- 84867209230
- HMM-based Finnish text-to-speech system utilizing glottal inverse filtering
- T. Raitio, A. Suni, H. Pulakka, M. Vainio, and P. Alku, "HMM-based Finnish text-to-speech system utilizing glottal inverse filtering," in Proc. Interspeech, 2008, pp. 1881-1884.
- (2008) Proc. Interspeech , pp. 1881-1884
- Raitio, T.¹ Suni, A.² Pulakka, H.³ Vainio, M.⁴ Alku, P.⁵

15
- 77957744515
- HMM-based speech synthesis utilizing glottal inverse filtering
- T. Raitio, A. Suni, J. Yamagishi, H. Pulakka, J. Nurminen, M. Vainio, and P. Alku, "HMM-based speech synthesis utilizing glottal inverse filtering," IEEE Trans. on Audio, Speech, and Lang. Proc., vol. 19, no. 1, pp. 153-165, Jan. 2011.
- (2011) IEEE Trans. on Audio, Speech, and Lang. Proc. , vol.19 , Issue.1 , pp. 153-165
- Raitio, T.¹ Suni, A.² Yamagishi, J.³ Pulakka, H.⁴ Nurminen, J.⁵ Vainio, M.⁶ Alku, P.⁷

16
- 84865755765
- The GlottHMM speech synthesis entry for Blizzard Challenge 2010
- A. Suni, T. Raitio, M. Vainio, and P. Alku, "The GlottHMM speech synthesis entry for Blizzard Challenge 2010," in The Blizzard Challenge 2010 workshop, 2010, http://festvox.org/blizzard.
- (2010) The Blizzard Challenge 2010 Workshop
- Suni, A.¹ Raitio, T.² Vainio, M.³ Alku, P.⁴

17
- 70450204573
- A deterministic plus stochastic model of the residual signal for improved parametric speech synthesis
- T. Drugman, G. Wilfart, and T. Dutoit, "A deterministic plus stochastic model of the residual signal for improved parametric speech synthesis," in Proc. Interspeech, 2009, pp. 1779-1782.
- (2009) Proc. Interspeech , pp. 1779-1782
- Drugman, T.¹ Wilfart, G.² Dutoit, T.³

18
- 79959855183
- Excitation modeling based on waveform interpolation for HMM-based speech synthesis
- J. Sung, D. Hong, K. Oh, and N. Kim, "Excitation modeling based on waveform interpolation for HMM-based speech synthesis," in Proc. Interspeech, 2010, pp. 813-816.
- (2010) Proc. Interspeech , pp. 813-816
- Sung, J.¹ Hong, D.² Oh, K.³ Kim, N.⁴

19
- 84856248602
- The deterministic plus stochastic model of the residual signal and its applications
- T. Drugman and T. Dutoit, "The deterministic plus stochastic model of the residual signal and its applications," IEEE Trans. on Audio, Speech, and Lang. Proc., vol. 20, no. 3, pp. 968-981, Mar. 2012.
- (2012) IEEE Trans. on Audio, Speech, and Lang. Proc. , vol.20 , Issue.3 , pp. 968-981
- Drugman, T.¹ Dutoit, T.²

20
- 67650793794
- Using a pitch-synchronous residual codebook for hybrid HMM/frame selection speech synthesis
- T. Drugman, G. Wilfart, A. Moinet, and T. Dutoit, "Using a pitch-synchronous residual codebook for hybrid HMM/frame selection speech synthesis," in Proc. IEEE Int. Conf. on Acoust. Speech and Signal Proc. (ICASSP), Apr. 2009, pp. 3793-3796.
- (2009) Proc. IEEE Int. Conf. on Acoust. Speech and Signal Proc. (ICASSP) , pp. 3793-3796
- Drugman, T.¹ Wilfart, G.² Moinet, A.³ Dutoit, T.⁴

21
- 80051650578
- Utilizing glottal source pulse library for generating improved excitation signal for HMM-based speech synthesis
- T. Raitio, A. Suni, H. Pulakka, M. Vainio, and P. Alku, "Utilizing glottal source pulse library for generating improved excitation signal for HMM-based speech synthesis," in Proc. IEEE Int. Conf. on Acoust. Speech and Signal Proc. (ICASSP), 2011, pp. 4564-4567.
- (2011) Proc. IEEE Int. Conf. on Acoust. Speech and Signal Proc. (ICASSP) , pp. 4564-4567
- Raitio, T.¹ Suni, A.² Pulakka, H.³ Vainio, M.⁴ Alku, P.⁵

22
- 84890448428
- The GlottHMM entry for Blizzard Challenge 2011: Utilizing source unit selection in HMM-based speech synthesis for improved excitation generation
- A. Suni, T. Raitio, M. Vainio, and P. Alku, "The GlottHMM entry for Blizzard Challenge 2011: Utilizing source unit selection in HMM-based speech synthesis for improved excitation generation," in The Blizzard Challenge 2011 workshop, 2011, http://festvox.org/blizzard.
- (2011) The Blizzard Challenge 2011 Workshop
- Suni, A.¹ Raitio, T.² Vainio, M.³ Alku, P.⁴

23
- 84890563666
- The GlottHMM entry for Blizzard Challenge 2012 - Hybrid approach
- A. Suni, T. Raitio, M. Vainio, and P. Alku, "The GlottHMM entry for Blizzard Challenge 2012 - Hybrid approach," in The Blizzard Challenge 2012 workshop, 2012, http://festvox.org/blizzard.
- (2012) The Blizzard Challenge 2012 Workshop
- Suni, A.¹ Raitio, T.² Vainio, M.³ Alku, P.⁴

24
- 0026881384
- Glottal wave analysis with pitch synchronous iterative adaptive inverse filtering
- P. Alku, "Glottal wave analysis with pitch synchronous iterative adaptive inverse filtering," Speech Commun., vol. 11, no. 2-3, pp. 109-118, 1992.
- (1992) Speech Commun. , vol.11 , Issue.2-3 , pp. 109-118
- Alku, P.¹

25
- 0032875050
- A method for generating natural-sounding speech stimuli for cognitive brain research
- P. Alku, H. Tiitinen, and R. Naatanen, "A method for generating natural-sounding speech stimuli for cognitive brain research," Clinical Neurophysiology, vol. 110, pp. 1329-1333, 1999.
- (1999) Clinical Neurophysiology , vol.110 , pp. 1329-1333
- Alku, P.¹ Tiitinen, H.² Naatanen, R.³

26
- 67650800535
- An investigation of spectral parameters for HMM-based speech synthesis
- M. Marume, H. Zen, Y. Nankaku, K. Tokuda, and T. Kitamura, "An investigation of spectral parameters for HMM-based speech synthesis," in Proc. Autumn Meeting of Acoust. Soc. of Japan, Sep. 2006, (In Japanese).
- (2006) Proc. Autumn Meeting of Acoust. Soc. of Japan
- Marume, M.¹ Zen, H.² Nankaku, Y.³ Tokuda, K.⁴ Kitamura, T.⁵

27
- 84863419425
- Detection of glottal closure instants from speech signals: A quantitative review
- T. Drugman, M. Thomas, J. Gudnason, P. Naylor, and T. Dutoit, "Detection of glottal closure instants from speech signals: A quantitative review," IEEE Trans. on Audio, Speech, and Lang. Proc., vol. 20, no. 3, pp. 994-1006, March 2012.
- (2012) IEEE Trans. on Audio, Speech, and Lang. Proc. , vol.20 , Issue.3 , pp. 994-1006
- Drugman, T.¹ Thomas, M.² Gudnason, J.³ Naylor, P.⁴ Dutoit, T.⁵

28
- 69249117913
- Data-driven voice source waveform modelling
- M. Thomas, J. Gudnason, and P. Naylor, "Data-driven voice source waveform modelling," in Proc. IEEE Int. Conf. on Acoust. Speech and Signal Proc. (ICASSP), April 2009, pp. 3965-3968.
- (2009) Proc. IEEE Int. Conf. on Acoust. Speech and Signal Proc. (ICASSP) , pp. 3965-3968
- Thomas, M.¹ Gudnason, J.² Naylor, P.³

29
- 70450162429
- Voice source waveform analysis and synthesis using principal component analysis and Gaussian mixture modelling
- J. Gudnason, M. Thomas, P. Naylor, and D. Ellis, "Voice source waveform analysis and synthesis using principal component analysis and Gaussian mixture modelling," in Proc. Interspeech, 2009, pp. 108-111.
- (2009) Proc. Interspeech , pp. 108-111
- Gudnason, J.¹ Thomas, M.² Naylor, P.³ Ellis, D.⁴

30
- 80055082229
- Data-driven voice source waveform analysis and synthesis
- J. Gudnason, M. Thomas, D.P.W. Ellis, and P.A. Naylor, "Data-driven voice source waveform analysis and synthesis," Speech Commun., vol. 54, no. 2, pp. 199-211, Feb. 2012.
- (2012) Speech Commun. , vol.54 , Issue.2 , pp. 199-211
- Gudnason, J.¹ Thomas, M.² Ellis, D.P.W.³ Naylor, P.A.⁴

31
- 85133720638
- The HMM-based speech synthesis system (HTS) version 2.0
- H. Zen, T. Nose, J. Yamagishi, S. Sako, T. Masuko, A. Black, and K. Tokuda, "The HMM-based speech synthesis system (HTS) version 2.0," in SSW6, Aug. 2007, pp. 294-299.
- (2007) SSW6 , pp. 294-299
- Zen, H.¹ Nose, T.² Yamagishi, J.³ Sako, S.⁴ Masuko, T.⁵ Black, A.⁶ Tokuda, K.⁷

32
- 84890537711
- HTS, "HMM-based speech synthesis system," Nov. 2012, http://hts.sp.nitech.ac.jp.
- (2012) HMM-based Speech Synthesis System

33
- 0036522887
- Multispace probability distribution HMM
- K. Tokuda, T. Masuko, N. Miyazaki, and T. Kobayashi, "Multispace probability distribution HMM," IEICE Trans. Inf. & Syst., vol. E85-D, no. 3, pp. 1455-1464, 2002.
- (2002) IEICE Trans. Inf. & Syst. , vol.E85-D , Issue.3 , pp. 1455-1464
- Tokuda, K.¹ Masuko, T.² Miyazaki, N.³ Kobayashi, T.⁴

34
- 33646681559
- Artificial Neural Network Based Prosody Models for Finnish Text-to-Speech Synthesis
- M. Vainio, Artificial Neural Network Based Prosody Models for Finnish Text-to-Speech Synthesis, Ph.D. thesis, University of Helsinki, Finland, Dec. 2001.
- (2001) Artificial Neural Network Based Prosody Models for Finnish Text-to-Speech Synthesis
- Vainio, M.¹

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.