SCOPUS Information Search Platform

ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings

Volume 2016-May, 2016, Pages 5120-5124

High-pitched excitation generation for glottal vocoding in statistical parametric speech synthesis using a deep neural network

(4) Juvela, Lauri (a); Bollepalli, Bajibabu (a); Airaksinen, Manu (a); Alku, Paavo (a)

(a) Aalto University (Finland)

Author keywords

Deep neural network; Glottal inverse filtering; Glottal vocoder; QCP; Statistical parametric speech synthesis

EID: 84973293681 PISSN: 15206149 EISSN: None Source Type: Conference Proceeding
DOI: 10.1109/ICASSP.2016.7472653 Document Type: Conference Paper

Times cited: 35

References (29)

1
- 85009139544
- Simultaneous modeling of spectrum, pitch and duration in HMM-based speech synthesis
- Takayoshi Yoshimura, Keiichi Tokuda, Takashi Masuko, Takao Kobayashi, and Tadashi Kitamura, "Simultaneous modeling of spectrum, pitch and duration in HMM-based speech synthesis," in Proc. of Interspeech, 1999, pp. 2347-2350.
- (1999) Proc. of Interspeech , pp. 2347-2350
- Yoshimura, T.¹ Tokuda, K.² Masuko, T.³ Kobayashi, T.⁴ Kitamura, T.⁵

2
- 67651002140
- Statistical parametric speech synthesis
- Heiga Zen, Keiichi Tokuda, and Alan W. Black, "Statistical parametric speech synthesis," Speech Communication, vol. 51, no. 11, pp. 1039-1064, 2009.
- (2009) Speech Communication , vol.51 , Issue.11 , pp. 1039-1064
- Zen, H.¹ Tokuda, K.² Black, A.W.³

3
- 84876687945
- Speech synthesis based on hidden Markov models
- Keiichi Tokuda, Yoshihiko Nankaku, Tomoki Toda, Heiga Zen, Junichi Yamagishi, and Keiichiro Oura, "Speech synthesis based on hidden Markov models," Proceedings of the IEEE, vol. 101, no. 5, pp. 1234-1252, May 2013.
- (2013) Proceedings of the IEEE , vol.101 , Issue.5 , pp. 1234-1252
- Tokuda, K.¹ Nankaku, Y.² Toda, T.³ Zen, H.⁴ Yamagishi, J.⁵ Oura, K.⁶

4
- 84890490547
- Statistical parametric speech synthesis using deep neural networks
- Heiga Zen, Andrew Senior, and Mike Schuster, "Statistical parametric speech synthesis using deep neural networks," in Proc. of ICASSP, May 2013, pp. 7962-7966.
- (2013) Proc. of ICASSP , pp. 7962-7966
- Zen, H.¹ Senior, A.² Schuster, M.³

5
- 85032750981
- Deep learning for acoustic modeling in parametric speech generation: A systematic review of existing techniques and future trends
- Zhen-Hua Ling, Shi-Yin Kang, Heiga Zen, Andrew Senior, Mike Schuster, Xiao-Jun Qian, Helen Meng, and Li Deng, "Deep learning for acoustic modeling in parametric speech generation: A systematic review of existing techniques and future trends," Signal Processing Magazine, IEEE, vol. 32, no. 3, pp. 35-52, May 2015.
- (2015) Signal Processing Magazine, IEEE , vol.32 , Issue.3 , pp. 35-52
- Ling, Z.¹ Kang, S.² Zen, H.³ Senior, A.⁴ Schuster, M.⁵ Qian, X.⁶ Meng, H.⁷ Deng, L.⁸

6
- 0032673049
- Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds
- Hideki Kawahara, Ikuyo Masuda-Katsuse, and Alain De Cheveigne, "Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds," Speech Communication, vol. 27, no. 3, pp. 187-207, 1999.
- (1999) Speech Communication , vol.27 , Issue.3 , pp. 187-207
- Kawahara, H.¹ Masuda-Katsuse, I.² De Cheveigne, A.³

7
- 84874199000
- Aperiodicity extraction and control using mixed mode excitation and group delay manipulation for a high quality speech analysis, modification and synthesis system STRAIGHT
- Hideki Kawahara, Jo Estill, and Osamu Fujimura, "Aperiodicity extraction and control using mixed mode excitation and group delay manipulation for a high quality speech analysis, modification and synthesis system STRAIGHT," in MAVEBA, 2001.
- (2001) MAVEBA
- Kawahara, H.¹ Estill, J.² Fujimura, O.³

8
- 0032638660
- On phase perception in speech
- Harald Pobloth and W. Bastiaan Kleijn, "On phase perception in speech," in Proc. of ICASSP, Mar 1999, vol. 1, pp. 29-32.
- (1999) Proc. of ICASSP , vol.1 , pp. 29-32
- Pobloth, H.¹ Kleijn, W.B.²

9
- 84959096758
- Phase perception of the glottal excitation of vocoded speech
- Tuomo Raitio, Lauri Juvela, Antti Suni, Martti Vainio, and Paavo Alku, "Phase perception of the glottal excitation of vocoded speech," in Proc. of Interspeech, Dresden, September 2015, pp. 254-258.
- (2015) Proc. of Interspeech , pp. 254-258
- Raitio, T.¹ Juvela, L.² Suni, A.³ Vainio, M.⁴ Alku, P.⁵

10
- 84867209230
- HMM-based Finnish text-to-speech system utilizing glottal inverse filtering
- Tuomo Raitio, Antti Suni, Hannu Pulakka, Martti Vainio, and Paavo Alku, "HMM-based Finnish text-to-speech system utilizing glottal inverse filtering," in Proc. of Interspeech, Brisbane, Australia, September 2008, pp. 1881-1884.
- (2008) Proc. of Interspeech , pp. 1881-1884
- Raitio, T.¹ Suni, A.² Pulakka, H.³ Vainio, M.⁴ Alku, P.⁵

11
- 77957744515
- HMM-based speech synthesis utilizing glottal inverse filtering
- Tuomo Raitio, Antti Suni, Junichi Yamagishi, Hannu Pulakka, Jani Nurminen, Martti Vainio, and Paavo Alku, "HMM-based speech synthesis utilizing glottal inverse filtering," IEEE Transactions on Audio, Speech, and Language Processing, vol. 19, no. 1, pp. 153-165, January 2011.
- (2011) IEEE Transactions on Audio, Speech, and Language Processing , vol.19 , Issue.1 , pp. 153-165
- Raitio, T.¹ Suni, A.² Yamagishi, J.³ Pulakka, H.⁴ Nurminen, J.⁵ Vainio, M.⁶ Alku, P.⁷

12
- 84865755765
- The GlottHMM speech synthesis entry for Blizzard Challenge 2010
- Antti Suni, Tuomo Raitio, Martti Vainio, and Paavo Alku, "The GlottHMM speech synthesis entry for Blizzard Challenge 2010," in Blizzard Challenge 2010 Workshop, Kyoto, Japan, September 2010.
- (2010) Blizzard Challenge 2010 Workshop
- Suni, A.¹ Raitio, T.² Vainio, M.³ Alku, P.⁴

13
- 84911869827
- Voice source modelling using deep neural networks for statistical parametric speech synthesis
- Tuomo Raitio, Heng Lu, John Kane, Antti Suni, Martti Vainio, Simon King, and Paavo Alku, "Voice source modelling using deep neural networks for statistical parametric speech synthesis," in 22nd European Signal Processing Conference (EUSIPCO), Lisbon, Portugal, September 2014.
- (2014) 22nd European Signal Processing Conference (EUSIPCO)
- Raitio, T.¹ Lu, H.² Kane, J.³ Suni, A.⁴ Vainio, M.⁵ King, S.⁶ Alku, P.⁷

14
- 84910068090
- Deep neural network based trainable voice source model for synthesis of speech with varying vocal effort
- Tuomo Raitio, Antti Suni, Lauri Juvela, Martti Vainio, and Paavo Alku, "Deep neural network based trainable voice source model for synthesis of speech with varying vocal effort," in Proc. of Interspeech, Singapore, September 2014, pp. 1969-1973.
- (2014) Proc. of Interspeech , pp. 1969-1973
- Raitio, T.¹ Suni, A.² Juvela, L.³ Vainio, M.⁴ Alku, P.⁵

15
- 84890547237
- Synthesis and perception of breathy, normal, and Lombard speech in the presence of noise
- Tuomo Raitio, Antti Suni, Martti Vainio, and Paavo Alku, "Synthesis and perception of breathy, normal, and Lombard speech in the presence of noise," Computer Speech & Language, vol. 28, no. 2, pp. 648-664, March 2014.
- (2014) Computer Speech & Language , vol.28 , Issue.2 , pp. 648-664
- Raitio, T.¹ Suni, A.² Vainio, M.³ Alku, P.⁴

16
- 84942607168
- A deep generative architecture for postfiltering in statistical parametric speech synthesis
- Ling-Hui Chen, Tuomo Raitio, Cassia Valentini-Botinhao, Zhen-Hua Ling, and Junichi Yamagishi, "A deep generative architecture for postfiltering in statistical parametric speech synthesis," Audio, Speech, and Language Processing, IEEE/ACM Transactions on, vol. 23, no. 11, pp. 2003-2014, Nov 2015.
- (2015) Audio, Speech, and Language Processing, IEEE/ACM Transactions on , vol.23 , Issue.11 , pp. 2003-2014
- Chen, L.¹ Raitio, T.² Valentini-Botinhao, C.³ Ling, Z.⁴ Yamagishi, J.⁵

17
- 84898074254
- Quasi closed phase glottal inverse filtering analysis with weighted linear prediction
- Manu Airaksinen, Tuomo Raitio, Brad Story, and Paavo Alku, "Quasi closed phase glottal inverse filtering analysis with weighted linear prediction," Audio, Speech, and Language Processing, IEEE/ACM Transactions on, vol. 22, no. 3, pp. 596-607, March 2014.
- (2014) Audio, Speech, and Language Processing, IEEE/ACM Transactions on , vol.22 , Issue.3 , pp. 596-607
- Airaksinen, M.¹ Raitio, T.² Story, B.³ Alku, P.⁴

18
- 0026881384
- Glottal wave analysis with pitch synchronous iterative adaptive inverse filtering
- Paavo Alku, "Glottal wave analysis with pitch synchronous iterative adaptive inverse filtering," Speech Communication, vol. 11, no. 2-3, pp. 109-118, 1992, Eurospeech '91.
- (1992) Speech Communication , vol.11 , Issue.2-3 , pp. 109-118
- Alku, P.¹

19
- 84890448428
- The GlottHMM entry for Blizzard Challenge 2011: Utilizing source unit selection in HMM-based speech synthesis for improved excitation generation
- Antti Suni, Tuomo Raitio, Martti Vainio, and Paavo Alku, "The GlottHMM entry for Blizzard Challenge 2011: Utilizing source unit selection in HMM-based speech synthesis for improved excitation generation," in Blizzard Challenge 2011 Workshop, Turin, Italy, September 2011.
- (2011) Blizzard Challenge 2011 Workshop
- Suni, A.¹ Raitio, T.² Vainio, M.³ Alku, P.⁴

20
- 84856294347
- Glottal inverse filtering analysis of human voice production - A review of estimation and parameterization methods of the glottal excitation and their applications. (invited article)
- Paavo Alku, "Glottal inverse filtering analysis of human voice production - A review of estimation and parameterization methods of the glottal excitation and their applications (invited article)," Sadhana-Academy Proceedings in Engineering Sciences, vol. 36, no. 5, pp. 623-650, 2011.
- (2011) Sadhana-Academy Proceedings in Engineering Sciences , vol.36 , Issue.5 , pp. 623-650
- Alku, P.¹

21
- 0016495091
- Linear prediction: A tutorial review
- John Makhoul, "Linear prediction: A tutorial review," Proceedings of the IEEE, vol. 63, no. 4, pp. 561-580, Apr 1975.
- (1975) Proceedings of the IEEE , vol.63 , Issue.4 , pp. 561-580
- Makhoul, J.¹

22
- 84882383984
- Formant frequency estimation of high-pitched vowels using weighted linear prediction
- Paavo Alku, Jouni Pohjalainen, Martti Vainio, Anne-Maria Laukkanen, and Brad Story, "Formant frequency estimation of high-pitched vowels using weighted linear prediction," The Journal of the Acoustical Society of America, vol. 134, no. 2, 2013.
- (2013) The Journal of the Acoustical Society of America , vol.134 , Issue.2
- Alku, P.¹ Pohjalainen, J.² Vainio, M.³ Laukkanen, A.⁴ Story, B.⁵

23
- 70450198169
- Glottal closure and opening instant detection from speech signals
- Thomas Drugman and Thierry Dutoit, "Glottal closure and opening instant detection from speech signals," in Proc. of Interspeech, 2009, pp. 2891-2894.
- (2009) Proc. of Interspeech , pp. 2891-2894
- Drugman, T.¹ Dutoit, T.²

24
- 84863419425
- Detection of glottal closure instants from speech signals: A quantitative review
- Thomas Drugman, Mark Thomas, Jon Gudnason, Patrick Naylor, and Thierry Dutoit, "Detection of glottal closure instants from speech signals: A quantitative review," IEEE Transactions on Audio, Speech, and Language Processing, vol. 20, no. 3, pp. 994-1006, March 2012.
- (2012) IEEE Transactions on Audio, Speech, and Language Processing , vol.20 , Issue.3 , pp. 994-1006
- Drugman, T.¹ Thomas, M.² Gudnason, J.³ Naylor, P.⁴ Dutoit, T.⁵

25
- 33646773080
- CMU ARCTIC databases for speech synthesis
- John Kominek and Alan W. Black, "CMU ARCTIC databases for speech synthesis," Tech. Rep., Language Technologies Institute.
- Tech. Rep., Language Technologies Institute
- Kominek, J.¹ Black, A.W.²

26
- 84857819132
- Theano: A CPU and GPU math expression compiler
- James Bergstra, Olivier Breuleux, Frédéric Bastien, Pascal Lamblin, Razvan Pascanu, Guillaume Desjardins, Joseph Turian, David Warde-Farley, and Yoshua Bengio, "Theano: a CPU and GPU math expression compiler," in Proc. of the Python for Scientific Computing Conference (SciPy), June 2010, Oral Presentation.
- (2010) Proc. of the Python for Scientific Computing Conference (SciPy)
- Bergstra, J.¹ Breuleux, O.² Bastien, F.³ Lamblin, P.⁴ Pascanu, R.⁵ Desjardins, G.⁶ Turian, J.⁷ Warde-Farley, D.⁸ Bengio, Y.⁹

27
- 84897544737
- Theano: New features and speed improvements
- Frédéric Bastien, Pascal Lamblin, Razvan Pascanu, James Bergstra, Ian J. Goodfellow, Arnaud Bergeron, Nicolas Bouchard, and Yoshua Bengio, "Theano: new features and speed improvements," Deep Learning and Unsupervised Feature Learning NIPS 2012 Workshop, 2012.
- (2012) Deep Learning and Unsupervised Feature Learning NIPS 2012 Workshop
- Bastien, F.¹ Lamblin, P.² Pascanu, R.³ Bergstra, J.⁴ Goodfellow, I.J.⁵ Bergeron, A.⁶ Bouchard, N.⁷ Bengio, Y.⁸

28
- 85133720638
- The HMM-based speech synthesis system version 2.0
- Heiga Zen, Takashi Nose, Junichi Yamagishi, Shinji Sako, Takashi Masuko, Alan W. Black, and Keiichi Tokuda, "The HMM-based speech synthesis system version 2.0," in Proc. of ISCA SSW6, Bonn, Germany, August 2007, pp. 294-299.
- (2007) Proc. of ISCA SSW6 , pp. 294-299
- Zen, H.¹ Nose, T.² Yamagishi, J.³ Sako, S.⁴ Masuko, T.⁵ Black, A.W.⁶ Tokuda, K.⁷

29
- 84973276298
- Methods for subjective determination of transmission quality
- "Methods for Subjective Determination of Transmission Quality," Recommendation P.800, ITU-T SG12, Geneva, Switzerland, Aug. 1996.
- (1996) Recommendation P.800, ITU-T SG12

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.