



Volume , Issue , 2017, Pages 4895-4899

An autoregressive recurrent mixture density network for parametric speech synthesis

Author keywords

Autoregressive model; Mixture density network; Recurrent neural network; Speech synthesis



EID: 85023745327     PISSN: 15206149     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1109/ICASSP.2017.7953087     Document Type: Conference Paper
Times cited: 68

References (27)
  • 1
  • 2
    • 84890490547
    • Statistical parametric speech synthesis using deep neural networks
    • Heiga Zen, Andrew Senior, and Mike Schuster, "Statistical parametric speech synthesis using deep neural networks," in Proc. ICASSP, 2013, pp. 7962-7966.
    • (2013) Proc. ICASSP, pp. 7962-7966
    • Zen, H.; Senior, A.; Schuster, M.
  • 3
    • 84910047819
    • TTS synthesis with bidirectional LSTM based recurrent neural networks
    • Yuchen Fan, Yao Qian, Feilong Xie, and Frank K. Soong, "TTS synthesis with bidirectional LSTM based recurrent neural networks," in Proc. INTERSPEECH, 2014, pp. 1964-1968.
    • (2014) Proc. INTERSPEECH, pp. 1964-1968
    • Fan, Y.; Qian, Y.; Xie, F.; Soong, F.K.
  • 4
    • 84901237776
    • Modeling spectral envelopes using restricted Boltzmann machines and deep belief networks for statistical parametric speech synthesis
    • Zhen-Hua Ling, Li Deng, and Dong Yu, "Modeling spectral envelopes using restricted Boltzmann machines and deep belief networks for statistical parametric speech synthesis," IEEE Transactions on Audio, Speech, and Language Processing, vol. 21, no. 10, pp. 2129-2139, 2013.
    • (2013) IEEE Transactions on Audio, Speech, and Language Processing, vol. 21, no. 10, pp. 2129-2139
    • Ling, Z.-H.; Deng, L.; Yu, D.
  • 5
    • 84973309345
    • A deep auto-encoder based low-dimensional feature extraction from FFT spectral envelopes for statistical parametric speech synthesis
    • Shinji Takaki and Junichi Yamagishi, "A deep auto-encoder based low-dimensional feature extraction from FFT spectral envelopes for statistical parametric speech synthesis," in Proc. ICASSP, 2016, pp. 5535-5539.
    • (2016) Proc. ICASSP, pp. 5535-5539
    • Takaki, S.; Yamagishi, J.
  • 6
  • 9
    • 84867625378
    • Autoregressive HMM speech synthesis
    • Carl Quillen, "Autoregressive HMM speech synthesis," in Proc. ICASSP, 2012, pp. 4021-4024.
    • (2012) Proc. ICASSP, pp. 4021-4024
    • Quillen, C.
  • 10
    • 38549096029
    • A speech parameter generation algorithm considering global variance for HMM-based speech synthesis
    • Tomoki Toda and Keiichi Tokuda, "A speech parameter generation algorithm considering global variance for HMM-based speech synthesis," IEICE Transactions on Information and Systems, vol. 90, no. 5, pp. 816-824, 2007.
    • (2007) IEICE Transactions on Information and Systems, vol. 90, no. 5, pp. 816-824
    • Toda, T.; Tokuda, K.
  • 12
    • 84905262874
    • Deep mixture density networks for acoustic modeling in statistical parametric speech synthesis
    • Heiga Zen and Andrew Senior, "Deep mixture density networks for acoustic modeling in statistical parametric speech synthesis," in Proc. ICASSP, 2014, pp. 3844-3848.
    • (2014) Proc. ICASSP, pp. 3844-3848
    • Zen, H.; Senior, A.
  • 13
    • 84898948282
    • Better generative models for sequential data problems: Bidirectional recurrent mixture density networks
    • Mike Schuster, "Better generative models for sequential data problems: Bidirectional recurrent mixture density networks," in Proc. NIPS, 1999, pp. 589-595.
    • (1999) Proc. NIPS, pp. 589-595
    • Schuster, M.
  • 14
    • 33749573927
    • Reformulating the HMM as a trajectory model by imposing explicit relationships between static and dynamic feature vector sequences
    • Heiga Zen, Keiichi Tokuda, and Tadashi Kitamura, "Reformulating the HMM as a trajectory model by imposing explicit relationships between static and dynamic feature vector sequences," Computer Speech & Language, vol. 21, no. 1, pp. 153-173, 2007.
    • (2007) Computer Speech & Language, vol. 21, no. 1, pp. 153-173
    • Zen, H.; Tokuda, K.; Kitamura, T.
  • 15
    • 84973375140
    • Trajectory training considering global variance for speech synthesis based on neural networks
    • Kei Hashimoto, Keiichiro Oura, Yoshihiko Nankaku, and Keiichi Tokuda, "Trajectory training considering global variance for speech synthesis based on neural networks," in Proc. ICASSP, 2016, pp. 5600-5604.
    • (2016) Proc. ICASSP, pp. 5600-5604
    • Hashimoto, K.; Oura, K.; Nankaku, Y.; Tokuda, K.
  • 16
    • 79959847165
    • A formulation of the autoregressive HMM for speech synthesis
    • CUED/F-INFENG/TR.629
    • Matt Shannon and William Byrne, "A formulation of the autoregressive HMM for speech synthesis," Tech. Rep., University of Cambridge, CUED/F-INFENG/TR.629, 2009.
    • (2009) Tech. Rep., University of Cambridge
    • Shannon, M.; Byrne, W.
  • 17
    • 84946045510
    • Unidirectional long short-term memory recurrent neural network with recurrent output layer for low-latency speech synthesis
    • Heiga Zen and Haşim Sak, "Unidirectional long short-term memory recurrent neural network with recurrent output layer for low-latency speech synthesis," in Proc. ICASSP, 2015, pp. 4470-4474.
    • (2015) Proc. ICASSP, pp. 4470-4474
    • Zen, H.; Sak, H.
  • 19
    • 0032673049
    • Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds
    • Hideki Kawahara, Ikuyo Masuda-Katsuse, and Alain de Cheveigné, "Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds," Speech Communication, vol. 27, pp. 187-207, 1999.
    • (1999) Speech Communication, vol. 27, pp. 187-207
    • Kawahara, H.; Masuda-Katsuse, I.; de Cheveigné, A.
  • 21
    • 84930639546
    • Introducing CURRENNT: The Munich open-source CUDA recurrent neural network toolkit
    • Felix Weninger, Johannes Bergmann, and Björn Schuller, "Introducing CURRENNT: The Munich open-source CUDA recurrent neural network toolkit," The Journal of Machine Learning Research, vol. 16, no. 1, pp. 547-551, 2015.
    • (2015) The Journal of Machine Learning Research, vol. 16, no. 1, pp. 547-551
    • Weninger, F.; Bergmann, J.; Schuller, B.
  • 22
    • 0015360527
    • Digital inverse filtering: A new tool for formant trajectory estimation
    • John D. Markel, "Digital inverse filtering: A new tool for formant trajectory estimation," IEEE Transactions on Audio and Electroacoustics, vol. 20, no. 2, pp. 129-137, Jun 1972.
    • (1972) IEEE Transactions on Audio and Electroacoustics, vol. 20, no. 2, pp. 129-137
    • Markel, J.D.
  • 26
    • 84994314564
    • Fast, compact, and high quality LSTM-RNN based statistical parametric speech synthesizers for mobile devices
    • Heiga Zen, Yannis Agiomyrgiannakis, Niels Egberts, Fergus Henderson, and Przemyslaw Szczepaniak, "Fast, compact, and high quality LSTM-RNN based statistical parametric speech synthesizers for mobile devices," in Proc. INTERSPEECH, 2016, pp. 2273-2277.
    • (2016) Proc. INTERSPEECH, pp. 2273-2277
    • Zen, H.; Agiomyrgiannakis, Y.; Egberts, N.; Henderson, F.; Szczepaniak, P.
  • 27
    • 84905234422
    • A postfilter to modify the modulation spectrum in HMM-based speech synthesis
    • Shinnosuke Takamichi, Tomoki Toda, Graham Neubig, Sakriani Sakti, and Satoshi Nakamura, "A postfilter to modify the modulation spectrum in HMM-based speech synthesis," in Proc. ICASSP, 2014, pp. 290-294.
    • (2014) Proc. ICASSP, pp. 290-294
    • Takamichi, S.; Toda, T.; Neubig, G.; Sakti, S.; Nakamura, S.


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.