Volume , Issue , 2017, Pages

Char2Wav: End-to-end speech synthesis

Author keywords

[No Author keywords available]

Indexed keywords

DECODING; RECURRENT NEURAL NETWORKS; SIGNAL ENCODING; VOCODERS;

EID: 85122685393     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: None     Document Type: Conference Paper
Times cited : (361)

References (43)
  • 3
    • 84965179228
    • Scheduled sampling for sequence prediction with recurrent neural networks
    • Samy Bengio, Oriol Vinyals, Navdeep Jaitly, and Noam Shazeer. Scheduled sampling for sequence prediction with recurrent neural networks. In C. Cortes, N. D. Lawrence, D. D. Lee, M. Sugiyama, and R. Garnett (eds.), Advances in Neural Information Processing Systems 28, pp. 1171-1179. Curran Associates, Inc., 2015.
    • (2015) Advances in Neural Information Processing Systems , vol.28 , pp. 1171-1179
    • Bengio, S.1    Vinyals, O.2    Jaitly, N.3    Shazeer, N.4
  • 4
    • 0142166851
    • A neural probabilistic language model
    • Yoshua Bengio, Réjean Ducharme, Pascal Vincent, and Christian Janvin. A neural probabilistic language model. J. Mach. Learn. Res., 3:1137-1155, March 2003. ISSN 1532-4435.
    • (2003) J. Mach. Learn. Res. , vol.3 , pp. 1137-1155
    • Bengio, Y.1    Ducharme, R.2    Vincent, P.3    Janvin, C.4
  • 7
    • 84965139600
    • Attention-based models for speech recognition
    • Jan K Chorowski, Dzmitry Bahdanau, Dmitriy Serdyuk, Kyunghyun Cho, and Yoshua Bengio. Attention-based models for speech recognition. In C. Cortes, N. D. Lawrence, D. D. Lee, M. Sugiyama, and R. Garnett (eds.), Advances in Neural Information Processing Systems 28, pp. 577-585. Curran Associates, Inc., 2015.
    • (2015) Advances in Neural Information Processing Systems , vol.28 , pp. 577-585
    • Chorowski, J.K.1    Bahdanau, D.2    Serdyuk, D.3    Cho, K.4    Bengio, Y.5
  • 9
    • 85162557101
    • Practical variational inference for neural networks
    • Alex Graves. Practical variational inference for neural networks. In J. Shawe-Taylor, R. S. Zemel, P. Bartlett, F. C. N. Pereira, and K. Q. Weinberger (eds.), Advances in Neural Information Processing Systems 24, pp. 2348-2356. 2011.
    • (2011) Advances in Neural Information Processing Systems , vol.24 , pp. 2348-2356
    • Graves, A.1
  • 12
    • 84919832465
    • Towards end-to-end speech recognition with recurrent neural networks
    • Alex Graves and Navdeep Jaitly. Towards end-to-end speech recognition with recurrent neural networks. In Tony Jebara and Eric P. Xing (eds.), Proceedings of the 31st International Conference on Machine Learning (ICML-14), pp. 1764-1772. JMLR Workshop and Conference Proceedings, 2014.
    • (2014) Proceedings of the 31st International Conference on Machine Learning (ICML-14) , pp. 1764-1772
    • Graves, A.1    Jaitly, N.2
  • 17
    • 84959120024
    • Sequence-to-sequence neural net models for grapheme-to-phoneme conversion
    • Kaisheng Yao and Geoffrey Zweig. Sequence-to-sequence neural net models for grapheme-to-phoneme conversion. ISCA - International Speech Communication Association, May 2015.
    • (2015) ISCA - International Speech Communication Association
    • Yao, K.1    Zweig, G.2
  • 18
    • 84910105608
    • Measuring a decade of progress in text-to-speech
    • Simon King. Measuring a decade of progress in text-to-speech. Loquens, 1(1), 1 2014. ISSN 2386-2637.
    • (2014) Loquens , vol.1 , Issue.1 , pp. 1
    • King, S.1
  • 21
    • 85032750981
    • Deep learning for acoustic modeling in parametric speech generation: A systematic review of existing techniques and future trends
    • Zhen-Hua Ling, Shiyin Kang, Heiga Zen, Andrew Senior, Mike Schuster, Xiao-Jun Qian, Helen Meng, and Li Deng. Deep learning for acoustic modeling in parametric speech generation: A systematic review of existing techniques and future trends. IEEE Signal Processing Magazine, 32:35-52, 2015.
    • (2015) IEEE Signal Processing Magazine , vol.32 , pp. 35-52
    • Ling, Z.-H.1    Kang, S.2    Zen, H.3    Senior, A.4    Schuster, M.5    Qian, X.-J.6    Meng, H.7    Deng, L.8
  • 23
    • 84937959846
    • Recurrent models of visual attention
    • Volodymyr Mnih, Nicolas Heess, Alex Graves, and Koray Kavukcuoglu. Recurrent models of visual attention. In Z. Ghahramani, M. Welling, C. Cortes, N. D. Lawrence, and K. Q. Weinberger (eds.), Advances in Neural Information Processing Systems 27, pp. 2204-2212. Curran Associates, Inc., 2014.
    • (2014) Advances in Neural Information Processing Systems , vol.27 , pp. 2204-2212
    • Mnih, V.1    Heess, N.2    Graves, A.3    Kavukcuoglu, K.4
  • 24
    • 84976902575
    • World: A vocoder-based high-quality speech synthesis system for real-time applications
    • Masanori Morise, Fumiya Yokomori, and Kenji Ozawa. World: A vocoder-based high-quality speech synthesis system for real-time applications. IEICE Transactions on Information and Systems, E99.D(7):1877-1884, 2016.
    • (2016) IEICE Transactions on Information and Systems , vol.E99 , Issue.7 , pp. 1877-1884
    • Morise, M.1    Yokomori, F.2    Ozawa, K.3
  • 27
    • 84928547704
    • Sequence to sequence learning with neural networks
    • Ilya Sutskever, Oriol Vinyals, and Quoc Le. Sequence to sequence learning with neural networks. In Z. Ghahramani, M. Welling, C. Cortes, N. D. Lawrence, and K. Q. Weinberger (eds.), Advances in Neural Information Processing Systems 27, pp. 3104-3112. Curran Associates, Inc., 2014.
    • (2014) Advances in Neural Information Processing Systems , vol.27 , pp. 3104-3112
    • Sutskever, I.1    Vinyals, O.2    Le, Q.3
  • 28
    • 84925160976
    • Text-to-Speech Synthesis
    • Paul Taylor. Text-to-Speech Synthesis. Cambridge University Press, Cambridge, 2009.
    • (2009) Text-to-Speech Synthesis
    • Taylor, P.1
  • 32
    • 84876687945
    • Speech synthesis based on hidden Markov models
    • Keiichi Tokuda, Yoshihiko Nankaku, Tomoki Toda, Heiga Zen, Junichi Yamagishi, and Keiichiro Oura. Speech synthesis based on hidden Markov models. Proceedings of the IEEE, 101(5):1234-1252, May 2013. ISSN 0018-9219.
    • (2013) Proceedings of the IEEE , vol.101 , Issue.5 , pp. 1234-1252
    • Tokuda, K.1    Nankaku, Y.2    Toda, T.3    Zen, H.4    Yamagishi, J.5    Oura, K.6
  • 35
    • 84959112868
    • A study of speaker adaptation for DNN-based speech synthesis
    • Zhizheng Wu, Pawel Swietojanski, Christophe Veaux, Steve Renals, and Simon King. A study of speaker adaptation for DNN-based speech synthesis. In INTERSPEECH, pp. 879-883. ISCA, 2015.
    • (2015) INTERSPEECH , pp. 879-883
    • Wu, Z.1    Swietojanski, P.2    Veaux, C.3    Renals, S.4    King, S.5
  • 40
    • 84973282956
    • Acoustic modeling in statistical parametric speech synthesis - From HMM to LSTM-RNN
    • Heiga Zen. Acoustic modeling in statistical parametric speech synthesis - from HMM to LSTM-RNN. In Proc. MLSLP, 2015. Invited paper.
    • (2015) Proc. MLSLP
    • Zen, H.1
  • 41
    • 67651002140
    • Statistical parametric speech synthesis
    • Heiga Zen, Keiichi Tokuda, and Alan W Black. Statistical parametric speech synthesis. Speech Communication, 51(11):1039-1064, 2009.
    • (2009) Speech Communication , vol.51 , Issue.11 , pp. 1039-1064
    • Zen, H.1    Tokuda, K.2    Black, A.W.3
  • 43
    • 84994314564
    • Fast, compact, and high quality LSTM-RNN based statistical parametric speech synthesizers for mobile devices
    • Heiga Zen, Yannis Agiomyrgiannakis, Niels Egberts, Fergus Henderson, and Przemysław Szczepaniak. Fast, compact, and high quality LSTM-RNN based statistical parametric speech synthesizers for mobile devices. In Proc. Interspeech, San Francisco, CA, USA, 2016.
    • (2016) Proc. Interspeech
    • Zen, H.1    Agiomyrgiannakis, Y.2    Egberts, N.3    Henderson, F.4    Szczepaniak, P.5


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.