Volume 2016-May, 2016, Pages 5600-5604

Trajectory training considering global variance for speech synthesis based on neural networks

Author keywords

global variance; neural network; speech synthesis; statistical model; trajectory model

EID: 84973375140     PISSN: 15206149     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1109/ICASSP.2016.7472749     Document Type: Conference Paper
Times cited: 28

References (21)
  • 1
    • 67651002140
    • Statistical parametric speech synthesis
    • H. Zen, K. Tokuda, and A. W. Black, "Statistical parametric speech synthesis," Speech Communication, vol. 51, no. 11, pp. 1039-1064, 2009
    • (2009) Speech Communication, vol. 51, no. 11, pp. 1039-1064
    • Zen, H.; Tokuda, K.; Black, A.W.
  • 7
    • 84890490547
    • Statistical parametric speech synthesis using deep neural networks
    • H. Zen, A. Senior, and M. Schuster, "Statistical parametric speech synthesis using deep neural networks," Proceedings of ICASSP 2013, pp. 7962-7966, 2013
    • (2013) Proceedings of ICASSP 2013, pp. 7962-7966
    • Zen, H.; Senior, A.; Schuster, M.
  • 8
    • 84929157442
    • Combining a vector space representation of linguistic context with a deep neural network for text-to-speech synthesis
    • H. Lu, S. King, and O. Watts, "Combining a vector space representation of linguistic context with a deep neural network for text-to-speech synthesis," Proceedings of ISCA SSW8, pp. 281-285, 2013
    • (2013) Proceedings of ISCA SSW8, pp. 281-285
    • Lu, H.; King, S.; Watts, O.
  • 9
    • 84905251808
    • On the training aspects of deep neural network (DNN) for parametric TTS synthesis
    • Y. Qian, Y. Fan, H. Wenping, and F. K. Soong, "On the training aspects of deep neural network (DNN) for parametric TTS synthesis," Proceedings of ICASSP 2014, pp. 3857-3861, 2014
    • (2014) Proceedings of ICASSP 2014, pp. 3857-3861
    • Qian, Y.; Fan, Y.; Wenping, H.; Soong, F.K.
  • 10
    • 38549096029
    • A speech parameter generation algorithm considering global variance for HMM-based speech synthesis
    • T. Toda and K. Tokuda, "A speech parameter generation algorithm considering global variance for HMM-based speech synthesis," IEICE Transactions on Information & Systems, vol. E90-D, no. 5, pp. 816-824, 2007
    • (2007) IEICE Transactions on Information & Systems, vol. E90-D, no. 5, pp. 816-824
    • Toda, T.; Tokuda, K.
  • 11
    • 84890495160
    • Fast, low-artifact speech synthesis considering global variance
    • M. Shannon and W. Byrne, "Fast, low-artifact speech synthesis considering global variance," Proceedings of ICASSP 2013, pp. 7869-7873, 2013
    • (2013) Proceedings of ICASSP 2013, pp. 7869-7873
    • Shannon, M.; Byrne, W.
  • 12
    • 67650826181
    • Trajectory training considering global variance for HMM-based speech synthesis
    • T. Toda and S. Young, "Trajectory training considering global variance for HMM-based speech synthesis," Proceedings of ICASSP 2009, pp. 4025-4028, 2009
    • (2009) Proceedings of ICASSP 2009, pp. 4025-4028
    • Toda, T.; Young, S.
  • 13
    • 84946074523
    • The effect of neural networks in statistical parametric speech synthesis
    • K. Hashimoto, K. Oura, Y. Nankaku, and K. Tokuda, "The effect of neural networks in statistical parametric speech synthesis," Proceedings of ICASSP 2015, pp. 4455-4459, 2015
    • (2015) Proceedings of ICASSP 2015, pp. 4455-4459
    • Hashimoto, K.; Oura, K.; Nankaku, Y.; Tokuda, K.
  • 14
    • 33749573927
    • Reformulating the HMM as a trajectory model by imposing explicit relationships between static and dynamic features
    • H. Zen, K. Tokuda, and T. Kitamura, "Reformulating the HMM as a trajectory model by imposing explicit relationships between static and dynamic features," Computer Speech and Language, vol. 21, no. 1, pp. 153-173, 2007
    • (2007) Computer Speech and Language, vol. 21, no. 1, pp. 153-173
    • Zen, H.; Tokuda, K.; Kitamura, T.
  • 15
    • 84959135757
    • Minimum trajectory error training for deep neural networks, combined with stacked bottleneck features
    • Z. Wu and S. King, "Minimum trajectory error training for deep neural networks, combined with stacked bottleneck features," Proceedings of Interspeech 2015, pp. 309-313, 2015
    • (2015) Proceedings of Interspeech 2015, pp. 309-313
    • Wu, Z.; King, S.
  • 16
    • 84959172579
    • Sequence generation error (SGE) minimization based deep neural networks training for text-to-speech synthesis
    • Y. Fan, Y. Qian, F. K. Soong, and L. He, "Sequence generation error (SGE) minimization based deep neural networks training for text-to-speech synthesis," Proceedings of Interspeech 2015, pp. 864-868, 2015
    • (2015) Proceedings of Interspeech 2015, pp. 864-868
    • Fan, Y.; Qian, Y.; Soong, F.K.; He, L.
  • 17
    • 84910087395
    • Sequence error SE minimization training of neural network for voice conversion
    • F. L. Xie, Y. Qian, Y. Fan, F. K. Soong, and H. Li, "Sequence error SE minimization training of neural network for voice conversion," Proceedings of Interspeech 2014, pp. 2283-2287, 2014
    • (2014) Proceedings of Interspeech 2014, pp. 2283-2287
    • Xie, F.L.; Qian, Y.; Fan, Y.; Soong, F.K.; Li, H.
  • 19
    • 0032673049
    • Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds
    • H. Kawahara, I. Masuda-Katsuse, and A. Cheveigne, "Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds," Speech Communication, vol. 27, pp. 187-207, 1999
    • (1999) Speech Communication, vol. 27, pp. 187-207
    • Kawahara, H.; Masuda-Katsuse, I.; Cheveigne, A.
  • 20
    • 85135145174
    • Acoustic modeling based on the MDL criterion for speech recognition
    • K. Shinoda and T. Watanabe, "Acoustic modeling based on the MDL criterion for speech recognition," Proceedings of Eurospeech 1997, pp. 99-102, 1997
    • (1997) Proceedings of Eurospeech 1997, pp. 99-102
    • Shinoda, K.; Watanabe, T.
  • 21
    • 84905262874
    • Deep mixture density networks for acoustic modeling in statistical parametric speech synthesis
    • H. Zen and A. Senior, "Deep mixture density networks for acoustic modeling in statistical parametric speech synthesis," Proceedings of ICASSP 2014, pp. 3872-3876, 2014
    • (2014) Proceedings of ICASSP 2014, pp. 3872-3876
    • Zen, H.; Senior, A.


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.