Volume 2015-August, 2015, Pages 4460-4464

Deep neural networks employing Multi-Task Learning and stacked bottleneck features for speech synthesis

Author keywords

acoustic model; bottleneck feature; deep neural network; multi task learning; Speech synthesis

Indexed keywords

AUDIO SIGNAL PROCESSING; COMPLEX NETWORKS; LINGUISTICS; MAPPING; SPEECH COMMUNICATION; SPEECH SYNTHESIS;

EID: 84946033275     PISSN: 15206149     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1109/ICASSP.2015.7178814     Document Type: Conference Paper
Times cited : (264)

References (25)
  • 1
    • 67651002140
    • Statistical parametric speech synthesis
    • Heiga Zen, Keiichi Tokuda, and Alan W Black, Statistical parametric speech synthesis, Speech Communication, vol. 51, no. 11, pp. 1039-1064, 2009
    • (2009) Speech Communication , vol.51 , Issue.11 , pp. 1039-1064
    • Zen, H.1    Tokuda, K.2    Black, A.W.3
  • 2
    • 84910105608
    • Measuring a decade of progress in text-to-speech
    • Simon King, Measuring a decade of progress in text-to-speech, Loquens, vol. 1, no. 1, 2014
    • (2014) Loquens , vol.1 , Issue.1
    • King, S.1
  • 5
    • 38549096029
    • A speech parameter generation algorithm considering global variance for HMM-based speech synthesis
    • Tomoki Toda and Keiichi Tokuda, A speech parameter generation algorithm considering global variance for HMM-based speech synthesis, IEICE Transactions on Information and Systems, vol. 90, no. 5, pp. 816-824, 2007
    • (2007) IEICE Transactions on Information and Systems , vol.90 , Issue.5 , pp. 816-824
    • Toda, T.1    Tokuda, K.2
  • 6
    • 33749573927
    • Reformulating the HMM as a trajectory model by imposing explicit relationships between static and dynamic feature vector sequences
    • Heiga Zen, Keiichi Tokuda, and Tadashi Kitamura, Reformulating the HMM as a trajectory model by imposing explicit relationships between static and dynamic feature vector sequences, Computer Speech & Language, vol. 21, no. 1, pp. 153-173, 2007
    • (2007) Computer Speech & Language , vol.21 , Issue.1 , pp. 153-173
    • Zen, H.1    Tokuda, K.2    Kitamura, T.3
  • 9
    • 84901237776
    • Modeling spectral envelopes using Restricted Boltzmann Machines and Deep Belief Networks for statistical parametric speech synthesis
    • Zhen-Hua Ling, Li Deng, and Dong Yu, Modeling spectral envelopes using Restricted Boltzmann Machines and Deep Belief Networks for statistical parametric speech synthesis, IEEE Transactions on Audio, Speech, and Language Processing, vol. 21, no. 10, pp. 2129-2139, 2013
    • (2013) IEEE Transactions on Audio, Speech, and Language Processing , vol.21 , Issue.10 , pp. 2129-2139
    • Ling, Z.-H.1    Deng, L.2    Yu, D.3
  • 10
    • 84929157442
    • Combining a vector space representation of linguistic context with a deep neural network for text-to-speech synthesis
    • Heng Lu, Simon King, and Oliver Watts, Combining a vector space representation of linguistic context with a deep neural network for text-to-speech synthesis, Proc. the 8th ISCA Speech Synthesis Workshop (SSW), 2013
    • (2013) Proc the 8th ISCA Speech Synthesis Workshop (SSW)
    • Lu, H.1    King, S.2    Watts, O.3
  • 14
    • 84910047819
    • TTS synthesis with bidirectional LSTM based recurrent neural networks
    • Yuchen Fan, Yao Qian, Fenglong Xie, and Frank K. Soong, TTS synthesis with bidirectional LSTM based recurrent neural networks, in Proc. Interspeech, 2014
    • (2014) Proc. Interspeech
    • Fan, Y.1    Qian, Y.2    Xie, F.3    Soong, F.K.4
  • 16
    • 56449095373
    • A unified architecture for natural language processing: Deep neural networks with multitask learning
    • Ronan Collobert and Jason Weston, A unified architecture for natural language processing: Deep neural networks with multitask learning, in Proc. IEEE Int. Conf. on Machine Learning (ICML), 2008
    • (2008) Proc. IEEE Int. Conf. on Machine Learning (ICML)
    • Collobert, R.1    Weston, J.2
  • 18
    • 84865785753
    • Improved bottleneck features using pretrained deep neural networks
    • Dong Yu and Michael L Seltzer, Improved bottleneck features using pretrained deep neural networks, in Proc. Interspeech, 2011
    • (2011) Proc. Interspeech
    • Yu, D.1    Seltzer, M.L.2
  • 21
    • 0032673049
    • Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds
    • Hideki Kawahara, Ikuyo Masuda-Katsuse, and Alain de Cheveigné, Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds, Speech Communication, vol. 27, no. 3, pp. 187-207, 1999
    • (1999) Speech Communication , vol.27 , Issue.3 , pp. 187-207
    • Kawahara, H.1    Masuda-Katsuse, I.2    de Cheveigné, A.3
  • 23
    • 0345443172
    • Glimpsing speech
    • Martin Cooke, Glimpsing speech, Journal of Phonetics, vol. 31, pp. 579-584, 2003
    • (2003) Journal of Phonetics , vol.31 , pp. 579-584
    • Cooke, M.1


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.