SCOPUS Information Retrieval Platform

Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH

Volume , Issue , 2014, Pages 2268-2272

Prosody contour prediction with long short-term memory, bi-directional, deep recurrent neural networks

(4) Fernandez, Raul a; Rendel, Asaf b; Ramabhadran, Bhuvana a; Hoory, Ron b

a IBM T J WATSON RESEARCH CENTER (United States)

b IBM HAIFA RESEARCH LAB (Israel)

Author keywords

Deep learning; Prosody prediction; Recurrent neural networks; Speech synthesis; Text to speech

Indexed keywords

FORECASTING; MEAN SQUARE ERROR; SPEECH COMMUNICATION; SPEECH SYNTHESIS;

CONTEXTUAL FACTORS; DEEP LEARNING; DEEP NEURAL NETWORKS; LONG SHORT-TERM MEMORY; PROSODY PREDICTIONS; RELATIVE REDUCTION; STATE-OF-THE-ART PERFORMANCE; TEXT TO SPEECH;

RECURRENT NEURAL NETWORKS;

EID: 84910068142 PISSN: 2308457X EISSN: 19909772 Source Type: Conference Proceeding
DOI: None Document Type: Conference Paper

Times cited : (89)

References (16)

1
- 0033708106
- Speech parameter generation algorithms for HMM-based speech synthesis
- K. Tokuda, T. Yoshimura, T. Masuko, T. Kobayashi, and T. Kitamura, "Speech parameter generation algorithms for HMM-based speech synthesis," in ICASSP, 2000, pp. 1315-1318.
- (2000) ICASSP , pp. 1315-1318
- Tokuda, K.¹ Yoshimura, T.² Masuko, T.³ Kobayashi, T.⁴ Kitamura, T.⁵

2
- 84890522099
- F0 contour prediction with a deep belief network-Gaussian process hybrid model
- R. Fernandez, A. Rendel, B. Ramabhadran, and R. Hoory, "F0 contour prediction with a Deep Belief Network-Gaussian Process hybrid model," in ICASSP, 2013, pp. 6885-6889.
- (2013) ICASSP , pp. 6885-6889
- Fernandez, R.¹ Rendel, A.² Ramabhadran, B.³ Hoory, R.⁴

3
- 84890490547
- Statistical parametric speech synthesis using deep neural networks
- H. Zen, A. Senior, and M. Schuster, "Statistical parametric speech synthesis using Deep Neural Networks," in ICASSP, 2013, pp. 7962-7966.
- (2013) ICASSP , pp. 7962-7966
- Zen, H.¹ Senior, A.² Schuster, M.³

4
- 84901237776
- Modeling spectral envelopes using restricted Boltzmann machines and deep belief networks for statistical parametric speech synthesis
- Z.-H. Ling, L. Deng, and D. Yu, "Modeling spectral envelopes using Restricted Boltzmann Machines and Deep Belief Networks for statistical parametric speech synthesis," IEEE Trans. Audio, Speech, and Lang. Proc., vol. 21, no. 10, pp. 2129-2139, 2013.
- (2013) IEEE Trans. Audio, Speech, and Lang. Proc. , vol.21 , Issue.10 , pp. 2129-2139
- Ling, Z.-H.¹ Deng, L.² Yu, D.³

5
- 84890527090
- Multi-distribution deep belief networks for speech synthesis
- S. Kang, X. Qian, and H. Meng, "Multi-distribution Deep Belief Networks for speech synthesis," in ICASSP, 2013, pp. 8012-8016.
- (2013) ICASSP , pp. 8012-8016
- Kang, S.¹ Qian, X.² Meng, H.³

6
- 84890545600
- Multi-task learning in deep neural networks for improved phoneme recognition
- M. L. Seltzer and J. Droppo, "Multi-task learning in Deep Neural Networks for improved phoneme recognition," in Proc. ICASSP, 2013, pp. 6965-6969.
- (2013) Proc. ICASSP , pp. 6965-6969
- Seltzer, M.L.¹ Droppo, J.²

7
- 71249112130
- Offline handwriting recognition with multidimensional recurrent neural networks
- A. Graves and J. Schmidhuber, "Offline handwriting recognition with multidimensional Recurrent Neural Networks," in NIPS, 2009.
- (2009) NIPS
- Graves, A.¹ Schmidhuber, J.²

8
- 84890543083
- Speech recognition with deep recurrent neural networks
- A. Graves, A.-r. Mohamed, and G. Hinton, "Speech recognition with Deep Recurrent Neural Networks," in ICASSP, 2013, pp. 6645-6649.
- (2013) ICASSP , pp. 6645-6649
- Graves, A.¹ Mohamed, A.-R.² Hinton, G.³

9
- 56449118171
- Phrase-level phonology in speech production planning: Evidence for the role of prosodic structure
- S. Shattuck-Hufnagel, "Phrase-level phonology in speech production planning: Evidence for the role of prosodic structure," in Prosody: Theory and Experiment. Studies Presented to Gösta Bruce, ser. Text, Speech and Language Technology, G. Bruce and M. Horne, Eds. Netherlands: Springer, 2000, vol. 14, pp. 201-229.
- (2000) Prosody: Theory and Experiment. Studies Presented to Gösta Bruce, Ser. Text, Speech and Language Technology , vol.14 , pp. 201-229
- Shattuck-Hufnagel, S.¹

10
- 0031573117
- Long short-term memory
- S. Hochreiter and J. Schmidhuber, "Long short-term memory," Neural Computation, vol. 9, no. 8, pp. 1735-1780, 1997.
- (1997) Neural Computation , vol.9 , Issue.8 , pp. 1735-1780
- Hochreiter, S.¹ Schmidhuber, J.²

11
- 0034293152
- Learning to forget: Continual prediction with LSTM
- F. A. Gers, J. Schmidhuber, and F. Cummins, "Learning to forget: Continual prediction with LSTM," Neural Computation, vol. 12, no. 10, pp. 2451-2471, 2000.
- (2000) Neural Computation , vol.12 , Issue.10 , pp. 2451-2471
- Gers, F.A.¹ Schmidhuber, J.² Cummins, F.³

12
- 0041965934
- Learning precise timing with LSTM recurrent networks
- F. A. Gers, N. N. Schraudolph, and J. Schmidhuber, "Learning precise timing with LSTM Recurrent Networks," J. of Machine Learning Research, vol. 3, pp. 115-143, 2002.
- (2002) J. of Machine Learning Research , vol.3 , pp. 115-143
- Gers, F.A.¹ Schraudolph, N.N.² Schmidhuber, J.³

13
- 84943274699
- A direct adaptive method for faster back propagation learning: The RPROP algorithm
- M. Riedmiller and H. Braun, "A direct adaptive method for faster back propagation learning: The RPROP algorithm," in Proc. IEEE Intnl. Conf. on Neural Networks, 1993, pp. 586-591.
- (1993) Proc. IEEE Intnl. Conf. on Neural Networks , pp. 586-591
- Riedmiller, M.¹ Braun, H.²

14
- 84883877323
- RNNLIB: A Recurrent Neural Network Library for Sequence Learning Problems
- A. Graves, "RNNLIB: A Recurrent Neural Network library for sequence learning problems," http://sourceforge.net/projects/rnnl/.
- Graves, A.¹

15
- 33745200051
- Speech parameter generation algorithm considering global variance for HMM-based speech synthesis
- T. Toda and K. Tokuda, "Speech parameter generation algorithm considering global variance for HMM-based speech synthesis," in Interspeech, 2005, pp. 2801-2804.
- (2005) Interspeech , pp. 2801-2804
- Toda, T.¹ Tokuda, K.²

16
- 80051607565
- CROWDMOS: An approach for crowd sourcing mean opinion score studies
- F. Ribeiro, D. Florêncio, C. Zhang, and M. Seltzer, "CROWDMOS: An approach for crowd sourcing Mean Opinion Score studies," in ICASSP, 2011, pp. 2416-2419.
- (2011) ICASSP , pp. 2416-2419
- Ribeiro, F.¹ Florêncio, D.² Zhang, C.³ Seltzer, M.⁴

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.