SCOPUS Information Retrieval Platform

ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings

Volume , Issue , 2013, Pages 7825-7829

Modeling spectral envelopes using restricted Boltzmann machines for statistical parametric speech synthesis

Authors (3): Ling, Zhen Hua (a, b); Deng, Li (c); Yu, Dong (c)

a National Engineering Laboratory for Speech and Language Information Processing (China)

b UNIVERSITY OF WASHINGTON (United States)

c MICROSOFT RESEARCH (United States)

Author keywords

hidden Markov model; restricted Boltzmann machine; spectral envelope; speech synthesis

Indexed keywords

CONVENTIONAL METHODS; GAUSSIAN MIXTURE MODEL; GENERALIZATION ABILITY; HMM-BASED SPEECH SYNTHESIS; JOINT DISTRIBUTIONS; RESTRICTED BOLTZMANN MACHINE; SPECTRAL ENVELOPES; STATISTICAL PARAMETRIC SPEECH SYNTHESIS;

HIDDEN MARKOV MODELS; SIGNAL PROCESSING;

SPEECH SYNTHESIS;

EID: 84890447002 PISSN: 15206149 EISSN: None Source Type: Conference Proceeding
DOI: 10.1109/ICASSP.2013.6639187 Document Type: Conference Paper

Times cited: 43

References (15)

1
- 85009139544
- Simultaneous modeling of spectrum, pitch and duration in HMM-based speech synthesis
- T. Yoshimura, K. Tokuda, T. Masuko, T. Kobayashi, and T. Kitamura, "Simultaneous modeling of spectrum, pitch and duration in HMM-based speech synthesis," in Eurospeech, 1999, pp. 2347-2350.
- (1999) Eurospeech , pp. 2347-2350
- Yoshimura, T.¹ Tokuda, K.² Masuko, T.³ Kobayashi, T.⁴ Kitamura, T.⁵

2
- 0033708106
- Speech parameter generation algorithms for HMM-based speech synthesis
- K. Tokuda, T. Yoshimura, T. Masuko, T. Kobayashi, and T. Kitamura, "Speech parameter generation algorithms for HMM-based speech synthesis," in ICASSP, 2000, vol. 3, pp. 1315-1318.
- (2000) ICASSP , vol.3 , pp. 1315-1318
- Tokuda, K.¹ Yoshimura, T.² Masuko, T.³ Kobayashi, T.⁴ Kitamura, T.⁵

3
- 0032673049
- Restructuring speech representations using pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds
- H. Kawahara, I. Masuda-Katsuse, and A. de Cheveigne, "Restructuring speech representations using pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: possible role of a repetitive structure in sounds," Speech Communication, vol. 27, pp. 187-207, 1999.
- (1999) Speech Communication , vol.27 , pp. 187-207
- Kawahara, H.¹ Masuda-Katsuse, I.² De Cheveigne, A.³

4
- 33846405723
- Details of Nitech HMM-based speech synthesis system for the Blizzard Challenge 2005
- H. Zen, T. Toda, M. Nakamura, and K. Tokuda, "Details of Nitech HMM-based speech synthesis system for the Blizzard Challenge 2005," IEICE Trans. Inf. & Syst., vol. E90-D, no. 1, pp. 325-333, 2007.
- (2007) IEICE Trans. Inf. & Syst. , vol.E90-D , Issue.1 , pp. 325-333
- Zen, H.¹ Toda, T.² Nakamura, M.³ Tokuda, K.⁴

5
- 34547496747
- USTC system for Blizzard Challenge 2006: An improved HMM-based speech synthesis method
- Z.-H. Ling, Y.-J. Wu, Y.-P. Wang, L. Qin, and R.-H. Wang, "USTC system for Blizzard Challenge 2006: an improved HMM-based speech synthesis method," in Blizzard Challenge Workshop, 2006.
- (2006) Blizzard Challenge Workshop
- Ling, Z.-H.¹ Wu, Y.-J.² Wang, Y.-P.³ Qin, L.⁴ Wang, R.-H.⁵

6
- 0000329993
- Information processing in dynamical systems: Foundations of harmony theory
- D.E. Rumelhart and J.L. McClelland, Eds., chapter 6, MIT Press
- P. Smolensky, "Information processing in dynamical systems: Foundations of harmony theory," in Parallel Distributed Processing, D.E. Rumelhart and J.L. McClelland, Eds., vol. 1, chapter 6, pp. 194-281. MIT Press, 1986.
- (1986) Parallel Distributed Processing , vol.1 , pp. 194-281
- Smolensky, P.¹

7
- 78651276374
- Ph.D. thesis, University of Toronto
- R. Salakhutdinov, Learning deep generative models, Ph.D. thesis, University of Toronto, 2009.
- (2009) Learning Deep Generative Models
- Salakhutdinov, R.¹

8
- 0013344078
- Training products of experts by minimizing contrastive divergence
- G.E. Hinton, "Training products of experts by minimizing contrastive divergence," Neural Computation, vol. 14, no. 8, pp. 1771-1800, 2002.
- (2002) Neural Computation , vol.14 , Issue.8 , pp. 1771-1800
- Hinton, G.E.¹

9
- 33746600649
- Reducing the dimensionality of data with neural networks
- G.E. Hinton and R. Salakhutdinov, "Reducing the dimensionality of data with neural networks," Science, vol. 313, no. 5786, pp. 504-507, 2006.
- (2006) Science , vol.313 , Issue.5786 , pp. 504-507
- Hinton, G.E.¹ Salakhutdinov, R.²

10
- 84055211743
- Acoustic modeling using deep belief networks
- A. Mohamed, G.E. Dahl, and G.E. Hinton, "Acoustic modeling using deep belief networks," IEEE Trans. Speech Audio Process., vol. 20, no. 1, pp. 14-22, 2012.
- (2012) IEEE Trans. Speech Audio Process. , vol.20 , Issue.1 , pp. 14-22
- Mohamed, A.¹ Dahl, G.E.² Hinton, G.E.³

11
- 84055222005
- Context-dependent pre-trained deep neural networks for large-vocabulary speech recognition
- G.E. Dahl, D. Yu, L. Deng, and A. Acero, "Context-dependent pre-trained deep neural networks for large-vocabulary speech recognition," IEEE Trans. Speech Audio Process., vol. 20, no. 1, pp. 30-42, 2012.
- (2012) IEEE Trans. Speech Audio Process. , vol.20 , Issue.1 , pp. 30-42
- Dahl, G.E.¹ Yu, D.² Deng, L.³ Acero, A.⁴

12
- 85032751458
- Deep neural networks for acoustic modeling in speech recognition
- G.E. Hinton, L. Deng, D. Yu, G. Dahl, A. Mohamed, N. Jaitly, A. Senior, V. Vanhoucke, P. Nguyen, T. Sainath, and B. Kingsbury, "Deep neural networks for acoustic modeling in speech recognition," IEEE Signal Processing Magazine, vol. 29, no. 6, pp. 82-97, 2012.
- (2012) IEEE Signal Processing Magazine , vol.29 , Issue.6 , pp. 82-97
- Hinton, G.E.¹ Deng, L.² Yu, D.³ Dahl, G.⁴ Mohamed, A.⁵ Jaitly, N.⁶ Senior, A.⁷ Vanhoucke, V.⁸ Nguyen, P.⁹ Sainath, T.¹⁰ Kingsbury, B.¹¹

13
- 79959842828
- Binary coding of speech spectrograms using a deep auto-encoder
- L. Deng, M. Seltzer, D. Yu, A. Acero, A. Mohamed, and G.E. Hinton, "Binary coding of speech spectrograms using a deep auto-encoder," in Interspeech, 2010, pp. 1692-1695.
- (2010) Interspeech , pp. 1692-1695
- Deng, L.¹ Seltzer, M.² Yu, D.³ Acero, A.⁴ Mohamed, A.⁵ Hinton, G.E.⁶

14
- 84874282835
- A deep neural network for acoustic-articulatory speech inversion
- B. Uria, S. Renals, and K. Richmond, "A deep neural network for acoustic-articulatory speech inversion," in NIPS 2011 Workshop on Deep Learning and Unsupervised Feature Learning, 2011.
- (2011) NIPS 2011 Workshop on Deep Learning and Unsupervised Feature Learning
- Uria, B.¹ Renals, S.² Richmond, K.³

15
- 33745805403
- A fast learning algorithm for deep belief nets
- G.E. Hinton, S. Osindero, and Y.W. Teh, "A fast learning algorithm for deep belief nets," Neural Computation, vol. 18, no. 7, pp. 1527-1554, 2006.
- (2006) Neural Computation , vol.18 , Issue.7 , pp. 1527-1554
- Hinton, G.E.¹ Osindero, S.² Teh, Y.W.³

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.