SCOPUS Information Retrieval Platform

Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH

Volume , Issue , 2014, Pages 2313-2317

Voice conversion using generative trained deep neural networks with multiple frame spectral envelopes

Authors (3): Chen, Ling Hui (a); Ling, Zhen Hua (a); Dai, Li Rong (a)

a UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA (China)

Author keywords

Deep neural network; Spectral envelope; Voice conversion

Indexed keywords

ASSOCIATIVE PROCESSING; COMPLEX NETWORKS; SPEECH COMMUNICATION; SPEECH PROCESSING;

BI-DIRECTIONAL ASSOCIATIVE MEMORY; CONTRASTIVE DIVERGENCE; CONVERSION METHODS; DEEP NEURAL NETWORKS; NONLINEAR MAPPINGS; RESTRICTED BOLTZMANN MACHINE; SPECTRAL ENVELOPES; VOICE CONVERSION;

PHOTOMAPPING;

EID: 84910104946 PISSN: 2308457X EISSN: 19909772 Source Type: Conference Proceeding
DOI: None Document Type: Conference Paper

Times cited : (10)

References (22)

1
- 0031623661
- Spectral voice conversion for text-to-speech synthesis
- A. Kain and M. Macon, "Spectral voice conversion for text-to-speech synthesis," in Proc. ICASSP, 1998, pp. 285-288.
- (1998) Proc. ICASSP , pp. 285-288
- Kain, A.¹ Macon, M.²

2
- 84905560807
- Voice conversion with smoothed GMM and MAP adaptation
- Y. Chen, M. Chu, E. Chang, J. Liu, and R. Liu, "Voice conversion with smoothed GMM and MAP adaptation," in Eurospeech, 2003, pp. 2413-2416.
- (2003) Eurospeech , pp. 2413-2416
- Chen, Y.¹ Chu, M.² Chang, E.³ Liu, J.⁴ Liu, R.⁵

3
- 78149260085
- Continuous stochastic feature mapping based on trajectory HMMs
- H. Zen, Y. Nankaku, and K. Tokuda, "Continuous stochastic feature mapping based on trajectory HMMs," Audio, Speech, and Language Processing, IEEE Transactions on, vol. 19, no. 2, pp. 417-430, 2011.
- (2011) Audio, Speech, and Language Processing, IEEE Transactions on , vol.19 , Issue.2 , pp. 417-430
- Zen, H.¹ Nankaku, Y.² Tokuda, K.³

4
- 57749193836
- Voice conversion based on maximum-likelihood estimation of spectral parameter trajectory
- T. Toda, A. Black, and K. Tokuda, "Voice conversion based on maximum-likelihood estimation of spectral parameter trajectory," Audio, Speech, and Language Processing, IEEE Transactions on, vol. 15, no. 8, pp. 2222-2235, Nov. 2007.
- (2007) Audio, Speech, and Language Processing, IEEE Transactions on , vol.15 , Issue.8 , pp. 2222-2235
- Toda, T.¹ Black, A.² Tokuda, K.³

5
- 77953707533
- Spectral mapping using artificial neural networks for voice conversion
- S. Desai, A. Black, B. Yegnanarayana, and K. Prahallad, "Spectral mapping using artificial neural networks for voice conversion," Audio, Speech, and Language Processing, IEEE Transactions on, vol. 18, no. 5, pp. 954-964, 2010.
- (2010) Audio, Speech, and Language Processing, IEEE Transactions on , vol.18 , Issue.5 , pp. 954-964
- Desai, S.¹ Black, A.² Yegnanarayana, B.³ Prahallad, K.⁴

6
- 84906280857
- Voice conversion in high-order eigen space using deep belief nets
- T. Nakashika, T. Takashima, R. Takiguchi, and Y. Ariki, "Voice conversion in high-order eigen space using deep belief nets," in Proc. Interspeech, 2013, pp. 369-372.
- (2013) Proc. Interspeech , pp. 369-372
- Nakashika, T.¹ Takashima, T.² Takiguchi, R.³ Ariki, Y.⁴

7
- 84889579519
- Conditional restricted Boltzmann machine for voice conversion
- Z. Wu, E. S. Chng, and H. Li, "Conditional restricted Boltzmann machine for voice conversion," in Signal and Information Processing (ChinaSIP), 2013 IEEE China Summit International Conference on, 2013, pp. 104-108.
- (2013) Signal and Information Processing (ChinaSIP), 2013 IEEE China Summit International Conference on , pp. 104-108
- Wu, Z.¹ Chng, E.S.² Li, H.³

8
- 84906225084
- Joint spectral distribution modeling using restricted Boltzmann machines for voice conversion
- L.-H. Chen, Z.-H. Ling, Y. Song, and L.-R. Dai, "Joint spectral distribution modeling using restricted Boltzmann machines for voice conversion," in Interspeech, 2013, pp. 3052-3056.
- (2013) Interspeech , pp. 3052-3056
- Chen, L.-H.¹ Ling, Z.-H.² Song, Y.³ Dai, L.-R.⁴

9
- 84905223323
- Using bidirectional associative memories for joint spectral envelope modeling in voice conversion
- L.-J. Liu, L.-H. Chen, Z.-H. Ling, and L.-R. Dai, "Using bidirectional associative memories for joint spectral envelope modeling in voice conversion," in Proc. ICASSP, 2014.
- (2014) Proc. ICASSP
- Liu, L.-J.¹ Chen, L.-H.² Ling, Z.-H.³ Dai, L.-R.⁴

10
- 0000329993
- Information processing in dynamical systems: Foundations of harmony theory
- P. Smolensky, "Information processing in dynamical systems: foundations of harmony theory," in Parallel distributed processing: explorations in the microstructure of cognition, D. E. Rumelhart and J. L. McClelland, Eds. Cambridge, MA, USA: MIT Press, 1986, vol. 1, ch. 6, pp. 194-281.
- (1986) Parallel Distributed Processing: Explorations in the Microstructure of Cognition , vol.1 , pp. 194-281
- Smolensky, P.¹

11
- 0023861743
- Bidirectional associative memories
- B. Kosko, "Bidirectional associative memories," Systems, Man and Cybernetics, IEEE Transactions on, vol. 18, no. 1, pp. 49-60, 1988.
- (1988) Systems, Man and Cybernetics, IEEE Transactions on , vol.18 , Issue.1 , pp. 49-60
- Kosko, B.¹

12
- 0013344078
- Training products of experts by minimizing contrastive divergence
- G. Hinton, "Training products of experts by minimizing contrastive divergence," Neural Computation, vol. 12, no. 14, pp. 1711-1800, 2002.
- (2002) Neural Computation , vol.12 , Issue.14 , pp. 1711-1800
- Hinton, G.¹

13
- 0033708106
- Speech parameter generation algorithms for HMM-based speech synthesis
- K. Tokuda, T. Yoshimura, T. Masuko, T. Kobayashi, and T. Kitamura, "Speech parameter generation algorithms for HMM-based speech synthesis," in Proc. ICASSP, vol. 3, 2000, pp. 1315-1318.
- (2000) Proc. ICASSP , vol.3 , pp. 1315-1318
- Tokuda, K.¹ Yoshimura, T.² Masuko, T.³ Kobayashi, T.⁴ Kitamura, T.⁵

14
- 84890447002
- Modeling spectral envelopes using restricted Boltzmann machines for statistical parametric speech synthesis
- Z.-H. Ling, L. Deng, and D. Yu, "Modeling spectral envelopes using restricted Boltzmann machines for statistical parametric speech synthesis," in ICASSP, 2013, pp. 7825-7829.
- (2013) ICASSP , pp. 7825-7829
- Ling, Z.-H.¹ Deng, L.² Yu, D.³

15
- 78651276374
- Learning deep generative models
- R. Salakhutdinov, "Learning deep generative models," Ph.D. dissertation, University of Toronto, 2009.
- (2009) Learning Deep Generative Models
- Salakhutdinov, R.¹

16
- 84901237776
- Modeling spectral envelopes using restricted Boltzmann machines and deep belief networks for statistical parametric speech synthesis
- Z.-H. Ling, L. Deng, and D. Yu, "Modeling spectral envelopes using restricted Boltzmann machines and deep belief networks for statistical parametric speech synthesis," Audio, Speech, and Language Processing, IEEE Transactions on, vol. 21, no. 10, pp. 2129-2139, 2013.
- (2013) Audio, Speech, and Language Processing, IEEE Transactions on , vol.21 , Issue.10 , pp. 2129-2139
- Ling, Z.-H.¹ Deng, L.² Yu, D.³

17
- 84862286946
- Deep Boltzmann machines
- R. Salakhutdinov and G. E. Hinton, "Deep Boltzmann machines," in International Conference on Artificial Intelligence and Statistics, 2009, pp. 448-455.
- (2009) International Conference on Artificial Intelligence and Statistics , pp. 448-455
- Salakhutdinov, R.¹ Hinton, G.E.²

18
- 84867585919
- Understanding how deep belief networks perform acoustic modelling
- A. Mohamed, G. Hinton, and G. Penn, "Understanding how deep belief networks perform acoustic modelling," in ICASSP, March 2012, pp. 4273-4276.
- (2012) ICASSP , pp. 4273-4276
- Mohamed, A.¹ Hinton, G.² Penn, G.³

19
- 84874485803
- Investigation of deep neural networks (DNN) for large vocabulary continuous speech recognition: Why DNN surpasses GMMs in acoustic modeling
- J. Pan, C. Liu, Z. Wang, Y. Hu, and H. Jiang, "Investigation of deep neural networks (DNN) for large vocabulary continuous speech recognition: Why DNN surpasses GMMs in acoustic modeling," in Chinese Spoken Language Processing (ISCSLP), 2012 8th International Symposium on, Dec. 2012, pp. 301-305.
- (2012) Chinese Spoken Language Processing (ISCSLP), 2012 8th International Symposium on , pp. 301-305
- Pan, J.¹ Liu, C.² Wang, Z.³ Hu, Y.⁴ Jiang, H.⁵

20
- 0032673049
- Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds
- H. Kawahara, I. Masuda-Katsuse, and A. de Cheveigne, "Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds," Speech Communication, vol. 27, no. 3, pp. 187-208, 1999.
- (1999) Speech Communication , vol.27 , Issue.3 , pp. 187-208
- Kawahara, H.¹ Masuda-Katsuse, I.² De Cheveigne, A.³

21
- 84878387361
- PLDA using Gaussian restricted Boltzmann machines with application to speaker verification
- T. Stafylakis, P. Kenny, M. Senoussaoui, and P. Dumouchel, "PLDA using Gaussian restricted Boltzmann machines with application to speaker verification," in Interspeech, 2012.
- (2012) Interspeech
- Stafylakis, T.¹ Kenny, P.² Senoussaoui, M.³ Dumouchel, P.⁴

22
- 84867720412
- Improving neural networks by preventing co-adaptation of feature detectors
- G. E. Hinton, N. Srivastava, A. Krizhevsky, I. Sutskever, and R. R. Salakhutdinov, "Improving neural networks by preventing co-adaptation of feature detectors," arXiv preprint arXiv:1207.0580, 2012.
- (2012) Improving Neural Networks by Preventing Co-adaptation of Feature Detectors
- Hinton, G.E.¹ Srivastava, N.² Krizhevsky, A.³ Sutskever, I.⁴ Salakhutdinov, R.R.⁵

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.