SCOPUS Information Retrieval Platform

Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH

Volume 2015-January, 2015, Pages 284-288

Semi-supervised training of a voice conversion mapping function using a joint-autoencoder

Authors (2): Mohammadi, Seyed Hamidreza (a); Kain, Alexander (a)

(a) Oregon Health and Science University (United States)

Author keywords

Deep neural network; Pre training; Semisupervised learning; Voice conversion

Indexed keywords

BACKPROPAGATION; ENCODING (SYMBOLS); LEARNING SYSTEMS; MAPPING; NETWORK ARCHITECTURE; SPEECH COMMUNICATION; SPEECH RECOGNITION;

DEEP NEURAL NETWORKS; GENERAL STRUCTURES; PRE-TRAINING; SEMI-SUPERVISED LEARNING; SEMI-SUPERVISED TRAININGS; SPEAKER CONVERSION; SYSTEM CONFIGURATIONS; VOICE CONVERSION;

SPEECH PROCESSING;

EID: 84959173289
PISSN: 2308457X
EISSN: 19909772
Source Type: Conference Proceeding
DOI: None
Document Type: Conference Paper

Times cited: (10)

References (29)

1
- 84890475857
- Transmutative voice conversion
- S. H. Mohammadi and A. Kain, "Transmutative voice conversion," in Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on. IEEE, 2013, pp. 6920-6924.
- (2013) Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference On. IEEE , pp. 6920-6924
- Mohammadi, S.H.¹ Kain, A.²

2
- 77953725318
- Inca algorithm for training voice conversion systems from nonparallel corpora
- D. Erro, A. Moreno, and A. Bonafonte, "Inca algorithm for training voice conversion systems from nonparallel corpora," Audio, Speech, and Language Processing, IEEE Transactions on, vol. 18, no. 5, pp. 944-953, 2010.
- (2010) Audio, Speech, and Language Processing, IEEE Transactions on , vol.18 , Issue.5 , pp. 944-953
- Erro, D.¹ Moreno, A.² Bonafonte, A.³

3
- 77953707533
- Spectral mapping using artificial neural networks for voice conversion
- S. Desai, A. W. Black, B. Yegnanarayana, and K. Prahallad, "Spectral mapping using artificial neural networks for voice conversion," Audio, Speech, and Language Processing, IEEE Transactions on, vol. 18, no. 5, pp. 954-964, 2010.
- (2010) Audio, Speech, and Language Processing, IEEE Transactions on , vol.18 , Issue.5 , pp. 954-964
- Desai, S.¹ Black, A.W.² Yegnanarayana, B.³ Prahallad, K.⁴

4
- 84906281619
- Real-time voice conversion using artificial neural networks with rectified linear units
- E. Azarov, M. Vashkevich, D. Likhachov, and A. Petrovsky, "Real-time voice conversion using artificial neural networks with rectified linear units," in INTERSPEECH, 2013, pp. 1032-1036.
- (2013) INTERSPEECH , pp. 1032-1036
- Azarov, E.¹ Vashkevich, M.² Likhachov, D.³ Petrovsky, A.⁴

5
- 84905223323
- Using bidirectional associative memories for joint spectral envelope modeling in voice conversion
- L. J. Liu, L. H. Chen, Z. H. Ling, and L. R. Dai, "Using bidirectional associative memories for joint spectral envelope modeling in voice conversion," in Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on. IEEE, 2014.
- (2014) Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference On. IEEE
- Liu, L.J.¹ Chen, L.H.² Ling, Z.H.³ Dai, L.R.⁴

6
- 84905573362
- Voice conversion using general regression neural network
- J. Nirmal, M. Zaveri, S. Patnaik, and P. Kachare, "Voice conversion using general regression neural network," Applied Soft Computing, vol. 24, pp. 1-12, 2014.
- (2014) Applied Soft Computing , vol.24 , pp. 1-12
- Nirmal, J.¹ Zaveri, M.² Patnaik, S.³ Kachare, P.⁴

7
- 84906225084
- Joint spectral distribution modeling using restricted boltzmann machines for voice conversion
- L. H. Chen, Z. H. Ling, Y. Song, and L. R. Dai, "Joint spectral distribution modeling using restricted boltzmann machines for voice conversion," in INTERSPEECH, 2013.
- (2013) INTERSPEECH
- Chen, L.H.¹ Ling, Z.H.² Song, Y.³ Dai, L.R.⁴

8
- 84889579519
- Conditional restricted boltzmann machine for voice conversion
- Z. Wu, E. S. Chng, and H. Li, "Conditional restricted boltzmann machine for voice conversion," in Signal and Information Processing (ChinaSIP), 2013 IEEE China Summit & International Conference on. IEEE, 2013, pp. 104-108.
- (2013) Signal and Information Processing (ChinaSIP), 2013 IEEE China Summit & International Conference On. IEEE , pp. 104-108
- Wu, Z.¹ Chng, E.S.² Li, H.³

9
- 84906280857
- Voice conversion in high-order eigen space using deep belief nets
- T. Nakashika, R. Takashima, T. Takiguchi, and Y. Ariki, "Voice conversion in high-order eigen space using deep belief nets," in INTERSPEECH, 2013, pp. 369-372.
- (2013) INTERSPEECH , pp. 369-372
- Nakashika, T.¹ Takashima, R.² Takiguchi, T.³ Ariki, Y.⁴

10
- 84921735339
- Voice conversion using deep neural networks with layer-wise generative training
- L.-H. Chen, Z.-H. Ling, L.-J. Liu, and L.-R. Dai, "Voice conversion using deep neural networks with layer-wise generative training," IEEE/ACM Transactions on Audio, Speech and Language Processing (TASLP), vol. 22, no. 12, pp. 1859-1872, 2014.
- (2014) IEEE/ACM Transactions on Audio, Speech and Language Processing (TASLP) , vol.22 , Issue.12 , pp. 1859-1872
- Chen, L.-H.¹ Ling, Z.-H.² Liu, L.-J.³ Dai, L.-R.⁴

11
- 84994241109
- Including dynamic and phonetic information in voice conversion systems
- H. Duxans, A. Bonafonte, A. Kain, and J. Van Santen, "Including dynamic and phonetic information in voice conversion systems," in Proc. of the ICSLP'04, 2004.
- (2004) Proc. of the ICSLP'04
- Duxans, H.¹ Bonafonte, A.² Kain, A.³ Van Santen, J.⁴

12
- 84878384703
- Making conversational vowels more clear
- S. H. Mohammadi, A. Kain, and J. P. van Santen, "Making conversational vowels more clear," in Interspeech, 2012.
- (2012) Interspeech
- Mohammadi, S.H.¹ Kain, A.² Van Santen, J.P.³

13
- 57749193836
- Voice conversion based on maximum-likelihood estimation of spectral parameter trajectory
- T. Toda, A. W. Black, and K. Tokuda, "Voice conversion based on maximum-likelihood estimation of spectral parameter trajectory," IEEE Transactions on Audio, Speech, and Language Processing Journal, vol. 15, no. 8, pp. 2222-2235, November 2007.
- (2007) IEEE Transactions on Audio, Speech, and Language Processing Journal , vol.15 , Issue.8 , pp. 2222-2235
- Toda, T.¹ Black, A.W.² Tokuda, K.³

14
- 84910087395
- Sequence error (se) minimization training of neural network for voice conversion
- F.-L. Xie, Y. Qian, Y. Fan, F. K. Soong, and H. Li, "Sequence error (se) minimization training of neural network for voice conversion," in Proc. Interspeech, 2014.
- (2014) Proc. Interspeech
- Xie, F.-L.¹ Qian, Y.² Fan, Y.³ Soong, F.K.⁴ Li, H.⁵

15
- 84910087396
- High-order sequence modeling using speaker-dependent recurrent temporal restricted boltzmann machines for voice conversion
- T. Nakashika, T. Takiguchi, and Y. Ariki, "High-order sequence modeling using speaker-dependent recurrent temporal restricted boltzmann machines for voice conversion," in Interspeech, 2014.
- (2014) Interspeech
- Nakashika, T.¹ Takiguchi, T.² Ariki, Y.³

16
- 84946027999
- Voice conversion using deep bidirectional long short-term memory based recurrent neural networks
- L. Sun, S. Kang, K. Li, and H. Meng, "Voice conversion using deep bidirectional long short-term memory based recurrent neural networks," in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2015.
- (2015) Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP)
- Sun, L.¹ Kang, S.² Li, K.³ Meng, H.⁴

17
- 84923867813
- Voice conversion using rnn pre-trained by recurrent temporal restricted boltzmann machines
- T. Nakashika, T. Takiguchi, and Y. Ariki, "Voice conversion using rnn pre-trained by recurrent temporal restricted boltzmann machines," Audio, Speech, and Language Processing, IEEE/ACM Transactions on, vol. 23, no. 3, pp. 580-587, March 2015.
- (2015) Audio, Speech, and Language Processing, IEEE/ACM Transactions on , vol.23 , Issue.3 , pp. 580-587
- Nakashika, T.¹ Takiguchi, T.² Ariki, Y.³

18
- 84910104946
- Voice conversion using generative trained deep neural networks with multiple frame spectral envelopes
- L.-H. Chen, Z.-H. Ling, and L.-R. Dai, "Voice conversion using generative trained deep neural networks with multiple frame spectral envelopes," in Proc. Interspeech, 2014.
- (2014) Proc. Interspeech
- Chen, L.-H.¹ Ling, Z.-H.² Dai, L.-R.³

19
- 84946685887
- Voice conversion using deep neural networks with speaker-independent pretraining
- S. H. Mohammadi and A. Kain, "Voice conversion using deep neural networks with speaker-independent pretraining," in Spoken Language Technology (SLT). IEEE, 2014.
- (2014) Spoken Language Technology (SLT). IEEE
- Mohammadi, S.H.¹ Kain, A.²

20
- 80455143732
- Learning speaker-specific characteristics with a deep neural architecture
- K. Chen and A. Salman, "Learning speaker-specific characteristics with a deep neural architecture," Neural Networks, IEEE Transactions on, vol. 22, no. 11, pp. 1744-1756, 2011.
- (2011) Neural Networks, IEEE Transactions on , vol.22 , Issue.11 , pp. 1744-1756
- Chen, K.¹ Salman, A.²

21
- 84912544456
- Inferring social contexts from audio recordings using deep neural networks
- M. Asgari, I. Shafran, and A. Bayestehtashk, "Inferring social contexts from audio recordings using deep neural networks," in Machine Learning for Signal Processing (MLSP), 2014 IEEE International Workshop on. IEEE, 2014, pp. 1-6.
- (2014) Machine Learning for Signal Processing (MLSP), 2014 IEEE International Workshop On. IEEE , pp. 1-6
- Asgari, M.¹ Shafran, I.² Bayestehtashk, A.³

22
- 84862277874
- Understanding the difficulty of training deep feedforward neural networks
- X. Glorot and Y. Bengio, "Understanding the difficulty of training deep feedforward neural networks," in International conference on artificial intelligence and statistics, 2010, pp. 249-256.
- (2010) International Conference on Artificial Intelligence and Statistics , pp. 249-256
- Glorot, X.¹ Bengio, Y.²

23
- 79551480483
- Stacked denoising autoencoders: Learning useful representations in a deep network with a local denoising criterion
- P. Vincent, H. Larochelle, I. Lajoie, Y. Bengio, and P.-A. Manzagol, "Stacked denoising autoencoders: Learning useful representations in a deep network with a local denoising criterion," The Journal of Machine Learning Research, vol. 11, pp. 3371-3408, 2010.
- (2010) The Journal of Machine Learning Research , vol.11 , pp. 3371-3408
- Vincent, P.¹ Larochelle, H.² Lajoie, I.³ Bengio, Y.⁴ Manzagol, P.-A.⁵

24
- 57749193836
- Voice conversion based on maximum-likelihood estimation of spectral parameter trajectory
- T. Toda, A. W. Black, and K. Tokuda, "Voice conversion based on maximum-likelihood estimation of spectral parameter trajectory," Audio, Speech, and Language Processing, IEEE Transactions on, vol. 15, no. 8, pp. 2222-2235, 2007.
- (2007) Audio, Speech, and Language Processing, IEEE Transactions on , vol.15 , Issue.8 , pp. 2222-2235
- Toda, T.¹ Black, A.W.² Tokuda, K.³

25
- 84055211743
- Acoustic modeling using deep belief networks
- A. R. Mohamed, G. E. Dahl, and G. Hinton, "Acoustic modeling using deep belief networks," Audio, Speech, and Language Processing, IEEE Transactions on, vol. 20, no. 1, pp. 14-22, 2012.
- (2012) Audio, Speech, and Language Processing, IEEE Transactions on , vol.20 , Issue.1 , pp. 14-22
- Mohamed, A.R.¹ Dahl, G.E.² Hinton, G.³

26
- 0031623661
- Spectral voice conversion for text-to-speech synthesis
- A. Kain and M. Macon, "Spectral voice conversion for text-to-speech synthesis," in Proceedings of ICASSP, vol. 1, May 1998, pp. 285-299.
- (1998) Proceedings of ICASSP , vol.1 , pp. 285-299
- Kain, A.¹ Macon, M.²

27
- 0029209272
- Robust text-independent speaker identification using Gaussian mixture models
- D. A. Reynolds and R. C. Rose, "Robust text-independent speaker identification using Gaussian mixture models," IEEE Transactions on Speech and Audio Processing, vol. 3, pp. 72-83, 1995.
- (1995) IEEE Transactions on Speech and Audio Processing , vol.3 , pp. 72-83
- Reynolds, D.A.¹ Rose, R.C.²

28
- 4444285698
- High Resolution Voice Transformation
- A. Kain, "High Resolution Voice Transformation," Ph.D. dissertation, OGI School of Science & Engineering at Oregon Health & Science University, 2001.
- (2001) Ph.D. dissertation, OGI School of Science & Engineering at Oregon Health & Science University
- Kain, A.¹

29
- 0002322469
- On a test of whether one of two random variables is stochastically larger than the other
- H. B. Mann and D. R. Whitney, "On a test of whether one of two random variables is stochastically larger than the other," The annals of mathematical statistics, pp. 50-60, 1947.
- (1947) The Annals of Mathematical Statistics , pp. 50-60
- Mann, H.B.¹ Whitney, D.R.²

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.