Volume 2015-January, 2015, Pages 284-288

Semi-supervised training of a voice conversion mapping function using a joint-autoencoder

Author keywords

Deep neural network; Pre training; Semisupervised learning; Voice conversion

Indexed keywords

BACKPROPAGATION; ENCODING (SYMBOLS); LEARNING SYSTEMS; MAPPING; NETWORK ARCHITECTURE; SPEECH COMMUNICATION; SPEECH RECOGNITION

EID: 84959173289     PISSN: 2308457X     EISSN: 19909772     Source Type: Conference Proceeding    
DOI: None     Document Type: Conference Paper
Times cited: 10

References (29)
  • 4
    • 84906281619
    • Real-time voice conversion using artificial neural networks with rectified linear units
    • E. Azarov, M. Vashkevich, D. Likhachov, and A. Petrovsky, "Real-time voice conversion using artificial neural networks with rectified linear units," in INTERSPEECH, 2013, pp. 1032-1036.
    • (2013) INTERSPEECH , pp. 1032-1036
    • Azarov, E.1    Vashkevich, M.2    Likhachov, D.3    Petrovsky, A.4
  • 6
    • 84905573362
    • Voice conversion using general regression neural network
    • J. Nirmal, M. Zaveri, S. Patnaik, and P. Kachare, "Voice conversion using general regression neural network," Applied Soft Computing, vol. 24, pp. 1-12, 2014.
    • (2014) Applied Soft Computing , vol.24 , pp. 1-12
    • Nirmal, J.1    Zaveri, M.2    Patnaik, S.3    Kachare, P.4
  • 7
    • 84906225084
    • Joint spectral distribution modeling using restricted Boltzmann machines for voice conversion
    • L. H. Chen, Z. H. Ling, Y. Song, and L. R. Dai, "Joint spectral distribution modeling using restricted Boltzmann machines for voice conversion," in INTERSPEECH, 2013.
    • (2013) INTERSPEECH
    • Chen, L.H.1    Ling, Z.H.2    Song, Y.3    Dai, L.R.4
  • 9
    • 84906280857
    • Voice conversion in high-order eigen space using deep belief nets
    • T. Nakashika, R. Takashima, T. Takiguchi, and Y. Ariki, "Voice conversion in high-order eigen space using deep belief nets," in INTERSPEECH, 2013, pp. 369-372.
    • (2013) INTERSPEECH , pp. 369-372
    • Nakashika, T.1    Takashima, R.2    Takiguchi, T.3    Ariki, Y.4
  • 13
    • 57749193836
    • Voice conversion based on maximum-likelihood estimation of spectral parameter trajectory
    • T. Toda, A. W. Black, and K. Tokuda, "Voice conversion based on maximum-likelihood estimation of spectral parameter trajectory," IEEE Transactions on Audio, Speech, and Language Processing Journal, vol. 15, no. 8, pp. 2222-2235, November 2007.
    • (2007) IEEE Transactions on Audio, Speech, and Language Processing Journal , vol.15 , Issue.8 , pp. 2222-2235
    • Toda, T.1    Black, A.W.2    Tokuda, K.3
  • 14
    • 84910087395
    • Sequence error (SE) minimization training of neural network for voice conversion
    • F.-L. Xie, Y. Qian, Y. Fan, F. K. Soong, and H. Li, "Sequence error (SE) minimization training of neural network for voice conversion," in Proc. Interspeech, 2014.
    • (2014) Proc. Interspeech
    • Xie, F.-L.1    Qian, Y.2    Fan, Y.3    Soong, F.K.4    Li, H.5
  • 15
    • 84910087396
    • High-order sequence modeling using speaker-dependent recurrent temporal restricted Boltzmann machines for voice conversion
    • T. Nakashika, T. Takiguchi, and Y. Ariki, "High-order sequence modeling using speaker-dependent recurrent temporal restricted Boltzmann machines for voice conversion," in Interspeech, 2014.
    • (2014) Interspeech
    • Nakashika, T.1    Takiguchi, T.2    Ariki, Y.3
  • 18
    • 84910104946
    • Voice conversion using generative trained deep neural networks with multiple frame spectral envelopes
    • L.-H. Chen, Z.-H. Ling, and L.-R. Dai, "Voice conversion using generative trained deep neural networks with multiple frame spectral envelopes," in Proc. Interspeech, 2014.
    • (2014) Proc. Interspeech
    • Chen, L.-H.1    Ling, Z.-H.2    Dai, L.-R.3
  • 19
    • 84946685887
    • Voice conversion using deep neural networks with speaker-independent pretraining
    • S. H. Mohammadi and A. Kain, "Voice conversion using deep neural networks with speaker-independent pretraining," in Spoken Language Technology (SLT). IEEE, 2014.
    • (2014) Spoken Language Technology (SLT). IEEE
    • Mohammadi, S.H.1    Kain, A.2
  • 20
    • 80455143732
    • Learning speaker-specific characteristics with a deep neural architecture
    • K. Chen and A. Salman, "Learning speaker-specific characteristics with a deep neural architecture," Neural Networks, IEEE Transactions on, vol. 22, no. 11, pp. 1744-1756, 2011.
    • (2011) Neural Networks, IEEE Transactions on , vol.22 , Issue.11 , pp. 1744-1756
    • Chen, K.1    Salman, A.2
  • 23
    • 79551480483
    • Stacked denoising autoencoders: Learning useful representations in a deep network with a local denoising criterion
    • P. Vincent, H. Larochelle, I. Lajoie, Y. Bengio, and P.-A. Manzagol, "Stacked denoising autoencoders: Learning useful representations in a deep network with a local denoising criterion," The Journal of Machine Learning Research, vol. 11, pp. 3371-3408, 2010.
    • (2010) The Journal of Machine Learning Research , vol.11 , pp. 3371-3408
    • Vincent, P.1    Larochelle, H.2    Lajoie, I.3    Bengio, Y.4    Manzagol, P.-A.5
  • 24
    • 57749193836
    • Voice conversion based on maximum-likelihood estimation of spectral parameter trajectory
    • T. Toda, A. W. Black, and K. Tokuda, "Voice conversion based on maximum-likelihood estimation of spectral parameter trajectory," Audio, Speech, and Language Processing, IEEE Transactions on, vol. 15, no. 8, pp. 2222-2235, 2007.
    • (2007) Audio, Speech, and Language Processing, IEEE Transactions on , vol.15 , Issue.8 , pp. 2222-2235
    • Toda, T.1    Black, A.W.2    Tokuda, K.3
  • 26
    • 0031623661
    • Spectral voice conversion for text-to-speech synthesis
    • A. Kain and M. Macon, "Spectral voice conversion for text-to-speech synthesis," in Proceedings of ICASSP, vol. 1, May 1998, pp. 285-299.
    • (1998) Proceedings of ICASSP , vol.1 , pp. 285-299
    • Kain, A.1    Macon, M.2
  • 27
    • 0029209272
    • Robust text-independent speaker identification using Gaussian mixture models
    • D. A. Reynolds and R. C. Rose, "Robust text-independent speaker identification using Gaussian mixture models," IEEE Transactions on Speech and Audio Processing, vol. 3, pp. 72-83, 1995.
    • (1995) IEEE Transactions on Speech and Audio Processing , vol.3 , pp. 72-83
    • Reynolds, D.A.1    Rose, R.C.2
  • 28
    • 4444285698
    • High Resolution Voice Transformation
    • A. Kain, "High Resolution Voice Transformation," Ph.D. dissertation, OGI School of Science & Engineering at Oregon Health & Science University, 2001.
    • (2001) Ph.D. dissertation, OGI School of Science & Engineering at Oregon Health & Science University
    • Kain, A.1
  • 29
    • 0002322469
    • On a test of whether one of two random variables is stochastically larger than the other
    • H. B. Mann and D. R. Whitney, "On a test of whether one of two random variables is stochastically larger than the other," The Annals of Mathematical Statistics, pp. 50-60, 1947.
    • (1947) The Annals of Mathematical Statistics , pp. 50-60
    • Mann, H.B.1    Whitney, D.R.2


* This information was extracted by KISTI through analysis of Elsevier's SCOPUS database.