Volume E97-D, Issue 6, 2014, Pages 1403-1410

Voice conversion based on speaker-dependent restricted Boltzmann machines

Author keywords

Deep learning; Restricted Boltzmann machine; Speaker individuality; Voice conversion

Indexed keywords

ABSTRACTING; SPEECH COMMUNICATION

EID: 84901766069     PISSN: 09168532     EISSN: 17451361     Source Type: Journal    
DOI: 10.1587/transinf.E97.D.1403     Document Type: Article
Times cited: 43

References (39)
  • 2
    • 84865747520
    • Intonation conversion from neutral to expressive speech
    • C. Veaux and X. Rodet, "Intonation conversion from neutral to expressive speech," Proc. Interspeech, pp.2765-2768, 2011.
    • (2011) Proc. Interspeech , pp. 2765-2768
    • Veaux, C.1    Rodet, X.2
  • 3
    • 80052698826
    • Speaking-aid systems using GMM-based voice conversion for electrolaryngeal speech
    • K. Nakamura, T. Toda, H. Saruwatari, and K. Shikano, "Speaking-aid systems using GMM-based voice conversion for electrolaryngeal speech," Speech Commun., vol.54, no.1, pp.134-146, 2012.
    • (2012) Speech Commun. , vol.54 , Issue.1 , pp. 134-146
    • Nakamura, K.1    Toda, T.2    Saruwatari, H.3    Shikano, K.4
  • 5
    • 70450192197
    • Speech generation from hand gestures based on space mapping
    • A. Kunikoshi, Y. Qiao, N. Minematsu, and K. Hirose, "Speech generation from hand gestures based on space mapping," Proc. Interspeech, pp.308-311, 2009.
    • (2009) Proc. Interspeech , pp. 308-311
    • Kunikoshi, A.1    Qiao, Y.2    Minematsu, N.3    Hirose, K.4
  • 6
    • 0021412027
    • Vector quantization
    • R. Gray, "Vector quantization," IEEE ASSP Mag., vol.1, no.2, pp.4-29, 1984.
    • (1984) IEEE ASSP Mag. , vol.1 , Issue.2 , pp. 4-29
    • Gray, R.1
  • 7
    • 0026880275
    • Voice transformation using PSOLA technique
    • H. Valbret, E. Moulines, and J.P. Tubach, "Voice transformation using PSOLA technique," Speech Commun., vol.11, no.2, pp.175-187, 1992.
    • (1992) Speech Commun. , vol.11 , Issue.2 , pp. 175-187
    • Valbret, H.1    Moulines, E.2    Tubach, J.P.3
  • 8
    • 0032026483
    • Continuous probabilistic transform for voice conversion
    • Y. Stylianou, O. Cappé, and E. Moulines, "Continuous probabilistic transform for voice conversion," IEEE Trans. Speech Audio Process., vol.6, no.2, pp.131-142, 1998.
    • (1998) IEEE Trans. Speech Audio Process. , vol.6 , Issue.2 , pp. 131-142
    • Stylianou, Y.1    Cappé, O.2    Moulines, E.3
  • 9
    • 57749193836
    • Voice conversion based on maximum-likelihood estimation of spectral parameter trajectory
    • T. Toda, A.W. Black, and K. Tokuda, "Voice conversion based on maximum-likelihood estimation of spectral parameter trajectory," IEEE Trans. Audio Speech Language Process., vol.15, no.8, pp.2222-2235, 2007.
    • (2007) IEEE Trans. Audio Speech Language Process. , vol.15 , Issue.8 , pp. 2222-2235
    • Toda, T.1    Black, A.W.2    Tokuda, K.3
  • 11
    • 44949210554
    • MAP-based adaptation for speech conversion using adaptation data selection and non-parallel training
    • C.H. Lee and C.H. Wu, "MAP-based adaptation for speech conversion using adaptation data selection and non-parallel training," Proc. Interspeech, pp.2254-2257, 2006.
    • (2006) Proc. Interspeech , pp. 2254-2257
    • Lee, C.H.1    Wu, C.H.2
  • 12
    • 34547512822
    • Eigenvoice conversion based on Gaussian mixture model
    • T. Toda, Y. Ohtani, and K. Shikano, "Eigenvoice conversion based on Gaussian mixture model," Proc. Interspeech, pp.2446-2449, 2006.
    • (2006) Proc. Interspeech , pp. 2446-2449
    • Toda, T.1    Ohtani, Y.2    Shikano, K.3
  • 13
    • 84865798483
    • One-to-many voice conversion based on tensor representation of speaker space
    • D. Saito, K. Yamamoto, N. Minematsu, and K. Hirose, "One-to-many voice conversion based on tensor representation of speaker space," Proc. Interspeech, pp.653-656, 2011.
    • (2011) Proc. Interspeech , pp. 653-656
    • Saito, D.1    Yamamoto, K.2    Minematsu, N.3    Hirose, K.4
  • 14
    • 79959834571
    • Probabilistic integration of joint density model and speaker model for voice conversion
    • D. Saito, S. Watanabe, A. Nakamura, and N. Minematsu, "Probabilistic integration of joint density model and speaker model for voice conversion," Proc. Interspeech, pp.1728-1731, 2010.
    • (2010) Proc. Interspeech , pp. 1728-1731
    • Saito, D.1    Watanabe, S.2    Nakamura, A.3    Minematsu, N.4
  • 19
    • 34547522070
    • Discriminative training for large-vocabulary speech recognition using minimum classification error
    • E. McDermott, T.J. Hazen, J. Le Roux, A. Nakamura, and S. Katagiri, "Discriminative training for large-vocabulary speech recognition using minimum classification error," IEEE Trans. Audio Speech Language Process., vol.15, no.1, pp.203-223, 2007.
    • (2007) IEEE Trans. Audio Speech Language Process. , vol.15 , Issue.1 , pp. 203-223
    • McDermott, E.1    Hazen, T.J.2    Le Roux, J.3    Nakamura, A.4    Katagiri, S.5
  • 20
    • 38549096029
    • A speech parameter generation algorithm considering global variance for HMM-based speech synthesis
    • T. Toda and K. Tokuda, "A speech parameter generation algorithm considering global variance for HMM-based speech synthesis," IEICE Trans. Inf. & Syst., vol.E90-D, no.5, pp.816-824, May 2007.
    • (2007) IEICE Trans. Inf. & Syst. , vol.E90-D , Issue.5 , pp. 816-824
    • Toda, T.1    Tokuda, K.2
  • 21
    • 84901793334
    • Minimum Kullback-Leibler divergence parameter generation for HMM-based speech synthesis
    • Z.H. Ling and L.R. Dai, "Minimum Kullback-Leibler divergence parameter generation for HMM-based speech synthesis," IEEE Trans. Audio Speech Language Process., vol.20, no.5, pp.1492-1502, 2012.
    • (2012) IEEE Trans. Audio Speech Language Process. , vol.20 , Issue.5 , pp. 1492-1502
    • Ling, Z.H.1    Dai, L.R.2
  • 25
    • 33745805403
    • A fast learning algorithm for deep belief nets
    • G.E. Hinton, S. Osindero, and Y.W. Teh, "A fast learning algorithm for deep belief nets," Neural Computation, vol.18, no.7, pp.1527-1554, 2006.
    • (2006) Neural Computation , vol.18 , Issue.7 , pp. 1527-1554
    • Hinton, G.E.1    Osindero, S.2    Teh, Y.W.3
  • 26
    • 84901237776
    • Modeling spectral envelopes using restricted Boltzmann machines and deep belief networks for statistical parametric speech synthesis
    • Z.H. Ling, L. Deng, and D. Yu, "Modeling spectral envelopes using restricted Boltzmann machines and deep belief networks for statistical parametric speech synthesis," IEEE Trans. Audio Speech Language Process., no.10, pp.2129-2139, 2013.
    • (2013) IEEE Trans. Audio Speech Language Process. , Issue.10 , pp. 2129-2139
    • Ling, Z.H.1    Deng, L.2    Yu, D.3
  • 30
    • 84906280857
    • Voice conversion in high-order eigen space using deep belief nets
    • T. Nakashika, R. Takashima, T. Takiguchi, and Y. Ariki, "Voice conversion in high-order eigen space using deep belief nets," Proc. Interspeech, pp.369-372, 2013.
    • (2013) Proc. Interspeech , pp. 369-372
    • Nakashika, T.1    Takashima, R.2    Takiguchi, T.3    Ariki, Y.4
  • 32
    • 84906225084
    • Joint spectral distribution modeling using restricted Boltzmann machines for voice conversion
    • L.-H. Chen, Z.-H. Ling, Y. Song, and L.-R. Dai, "Joint spectral distribution modeling using restricted Boltzmann machines for voice conversion," Proc. Interspeech, pp.3052-3056, 2013.
    • (2013) Proc. Interspeech , pp. 3052-3056
    • Chen, L.-H.1    Ling, Z.-H.2    Song, Y.3    Dai, L.-R.4
  • 35
    • 85039958911
    • Speech reconstruction from mel-frequency cepstral coefficients using a source-filter model
    • B. Milner and X. Shao, "Speech reconstruction from mel-frequency cepstral coefficients using a source-filter model," Proc. Interspeech, 2002.
    • (2002) Proc. Interspeech
    • Milner, B.1    Shao, X.2
  • 37
    • 78650474133
    • A practical guide to training restricted Boltzmann machines
    • G. Hinton, "A practical guide to training restricted Boltzmann machines," Tech. Rep., Department of Computer Science, University of Toronto, 2010.
    • (2010) Tech. Rep. Department of Computer Science
    • Hinton, G.1


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.