Volume , Issue , 2014, Pages 2313-2317

Voice conversion using generative trained deep neural networks with multiple frame spectral envelopes

Author keywords

Deep neural network; Spectral envelope; Voice conversion

Indexed keywords

ASSOCIATIVE PROCESSING; COMPLEX NETWORKS; SPEECH COMMUNICATION; SPEECH PROCESSING

EID: 84910104946     PISSN: 2308457X     EISSN: 19909772     Source Type: Conference Proceeding    
DOI: None     Document Type: Conference Paper
Times cited : (10)

References (22)
  • 1
    • 0031623661
    • Spectral voice conversion for text-to-speech synthesis
    • A. Kain and M. Macon, "Spectral voice conversion for text-to-speech synthesis," in Proc. ICASSP, 1998, pp. 285-288.
    • (1998) Proc. ICASSP , pp. 285-288
    • Kain, A.1    Macon, M.2
  • 2
    • 84905560807
    • Voice conversion with smoothed GMM and MAP adaptation
    • Y. Chen, M. Chu, E. Chang, J. Liu, and R. Liu, "Voice conversion with smoothed GMM and MAP adaptation," in Eurospeech, 2003, pp. 2413-2416.
    • (2003) Eurospeech , pp. 2413-2416
    • Chen, Y.1    Chu, M.2    Chang, E.3    Liu, J.4    Liu, R.5
  • 4
    • 57749193836
    • Voice conversion based on maximum-likelihood estimation of spectral parameter trajectory
    • T. Toda, A. Black, and K. Tokuda, "Voice conversion based on maximum-likelihood estimation of spectral parameter trajectory," Audio, Speech, and Language Processing, IEEE Transactions on, vol. 15, no. 8, pp. 2222-2235, Nov. 2007.
    • (2007) Audio, Speech, and Language Processing, IEEE Transactions on , vol.15 , Issue.8 , pp. 2222-2235
    • Toda, T.1    Black, A.2    Tokuda, K.3
  • 6
    • 84906280857
    • Voice conversion in high-order eigen space using deep belief nets
    • T. Nakashika, T. Takashima, R. Takiguchi, and Y. Ariki, "Voice conversion in high-order eigen space using deep belief nets," in Proc. Interspeech, 2013, pp. 369-372.
    • (2013) Proc. Interspeech , pp. 369-372
    • Nakashika, T.1    Takashima, T.2    Takiguchi, R.3    Ariki, Y.4
  • 8
    • 84906225084
    • Joint spectral distribution modeling using restricted Boltzmann machines for voice conversion
    • L.-H. Chen, Z.-H. Ling, Y. Song, and L.-R. Dai, "Joint spectral distribution modeling using restricted Boltzmann machines for voice conversion," in Interspeech, 2013, pp. 3052-3056.
    • (2013) Interspeech , pp. 3052-3056
    • Chen, L.-H.1    Ling, Z.-H.2    Song, Y.3    Dai, L.-R.4
  • 9
    • 84905223323
    • Using bidirectional associative memories for joint spectral envelope modeling in voice conversion
    • L.-J. Liu, L.-H. Chen, Z.-H. Ling, and L.-R. Dai, "Using bidirectional associative memories for joint spectral envelope modeling in voice conversion," in Proc. ICASSP, 2014.
    • (2014) Proc. ICASSP
    • Liu, L.-J.1    Chen, L.-H.2    Ling, Z.-H.3    Dai, L.-R.4
  • 10
    • 0000329993
    • Information processing in dynamical systems: Foundations of harmony theory
    • P. Smolensky, "Information processing in dynamical systems: Foundations of harmony theory," in Parallel Distributed Processing: Explorations in the Microstructure of Cognition, D. E. Rumelhart and J. L. McClelland, Eds. Cambridge, MA, USA: MIT Press, 1986, vol. 1, ch. 6, pp. 194-281.
    • (1986) Parallel Distributed Processing: Explorations in the Microstructure of Cognition , vol.1 , pp. 194-281
    • Smolensky, P.1
  • 12
    • 0013344078
    • Training products of experts by minimizing contrastive divergence
    • G. Hinton, "Training products of experts by minimizing contrastive divergence," Neural Computation, vol. 14, no. 8, pp. 1771-1800, 2002.
    • (2002) Neural Computation , vol.14 , Issue.8 , pp. 1771-1800
    • Hinton, G.1
  • 13
    • 0033708106
    • Speech parameter generation algorithms for HMM-based speech synthesis
    • K. Tokuda, T. Yoshimura, T. Masuko, T. Kobayashi, and T. Kitamura, "Speech parameter generation algorithms for HMM-based speech synthesis," in Proc. ICASSP, vol. 3, 2000, pp. 1315-1318.
    • (2000) Proc. ICASSP , vol.3 , pp. 1315-1318
    • Tokuda, K.1    Yoshimura, T.2    Masuko, T.3    Kobayashi, T.4    Kitamura, T.5
  • 14
    • 84890447002
    • Modeling spectral envelopes using restricted Boltzmann machines for statistical parametric speech synthesis
    • Z.-H. Ling, L. Deng, and D. Yu, "Modeling spectral envelopes using restricted Boltzmann machines for statistical parametric speech synthesis," in ICASSP, 2013, pp. 7825-7829.
    • (2013) ICASSP , pp. 7825-7829
    • Ling, Z.-H.1    Deng, L.2    Yu, D.3
  • 16
    • 84901237776
    • Modeling spectral envelopes using restricted Boltzmann machines and deep belief networks for statistical parametric speech synthesis
    • Z.-H. Ling, L. Deng, and D. Yu, "Modeling spectral envelopes using restricted Boltzmann machines and deep belief networks for statistical parametric speech synthesis," Audio, Speech, and Language Processing, IEEE Transactions on, vol. 21, no. 10, pp. 2129-2139, 2013.
    • (2013) Audio, Speech, and Language Processing, IEEE Transactions on , vol.21 , Issue.10 , pp. 2129-2139
    • Ling, Z.-H.1    Deng, L.2    Yu, D.3
  • 18
    • 84867585919
    • Understanding how deep belief networks perform acoustic modelling
    • A. Mohamed, G. Hinton, and G. Penn, "Understanding how deep belief networks perform acoustic modelling," in ICASSP, March 2012, pp. 4273-4276.
    • (2012) ICASSP , pp. 4273-4276
    • Mohamed, A.1    Hinton, G.2    Penn, G.3
  • 19
    • 84874485803
    • Investigation of deep neural networks (DNN) for large vocabulary continuous speech recognition: Why DNN surpasses GMMs in acoustic modeling
    • J. Pan, C. Liu, Z. Wang, Y. Hu, and H. Jiang, "Investigation of deep neural networks (DNN) for large vocabulary continuous speech recognition: Why DNN surpasses GMMs in acoustic modeling," in Chinese Spoken Language Processing (ISCSLP), 2012 8th International Symposium on, Dec. 2012, pp. 301-305.
    • (2012) Chinese Spoken Language Processing (ISCSLP), 2012 8th International Symposium on , pp. 301-305
    • Pan, J.1    Liu, C.2    Wang, Z.3    Hu, Y.4    Jiang, H.5
  • 20
    • 0032673049
    • Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds
    • H. Kawahara, I. Masuda-Katsuse, and A. de Cheveigne, "Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds," Speech Communication, vol. 27, no. 3, pp. 187-208, 1999.
    • (1999) Speech Communication , vol.27 , Issue.3 , pp. 187-208
    • Kawahara, H.1    Masuda-Katsuse, I.2    De Cheveigne, A.3
  • 21
    • 84878387361
    • PLDA using Gaussian restricted Boltzmann machines with application to speaker verification
    • T. Stafylakis, P. Kenny, M. Senoussaoui, and P. Dumouchel, "PLDA using Gaussian restricted Boltzmann machines with application to speaker verification," in Interspeech, 2012.
    • (2012) Interspeech
    • Stafylakis, T.1    Kenny, P.2    Senoussaoui, M.3    Dumouchel, P.4


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.