



Volume 08-12-September-2016, 2016, Pages 1662-1666

ML parameter generation with a reformulated MGE training criterion - participation in the Voice Conversion Challenge 2016

Author keywords

Cepstral postfilter; Linear regression; Maximum likelihood parameter generation; Minimum generation error; Voice conversion

Indexed keywords

LINEAR REGRESSION; MATHEMATICAL TRANSFORMATIONS; MAXIMUM LIKELIHOOD; SPEECH COMMUNICATION

EID: 84994385904     PISSN: 2308457X     EISSN: 19909772     Source Type: Conference Proceeding    
DOI: 10.21437/Interspeech.2016-219     Document Type: Conference Paper
Times cited: 3

References (38)
  • 2
    • 0033154052
    • Speaker transformation algorithm using segmental codebooks (STASC)
    • L. M. Arslan, "Speaker transformation algorithm using segmental codebooks (STASC)," Speech Commun., vol. 28, no. 3, pp. 211-226, 1999.
    • (1999) Speech Commun , vol.28 , Issue.3 , pp. 211-226
    • Arslan, L.M.1
  • 3
    • 84994241109
    • Including dynamic and phonetic information in voice conversion systems
    • H. Duxans, A. Bonafonte, A. Kain, and J. P. H. van Santen, "Including dynamic and phonetic information in voice conversion systems," in Proc. Interspeech, 2004, pp. 1193-1196.
    • (2004) Proc. Interspeech , pp. 1193-1196
    • Duxans, H.1    Bonafonte, A.2    Kain, A.3    Van Santen, J.P.H.4
  • 4
    • 78049373493
    • Pronunciation variation generation for spontaneous speech synthesis using state-based voice transformation
    • C. H. Lee, C. H. Wu, and J. C. Guo, "Pronunciation variation generation for spontaneous speech synthesis using state-based voice transformation," in Proc. ICASSP, 2010, pp. 4826-4829.
    • (2010) Proc. ICASSP , pp. 4826-4829
    • Lee, C.H.1    Wu, C.H.2    Guo, J.C.3
  • 5
    • 78149260085
    • Continuous stochastic feature mapping based on trajectory HMMs
    • H. Zen, Y. Nankaku, and K. Tokuda, "Continuous stochastic feature mapping based on trajectory HMMs," IEEE Trans. Audio, Speech & Lang. Process., vol. 19, no. 2, pp. 417-430, 2011.
    • (2011) IEEE Trans. Audio, Speech & Lang. Process , vol.19 , Issue.2 , pp. 417-430
    • Zen, H.1    Nankaku, Y.2    Tokuda, K.3
  • 6
    • 0032026483
    • Continuous probabilistic transform for voice conversion
    • Y. Stylianou, O. Cappé, and E. Moulines, "Continuous probabilistic transform for voice conversion," IEEE Trans. Speech & Audio Process., vol. 6, no. 2, pp. 131-142, 1998.
    • (1998) IEEE Trans. Speech & Audio Process , vol.6 , Issue.2 , pp. 131-142
    • Stylianou, Y.1    Cappé, O.2    Moulines, E.3
  • 7
    • 0031623661
    • Spectral voice conversion for text-to-speech synthesis
    • A. Kain and M. W. Macon, "Spectral voice conversion for text-to-speech synthesis," in Proc. ICASSP, 1998, pp. 285-288.
    • (1998) Proc. ICASSP , pp. 285-288
    • Kain, A.1    Macon, M.W.2
  • 8
    • 34047254509
    • Quality-enhanced voice morphing using maximum likelihood transformations
    • H. Ye and S. J. Young, "Quality-enhanced voice morphing using maximum likelihood transformations," IEEE Trans. Audio, Speech, & Lang. Process., vol. 14, no. 4, pp. 1301-1312, 2006.
    • (2006) IEEE Trans. Audio, Speech, & Lang. Process , vol.14 , Issue.4 , pp. 1301-1312
    • Ye, H.1    Young, S.J.2
  • 9
    • 57749193836
    • Voice conversion based on maximum-likelihood estimation of spectral parameter trajectory
    • T. Toda, A. Black, and K. Tokuda, "Voice conversion based on maximum-likelihood estimation of spectral parameter trajectory," IEEE Trans. Audio, Speech, & Lang. Process., vol. 15, no. 8, pp. 2222-2235, 2007.
    • (2007) IEEE Trans. Audio, Speech, & Lang. Process , vol.15 , Issue.8 , pp. 2222-2235
    • Toda, T.1    Black, A.2    Tokuda, K.3
  • 11
    • 84865754815
    • Voice conversion using GMM with enhanced global variance
    • H. Benisty and D. Malah, "Voice conversion using GMM with enhanced global variance," in Proc. Interspeech, 2011, pp. 669-672.
    • (2011) Proc. Interspeech , pp. 669-672
    • Benisty, H.1    Malah, D.2
  • 12
    • 84890539284
    • Voice conversion based on Gaussian processes by coherent and asymmetric training with limited training data
    • N. Xu, Y. Tang, J. Bao, A. Jiang, X. Liu, and Z. Yang, "Voice conversion based on Gaussian processes by coherent and asymmetric training with limited training data," Speech Commun., vol. 58, pp. 124-138, 2014.
    • (2014) Speech Commun , vol.58 , pp. 124-138
    • Xu, N.1    Tang, Y.2    Bao, J.3    Jiang, A.4    Liu, X.5    Yang, Z.6
  • 13
    • 0029254176
    • Transformation of formants for voice conversion using artificial neural networks
    • M. Narendranath, H. A. Murthy, S. Rajendran, and B. Yegnanarayana, "Transformation of formants for voice conversion using artificial neural networks," Speech Commun., vol. 16, no. 2, pp. 207-216, 1995.
    • (1995) Speech Commun , vol.16 , Issue.2 , pp. 207-216
    • Narendranath, M.1    Murthy, H.A.2    Rajendran, S.3    Yegnanarayana, B.4
  • 15
    • 84946685887
    • Voice conversion using deep neural networks with speaker-independent pre-training
    • S. H. Mohammadi and A. Kain, "Voice conversion using deep neural networks with speaker-independent pre-training," in Proc. IEEE SLT, 2014, pp. 19-23.
    • (2014) Proc. IEEE SLT , pp. 19-23
    • Mohammadi, S.H.1    Kain, A.2
  • 16
    • 85032750981
    • Deep learning for acoustic modeling in parametric speech generation: A systematic review of existing techniques and future trends
    • Z.-H. Ling, S.-Y. Kang, H. Zen, A. Senior, M. Schuster, X.-J. Qian, H. Meng, and L. Deng, "Deep learning for acoustic modeling in parametric speech generation: A systematic review of existing techniques and future trends," IEEE Signal Process. Mag., vol. 32, no. 3, pp. 35-52, 2015.
    • (2015) IEEE Signal Process. Mag , vol.32 , Issue.3 , pp. 35-52
    • Ling, Z.-H.1    Kang, S.-Y.2    Zen, H.3    Senior, A.4    Schuster, M.5    Qian, X.-J.6    Meng, H.7    Deng, L.8
  • 17
    • 33646900967
    • Voice conversion based on piecewise linear conversion rules of formant frequency and spectrum tilt
    • H. Mizuno and M. Abe, "Voice conversion based on piecewise linear conversion rules of formant frequency and spectrum tilt," in Proc. ICASSP, 1994, pp. 469-472.
    • (1994) Proc. ICASSP , pp. 469-472
    • Mizuno, H.1    Abe, M.2
  • 18
    • 4544361661
    • Voice conversion through transformation of spectral and intonation features
    • D. Rentzos, S. Vaseghi, Q. Yan, and C.-H. Ho, "Voice conversion through transformation of spectral and intonation features," in Proc. ICASSP, 2004, pp. 21-24.
    • (2004) Proc. ICASSP , pp. 21-24
    • Rentzos, D.1    Vaseghi, S.2    Yan, Q.3    Ho, C.-H.4
  • 19
    • 0026880275
    • Voice transformation using PSOLA technique
    • H. Valbret, E. Moulines, and J. Tubach, "Voice transformation using PSOLA technique," Speech Commun., vol. 11, no. 2-3, pp. 175-187, 1992.
    • (1992) Speech Commun , vol.11 , Issue.2-3 , pp. 175-187
    • Valbret, H.1    Moulines, E.2    Tubach, J.3
  • 20
    • 33745209887
    • Evaluation of VTLN-based voice conversion for embedded speech synthesis
    • D. Suendermann, G. Strecha, A. Bonafonte, H. Hoege, and H. Ney, "Evaluation of VTLN-based voice conversion for embedded speech synthesis," in Proc. Interspeech, 2005, pp. 2593-2596.
    • (2005) Proc. Interspeech , pp. 2593-2596
    • Suendermann, D.1    Strecha, G.2    Bonafonte, A.3    Hoege, H.4    Ney, H.5
  • 22
    • 80051619373
    • One sentence voice adaptation using GMM-based frequency-warping and shift with a sub-band basis spectrum model
    • M. Tamura, M. Morita, T. Kagoshima, and M. Akamine, "One sentence voice adaptation using GMM-based frequency-warping and shift with a sub-band basis spectrum model," in Proc. ICASSP, 2011, pp. 5124-5127.
    • (2011) Proc. ICASSP , pp. 5124-5127
    • Tamura, M.1    Morita, M.2    Kagoshima, T.3    Akamine, M.4
  • 23
    • 84857498745
    • Voice conversion using dynamic frequency warping with amplitude scaling, for parallel or nonparallel corpora
    • E. Godoy, O. Rosec, and T. Chonavel, "Voice conversion using dynamic frequency warping with amplitude scaling, for parallel or nonparallel corpora," IEEE Trans. Audio, Speech & Lang. Process., vol. 20, no. 4, pp. 1313-1323, 2012.
    • (2012) IEEE Trans. Audio, Speech & Lang. Process , vol.20 , Issue.4 , pp. 1313-1323
    • Godoy, E.1    Rosec, O.2    Chonavel, T.3
  • 25
    • 84896464538
    • A unit selection approach for voice transformation
    • K.-S. Lee, "A unit selection approach for voice transformation," Speech Commun., vol. 60, pp. 30-43, 2014.
    • (2014) Speech Commun , vol.60 , pp. 30-43
    • Lee, K.-S.1
  • 26
    • 84911369131
    • Exemplar-based sparse representation with residual compensation for voice conversion
    • Z. Wu, T. Virtanen, E. Chng, and H. Li, "Exemplar-based sparse representation with residual compensation for voice conversion," IEEE/ACM Trans. Audio, Speech, & Lang. Process., vol. 22, no. 10, pp. 1506-1521, 2014.
    • (2014) IEEE/ACM Trans. Audio, Speech, & Lang. Process , vol.22 , Issue.10 , pp. 1506-1521
    • Wu, Z.1    Virtanen, T.2    Chng, E.3    Li, H.4
  • 27
    • 0034842552
    • Voice conversion algorithm based on Gaussian mixture model with dynamic frequency warping of STRAIGHT spectrum
    • T. Toda, H. Saruwatari, and K. Shikano, "Voice conversion algorithm based on Gaussian mixture model with dynamic frequency warping of STRAIGHT spectrum," in Proc. ICASSP, 2001, pp. 841-844.
    • (2001) Proc. ICASSP , pp. 841-844
    • Toda, T.1    Saruwatari, H.2    Shikano, K.3
  • 30
    • 51449112440
    • Voice conversion by combining frequency warping with unit selection
    • Z. Shuang, F. Meng, and Y. Qin, "Voice conversion by combining frequency warping with unit selection," in Proc. ICASSP, 2008, pp. 4661-4664.
    • (2008) Proc. ICASSP , pp. 4661-4664
    • Shuang, Z.1    Meng, F.2    Qin, Y.3
  • 32
    • 84872177757
    • Parametric voice conversion based on bilinear frequency warping plus amplitude scaling
    • D. Erro, E. Navas, and I. Hernaez, "Parametric voice conversion based on bilinear frequency warping plus amplitude scaling," IEEE Trans. Audio, Speech, & Lang. Process., vol. 21, no. 3, pp. 556-566, 2013.
    • (2013) IEEE Trans. Audio, Speech, & Lang. Process , vol.21 , Issue.3 , pp. 556-566
    • Erro, D.1    Navas, E.2    Hernaez, I.3
  • 34
    • 70450183499
    • An improved minimum generation error based model adaptation for HMM-based speech synthesis
    • Y. Wu, L. Qin, and K. Tokuda, "An improved minimum generation error based model adaptation for HMM-based speech synthesis," in Proc. Interspeech, 2009, pp. 1787-1790.
    • (2009) Proc. Interspeech , pp. 1787-1790
    • Wu, Y.1    Qin, L.2    Tokuda, K.3
  • 35
    • 84897865577
    • Harmonics plus noise model based vocoder for statistical parametric speech synthesis
    • D. Erro, I. Sainz, E. Navas, and I. Hernáez, "Harmonics plus noise model based vocoder for statistical parametric speech synthesis," IEEE Journal Sel. Topics in Signal Process., vol. 8, no. 2, pp. 184-194, 2014.
    • (2014) IEEE Journal Sel. Topics in Signal Process , vol.8 , Issue.2 , pp. 184-194
    • Erro, D.1    Sainz, I.2    Navas, E.3    Hernáez, I.4
  • 36
    • 84878409257
    • Iterative MMSE estimation of vocal tract length normalization factors for voice transformation
    • D. Erro, E. Navas, and I. Hernáez, "Iterative MMSE estimation of vocal tract length normalization factors for voice transformation," in Proc. Interspeech, 2012, pp. 86-89.
    • (2012) Proc. Interspeech , pp. 86-89
    • Erro, D.1    Navas, E.2    Hernáez, I.3
  • 37
    • 85009387557
    • Two-band radial postfiltering in cepstral domain with application to speech synthesis
    • D. Erro, "Two-band radial postfiltering in cepstral domain with application to speech synthesis," IEEE Signal Process. Lett., vol. 23, no. 2, pp. 202-206, 2016.
    • (2016) IEEE Signal Process. Lett , vol.23 , Issue.2 , pp. 202-206
    • Erro, D.1
  • 38
    • 84959095814
    • Intelligibility enhancement of casual speech for reverberant environments inspired by clear speech properties
    • M. Koutsogiannaki, P. N. Petkov, and Y. Stylianou, "Intelligibility enhancement of casual speech for reverberant environments inspired by clear speech properties," in Proc. Interspeech, 2015, pp. 65-69.
    • (2015) Proc. Interspeech , pp. 65-69
    • Koutsogiannaki, M.1    Petkov, P.N.2    Stylianou, Y.3


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.