SCOPUS Information Search Platform

Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH

Volume 08-12-September-2016, 2016, Pages 1662-1666

ML parameter generation with a reformulated MGE training criterion - participation in the Voice Conversion Challenge 2016

(11) Erro, D. (a,b); Alonso, A. (a); Serrano, L. (a); Tavarez, D. (a); Odriozola, I. (a); Sarasola, X. (a); Del Blanco, E. (a); Sanchez, J. (a); Saratxaga, I. (a); Navas, E. (a); Hernaez, I. (a)

a UNIVERSITY OF THE BASQUE COUNTRY UPV EHU (Spain)

b BASQUE FOUNDATION FOR SCIENCE (Spain)

Author keywords

Cepstral postfilter; Linear regression; Maximum likelihood parameter generation; Minimum generation error; Voice conversion

Indexed keywords

LINEAR REGRESSION; MATHEMATICAL TRANSFORMATIONS; MAXIMUM LIKELIHOOD; SPEECH COMMUNICATION;

DYNAMIC TIME WARPING; FUNDAMENTAL FREQUENCIES; GENERATION ALGORITHM; MINIMUM GENERATION ERRORS; POSTFILTERS; SOFT CLASSIFICATION; SPEAKER INDEPENDENTS; VOICE CONVERSION;

SPEECH PROCESSING;

EID: 84994385904 PISSN: 2308457X EISSN: 19909772 Source Type: Conference Proceeding
DOI: 10.21437/Interspeech.2016-219 Document Type: Conference Paper

Times cited : (3)

References (38)

1
- 0023739214
- Voice conversion through vector quantization
- M. Abe, S. Nakamura, K. Shikano, and H. Kuwabara, "Voice conversion through vector quantization," in Proc. ICASSP, 1988, pp. 655-658.
- (1988) Proc. ICASSP , pp. 655-658
- Abe, M.¹ Nakamura, S.² Shikano, K.³ Kuwabara, H.⁴

2
- 0033154052
- Speaker transformation algorithm using segmental codebooks (STASC)
- L. M. Arslan, "Speaker transformation algorithm using segmental codebooks (STASC)," Speech Commun., vol. 28, no. 3, pp. 211-226, 1999.
- (1999) Speech Commun , vol.28 , Issue.3 , pp. 211-226
- Arslan, L.M.¹

3
- 84994241109
- Including dynamic and phonetic information in voice conversion systems
- H. Duxans, A. Bonafonte, A. Kain, and J. P. H. van Santen, "Including dynamic and phonetic information in voice conversion systems," in Proc. Interspeech, 2004, pp. 1193-1196.
- (2004) Proc. Interspeech , pp. 1193-1196
- Duxans, H.¹ Bonafonte, A.² Kain, A.³ Van Santen, J.P.H.⁴

4
- 78049373493
- Pronunciation variation generation for spontaneous speech synthesis using state-based voice transformation
- C. H. Lee, C. H. Wu, and J. C. Guo, "Pronunciation variation generation for spontaneous speech synthesis using state-based voice transformation," in Proc. ICASSP, 2010, pp. 4826-4829.
- (2010) Proc. ICASSP , pp. 4826-4829
- Lee, C.H.¹ Wu, C.H.² Guo, J.C.³

5
- 78149260085
- Continuous stochastic feature mapping based on trajectory HMMs
- H. Zen, Y. Nankaku, and K. Tokuda, "Continuous stochastic feature mapping based on trajectory HMMs," IEEE Trans. Audio, Speech & Lang. Process., vol. 19, no. 2, pp. 417-430, 2011.
- (2011) IEEE Trans. Audio, Speech & Lang. Process , vol.19 , Issue.2 , pp. 417-430
- Zen, H.¹ Nankaku, Y.² Tokuda, K.³

6
- 0032026483
- Continuous probabilistic transform for voice conversion
- Y. Stylianou, O. Cappé, and E. Moulines, "Continuous probabilistic transform for voice conversion," IEEE Trans. Speech & Audio Process., vol. 6, no. 2, pp. 131-142, 1998.
- (1998) IEEE Trans. Speech & Audio Process , vol.6 , Issue.2 , pp. 131-142
- Stylianou, Y.¹ Cappé, O.² Moulines, E.³

7
- 0031623661
- Spectral voice conversion for text-to-speech synthesis
- A. Kain and M. W. Macon, "Spectral voice conversion for text-to-speech synthesis," in Proc. ICASSP, 1998, pp. 285-288.
- (1998) Proc. ICASSP , pp. 285-288
- Kain, A.¹ Macon, M.W.²

8
- 34047254509
- Quality-enhanced voice morphing using maximum likelihood transformations
- H. Ye and S. J. Young, "Quality-enhanced voice morphing using maximum likelihood transformations," IEEE Trans. Audio, Speech, & Lang. Process., vol. 14, no. 4, pp. 1301-1312, 2006.
- (2006) IEEE Trans. Audio, Speech, & Lang. Process , vol.14 , Issue.4 , pp. 1301-1312
- Ye, H.¹ Young, S.J.²

9
- 57749193836
- Voice conversion based on maximum-likelihood estimation of spectral parameter trajectory
- T. Toda, A. Black, and K. Tokuda, "Voice conversion based on maximum-likelihood estimation of spectral parameter trajectory," IEEE Trans. Audio, Speech, & Lang. Process., vol. 15, no. 8, pp. 2222-2235, 2007.
- (2007) IEEE Trans. Audio, Speech, & Lang. Process , vol.15 , Issue.8 , pp. 2222-2235
- Toda, T.¹ Black, A.² Tokuda, K.³

10
- 77953712499
- Voice conversion using partial least squares regression
- E. Helander, T. Virtanen, J. Nurminen, and M. Gabbouj, "Voice conversion using partial least squares regression," IEEE Trans. Audio, Speech & Language Process., vol. 18, no. 5, pp. 912-921, 2010.
- (2010) IEEE Trans. Audio, Speech & Language Process , vol.18 , Issue.5 , pp. 912-921
- Helander, E.¹ Virtanen, T.² Nurminen, J.³ Gabbouj, M.⁴

11
- 84865754815
- Voice conversion using GMM with enhanced global variance
- H. Benisty and D. Malah, "Voice conversion using GMM with enhanced global variance," in Proc. Interspeech, 2011, pp. 669-672.
- (2011) Proc. Interspeech , pp. 669-672
- Benisty, H.¹ Malah, D.²

12
- 84890539284
- Voice conversion based on Gaussian processes by coherent and asymmetric training with limited training data
- N. Xu, Y. Tang, J. Bao, A. Jiang, X. Liu, and Z. Yang, "Voice conversion based on Gaussian processes by coherent and asymmetric training with limited training data," Speech Commun., vol. 58, pp. 124-138, 2014.
- (2014) Speech Commun , vol.58 , pp. 124-138
- Xu, N.¹ Tang, Y.² Bao, J.³ Jiang, A.⁴ Liu, X.⁵ Yang, Z.⁶

13
- 0029254176
- Transformation of formants for voice conversion using artificial neural networks
- M. Narendranath, H. A. Murthy, S. Rajendran, and B. Yegnanarayana, "Transformation of formants for voice conversion using artificial neural networks," Speech Commun., vol. 16, no. 2, pp. 207-216, 1995.
- (1995) Speech Commun , vol.16 , Issue.2 , pp. 207-216
- Narendranath, M.¹ Murthy, H.A.² Rajendran, S.³ Yegnanarayana, B.⁴

14
- 77953707533
- Spectral mapping using artificial neural networks for voice conversion
- S. Desai, A. W. Black, B. Yegnanarayana, and K. Prahallad, "Spectral mapping using artificial neural networks for voice conversion," IEEE Trans. Audio, Speech, and Language Process., vol. 18, no. 5, pp. 954-964, 2010.
- (2010) IEEE Trans. Audio, Speech, and Language Process , vol.18 , Issue.5 , pp. 954-964
- Desai, S.¹ Black, A.W.² Yegnanarayana, B.³ Prahallad, K.⁴

15
- 84946685887
- Voice conversion using deep neural networks with speaker-independent pre-training
- S. H. Mohammadi and A. Kain, "Voice conversion using deep neural networks with speaker-independent pre-training," in Proc. IEEE SLT, 2014, pp. 19-23.
- (2014) Proc. IEEE SLT , pp. 19-23
- Mohammadi, S.H.¹ Kain, A.²

16
- 85032750981
- Deep learning for acoustic modeling in parametric speech generation: A systematic review of existing techniques and future trends
- Z.-H. Ling, S.-Y. Kang, H. Zen, A. Senior, M. Schuster, X.-J. Qian, H. Meng, and L. Deng, "Deep learning for acoustic modeling in parametric speech generation: A systematic review of existing techniques and future trends," IEEE Signal Process. Mag., vol. 32, no. 3, pp. 35-52, 2015.
- (2015) IEEE Signal Process. Mag , vol.32 , Issue.3 , pp. 35-52
- Ling, Z.-H.¹ Kang, S.-Y.² Zen, H.³ Senior, A.⁴ Schuster, M.⁵ Qian, X.-J.⁶ Meng, H.⁷ Deng, L.⁸

17
- 33646900967
- Voice conversion based on piecewise linear conversion rules of formant frequency and spectrum tilt
- H. Mizuno and M. Abe, "Voice conversion based on piecewise linear conversion rules of formant frequency and spectrum tilt," in Proc. ICASSP, 1994, pp. 469-472.
- (1994) Proc. ICASSP , pp. 469-472
- Mizuno, H.¹ Abe, M.²

18
- 4544361661
- Voice conversion through transformation of spectral and intonation features
- D. Rentzos, S. Vaseghi, Q. Yan, and C.-H. Ho, "Voice conversion through transformation of spectral and intonation features," in Proc. ICASSP, 2004, pp. 21-24.
- (2004) Proc. ICASSP , pp. 21-24
- Rentzos, D.¹ Vaseghi, S.² Yan, Q.³ Ho, C.-H.⁴

19
- 0026880275
- Voice transformation using PSOLA technique
- H. Valbret, E. Moulines, and J. Tubach, "Voice transformation using PSOLA technique," Speech Commun., vol. 11, no. 2-3, pp. 175-187, 1992.
- (1992) Speech Commun , vol.11 , Issue.2-3 , pp. 175-187
- Valbret, H.¹ Moulines, E.² Tubach, J.³

20
- 33745209887
- Evaluation of VTLN-based voice conversion for embedded speech synthesis
- D. Suendermann, G. Strecha, A. Bonafonte, H. Hoege, and H. Ney, "Evaluation of VTLN-based voice conversion for embedded speech synthesis," in Proc. Interspeech, 2005, pp. 2593-2596.
- (2005) Proc. Interspeech , pp. 2593-2596
- Suendermann, D.¹ Strecha, G.² Bonafonte, A.³ Hoege, H.⁴ Ney, H.⁵

21
- 34547507542
- Frequency warping based on mapping formant parameters
- Z. Shuang, R. Bakis, S. Shechtman, D. Chazan, and Y. Qin, "Frequency warping based on mapping formant parameters," in Proc. Interspeech, 2006, pp. 2290-2293.
- (2006) Proc. Interspeech , pp. 2290-2293
- Shuang, Z.¹ Bakis, R.² Shechtman, S.³ Chazan, D.⁴ Qin, Y.⁵

22
- 80051619373
- One sentence voice adaptation using GMM-based frequency-warping and shift with a sub-band basis spectrum model
- M. Tamura, M. Morita, T. Kagoshima, and M. Akamine, "One sentence voice adaptation using GMM-based frequency-warping and shift with a sub-band basis spectrum model," in Proc. ICASSP, 2011, pp. 5124-5127.
- (2011) Proc. ICASSP , pp. 5124-5127
- Tamura, M.¹ Morita, M.² Kagoshima, T.³ Akamine, M.⁴

23
- 84857498745
- Voice conversion using dynamic frequency warping with amplitude scaling, for parallel or nonparallel corpora
- E. Godoy, O. Rosec, and T. Chonavel, "Voice conversion using dynamic frequency warping with amplitude scaling, for parallel or nonparallel corpora," IEEE Trans. Audio, Speech & Lang. Process., vol. 20, no. 4, pp. 1313-1323, 2012.
- (2012) IEEE Trans. Audio, Speech & Lang. Process , vol.20 , Issue.4 , pp. 1313-1323
- Godoy, E.¹ Rosec, O.² Chonavel, T.³

24
- 34547496196
- Towards a voice conversion system based on frame selection
- T. Dutoit, A. Holzapfel, M. Jottrand, A. Moinet, J. Perez, and Y. Stylianou, "Towards a voice conversion system based on frame selection," in Proc. ICASSP, 2007, pp. 513-516.
- (2007) Proc. ICASSP , pp. 513-516
- Dutoit, T.¹ Holzapfel, A.² Jottrand, M.³ Moinet, A.⁴ Perez, J.⁵ Stylianou, Y.⁶

25
- 84896464538
- A unit selection approach for voice transformation
- K.-S. Lee, "A unit selection approach for voice transformation," Speech Commun., vol. 60, pp. 30-43, 2014.
- (2014) Speech Commun , vol.60 , pp. 30-43
- Lee, K.-S.¹

26
- 84911369131
- Exemplar-based sparse representation with residual compensation for voice conversion
- Z. Wu, T. Virtanen, E. Chng, and H. Li, "Exemplar-based sparse representation with residual compensation for voice conversion," IEEE/ACM Trans. Audio, Speech, & Lang. Process., vol. 22, no. 10, pp. 1506-1521, 2014.
- (2014) IEEE/ACM Trans. Audio, Speech, & Lang. Process , vol.22 , Issue.10 , pp. 1506-1521
- Wu, Z.¹ Virtanen, T.² Chng, E.³ Li, H.⁴

27
- 0034842552
- Voice conversion algorithm based on Gaussian mixture model with dynamic frequency warping of STRAIGHT spectrum
- T. Toda, H. Saruwatari, and K. Shikano, "Voice conversion algorithm based on Gaussian mixture model with dynamic frequency warping of STRAIGHT spectrum," in Proc. ICASSP, 2001, pp. 841-844.
- (2001) Proc. ICASSP , pp. 841-844
- Toda, T.¹ Saruwatari, H.² Shikano, K.³

28
- 84959163883
- System fusion for high-performance voice conversion
- X. Tian, Z. Wu, S. W. Lee, N. Q. Hy, M. Dong, and E. Chng, "System fusion for high-performance voice conversion," in Proc. Interspeech, 2015, pp. 2759-2763.
- (2015) Proc. Interspeech , pp. 2759-2763
- Tian, X.¹ Wu, Z.² Lee, S.W.³ Hy, N.Q.⁴ Dong, M.⁵ Chng, E.⁶

29
- 33947623206
- Text-independent voice conversion based on unit selection
- D. Suendermann, H. Hoege, A. Bonafonte, H. Ney, A. W. Black, and S. S. Narayanan, "Text-independent voice conversion based on unit selection," in Proc. ICASSP, 2006, pp. 81-84.
- (2006) Proc. ICASSP , pp. 81-84
- Suendermann, D.¹ Hoege, H.² Bonafonte, A.³ Ney, H.⁴ Black, A.W.⁵ Narayanan, S.S.⁶

30
- 51449112440
- Voice conversion by combining frequency warping with unit selection
- Z. Shuang, F. Meng, and Y. Qin, "Voice conversion by combining frequency warping with unit selection," in Proc. ICASSP, 2008, pp. 4661-4664.
- (2008) Proc. ICASSP , pp. 4661-4664
- Shuang, Z.¹ Meng, F.² Qin, Y.³

31
- 77953727123
- Voice conversion based on weighted frequency warping
- D. Erro, A. Moreno, and A. Bonafonte, "Voice conversion based on weighted frequency warping," IEEE Trans. Audio, Speech, and Language Process., vol. 18, no. 5, pp. 922-931, 2010.
- (2010) IEEE Trans. Audio, Speech, and Language Process , vol.18 , Issue.5 , pp. 922-931
- Erro, D.¹ Moreno, A.² Bonafonte, A.³

32
- 84872177757
- Parametric voice conversion based on bilinear frequency warping plus amplitude scaling
- D. Erro, E. Navas, and I. Hernaez, "Parametric voice conversion based on bilinear frequency warping plus amplitude scaling," IEEE Trans. Audio, Speech, & Lang. Process., vol. 21, no. 3, pp. 556-566, 2013.
- (2013) IEEE Trans. Audio, Speech, & Lang. Process , vol.21 , Issue.3 , pp. 556-566
- Erro, D.¹ Navas, E.² Hernaez, I.³

33
- 84994361374
- The Voice Conversion Challenge 2016
- T. Toda, L. H. Chen, D. Saito, F. Villavicencio, M. Wester, Z. Wu, and J. Yamagishi, "The Voice Conversion Challenge 2016," in Proc. Interspeech, 2016.
- (2016) Proc. Interspeech
- Toda, T.¹ Chen, L.H.² Saito, D.³ Villavicencio, F.⁴ Wester, M.⁵ Wu, Z.⁶ Yamagishi, J.⁷

34
- 70450183499
- An improved minimum generation error based model adaptation for HMM-based speech synthesis
- Y. Wu, L. Qin, and K. Tokuda, "An improved minimum generation error based model adaptation for HMM-based speech synthesis," in Proc. Interspeech, 2009, pp. 1787-1790.
- (2009) Proc. Interspeech , pp. 1787-1790
- Wu, Y.¹ Qin, L.² Tokuda, K.³

35
- 84897865577
- Harmonics plus noise model based vocoder for statistical parametric speech synthesis
- D. Erro, I. Sainz, E. Navas, and I. Hernáez, "Harmonics plus noise model based vocoder for statistical parametric speech synthesis," IEEE Journal Sel. Topics in Signal Process., vol. 8, no. 2, pp. 184-194, 2014.
- (2014) IEEE Journal Sel. Topics in Signal Process , vol.8 , Issue.2 , pp. 184-194
- Erro, D.¹ Sainz, I.² Navas, E.³ Hernáez, I.⁴

36
- 84878409257
- Iterative MMSE estimation of vocal tract length normalization factors for voice transformation
- D. Erro, E. Navas, and I. Hernáez, "Iterative MMSE estimation of vocal tract length normalization factors for voice transformation," in Proc. Interspeech, 2012, pp. 86-89.
- (2012) Proc. Interspeech , pp. 86-89
- Erro, D.¹ Navas, E.² Hernáez, I.³

37
- 85009387557
- Two-band radial postfiltering in cepstral domain with application to speech synthesis
- D. Erro, "Two-band radial postfiltering in cepstral domain with application to speech synthesis," IEEE Signal Process. Lett., vol. 23, no. 2, pp. 202-206, 2016.
- (2016) IEEE Signal Process. Lett , vol.23 , Issue.2 , pp. 202-206
- Erro, D.¹

38
- 84959095814
- Intelligibility enhancement of casual speech for reverberant environments inspired by clear speech properties
- M. Koutsogiannaki, P. N. Petkov, and Y. Stylianou, "Intelligibility enhancement of casual speech for reverberant environments inspired by clear speech properties," in Proc. Interspeech, 2015, pp. 65-69.
- (2015) Proc. Interspeech , pp. 65-69
- Koutsogiannaki, M.¹ Petkov, P.N.² Stylianou, Y.³

* This information was extracted by KISTI through analysis of Elsevier's SCOPUS DB.