SCOPUS Information Search Platform

9th ISCA Speech Synthesis Workshop, SSW 2016

Volume , Issue , 2016, Pages 44-51

An Automatic Voice Conversion Evaluation Strategy Based on Perceptual Background Noise Distortion and Speaker Similarity

(13) Huang, Dong Yan a; Xie, Lei b; Lee, Yvonne Siu Wa a; Wu, Jie b; Ming, Huaiping a; Tian, Xiaohai c; Zhang, Shaofei b; Ding, Chuang b; Li, Mei b; Nguyen, Quy Hy c; Dong, Minghui a; Chng, Eng Siong c; Li, Haizhou a

a INSTITUTE FOR INFOCOMM RESEARCH (Singapore)

b NORTHWESTERN POLYTECHNICAL UNIVERSITY (China)

c NANYANG TECHNOLOGICAL UNIVERSITY (Singapore)

Author keywords

objective measures; speaker similarity score; speech quality assessment; subjective listening tests; Voice conversion

Indexed keywords

ACOUSTIC NOISE; SPEECH COMMUNICATION; SPEECH ENHANCEMENT; SPEECH SYNTHESIS;

BACKGROUND NOISE; EVALUATION STRATEGIES; LANGUAGE CONTENT; NOISE DISTORTIONS; OBJECTIVE MEASURE; SIMILARITY SCORES; SPEAKER SIMILARITY SCORE; SPEECH QUALITY ASSESSMENT; SUBJECTIVE LISTENING TEST; VOICE CONVERSION;

QUALITY CONTROL;

EID: 85075288991 PISSN: None EISSN: None Source Type: Conference Proceeding
DOI: None Document Type: Conference Paper

Times cited : (7)

References (34)

1
- 84946065618
- Voice conversion
- J. Nurminen, H. Silen, and V. Popa, "Voice conversion," Speech Enhancement, Modeling and Recognition-Algorithms and Applications, pp. 69-94, 2012.
- (2012) Speech Enhancement, Modeling and Recognition-Algorithms and Applications , pp. 69-94
- Nurminen, J.¹ Silen, H.² Popa, V.³

2
- 0023739214
- Voice conversion through vector quantization
- Apr 1988
- M. Abe, S. Nakamura, K. Shikano, and H. Kuwabara, "Voice conversion through vector quantization," in International Conference on Acoustics, Speech, and Signal Processing, 1988, Apr 1988, pp. 655-658 vol.1.
- (1988) International Conference on Acoustics, Speech, and Signal Processing , vol.1 , pp. 655-658
- Abe, M.¹ Nakamura, S.² Shikano, K.³ Kuwabara, H.⁴

3
- 0032026483
- Continuous probabilistic transform for voice conversion
- Mar
- Y. Stylianou, O. Cappe, and E. Moulines, "Continuous probabilistic transform for voice conversion," IEEE Transactions on Speech and Audio Processing, vol. 6, no. 2, pp. 131-142, Mar 1998.
- (1998) IEEE Transactions on Speech and Audio Processing , vol.6 , Issue.2 , pp. 131-142
- Stylianou, Y.¹ Cappe, O.² Moulines, E.³

4
- 57749193836
- Voice conversion based on maximum-likelihood estimation of spectral parameter trajectory
- T. Toda, A.W. Black, and K. Tokuda, "Voice conversion based on maximum-likelihood estimation of spectral parameter trajectory," Audio, Speech, and Language Processing, IEEE Transactions on, vol. 15, no. 8, pp. 2222-2235, 2007.
- (2007) Audio, Speech, and Language Processing, IEEE Transactions on , vol.15 , Issue.8 , pp. 2222-2235
- Toda, T.¹ Black, A.W.² Tokuda, K.³

5
- 84946033919
- Modulation spectrum-constrained trajectory training algorithm for GMM-based voice conversion
- South Brisbane, Queensland, Australia, April 19-24, 2015
- S. Takamichi, T. Toda, A. W. Black, and S. Nakamura, "Modulation spectrum-constrained trajectory training algorithm for GMM-based voice conversion," in 2015 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2015, South Brisbane, Queensland, Australia, April 19-24, 2015, pp. 4859-4863.
- (2015) 2015 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2015 , pp. 4859-4863
- Takamichi, S.¹ Toda, T.² Black, A. W.³ Nakamura, S.⁴

6
- 84856141218
- Voice conversion using dynamic kernel partial least squares regression
- March
- E. Helander, H. Silen, T. Virtanen, and M. Gabbouj, "Voice conversion using dynamic kernel partial least squares regression," IEEE Transactions on Audio, Speech, and Language Processing, vol. 20, no. 3, pp. 806-817, March 2012.
- (2012) IEEE Transactions on Audio, Speech, and Language Processing , vol.20 , Issue.3 , pp. 806-817
- Helander, E.¹ Silen, H.² Virtanen, T.³ Gabbouj, M.⁴

7
- 84888226596
- A dynamic Gaussian process for voice conversion
- July
- D.-Y. Huang, M. Dong, and H. Li, "A dynamic Gaussian process for voice conversion," in Multimedia and Expo Workshops (ICMEW), 2013 IEEE International Conference on, July 2013, pp. 1-4.
- (2013) Multimedia and Expo Workshops (ICMEW), 2013 IEEE International Conference on , pp. 1-4
- Huang, D.-Y.¹ Dong, M.² Li, H.³

8
- 84911369131
- Exemplar-based sparse representation with residual compensation for voice conversion
- Z. Wu, T. Virtanen, E. Chng, and H. Li, "Exemplar-based sparse representation with residual compensation for voice conversion," IEEE/ACM Trans. Audio, Speech & Language Processing, vol. 22, pp. 1506-1521, 2014.
- (2014) IEEE/ACM Trans. Audio, Speech & Language Processing , vol.22 , pp. 1506-1521
- Wu, Z.¹ Virtanen, T.² Chng, E.³ Li, H.⁴

9
- 70349197691
- Voice conversion using artificial neural networks
- April
- S. Desai, E. V. Raghavendra, B. Yegnanarayana, A.W. Black, and K. Prahallad, "Voice conversion using artificial neural networks," in 2009 IEEE International Conference on Acoustics, Speech and Signal Processing, April 2009, pp. 3893-3896.
- (2009) 2009 IEEE International Conference on Acoustics, Speech and Signal Processing , pp. 3893-3896
- Desai, S.¹ Raghavendra, E. V.² Yegnanarayana, B.³ Black, A.W.⁴ Prahallad, K.⁵

10
- 84921735339
- Voice conversion using deep neural networks with layer-wise generative training
- Dec
- L. H. Chen, Z. H. Ling, L. J. Liu, and L. R. Dai, "Voice conversion using deep neural networks with layer-wise generative training," IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 22, no. 12, pp. 1859-1872, Dec 2014.
- (2014) IEEE/ACM Transactions on Audio, Speech, and Language Processing , vol.22 , Issue.12 , pp. 1859-1872
- Chen, L. H.¹ Ling, Z. H.² Liu, L. J.³ Dai, L. R.⁴

11
- 84946027999
- Voice conversion using deep bidirectional long short-term memory based recurrent neural networks
- April
- L. Sun, S. Kang, K. Li, and H. Meng, "Voice conversion using deep bidirectional long short-term memory based recurrent neural networks," in 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), April 2015, pp. 4869-4873.
- (2015) 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) , pp. 4869-4873
- Sun, L.¹ Kang, S.² Li, K.³ Meng, H.⁴

12
- 0003639435
- International Telecommunications Union, Geneva, Switzerland
- "Perceptual evaluation of speech quality (PESQ), an objective method for end-to-end speech quality assessment of 3.1 kHz handset telephony (narrow-band) networks and speech codecs," International Telecommunications Union, Geneva, Switzerland, 2001.
- (2001) Perceptual evaluation of speech quality (PESQ), an objective method for end-to-end speech quality assessment of 3.1 kHz handset telephony (narrow-band) networks and speech codecs

13
- 44149106061
- Evaluation of objective quality measures for speech enhancement
- Jan
- Y. Hu and P. C. Loizou, "Evaluation of objective quality measures for speech enhancement," IEEE Transactions on Audio, Speech, and Language Processing, vol. 16, no. 1, pp. 229-238, Jan 2008.
- (2008) IEEE Transactions on Audio, Speech, and Language Processing , vol.16 , Issue.1 , pp. 229-238
- Hu, Y.¹ Loizou, P. C.²

14
- 84866873313
- Prediction of perceived sound quality of synthetic speech
- Xi'an, China
- D.-Y. Huang, "Prediction of perceived sound quality of synthetic speech," in Proc. APSIPA ASC, Xi'an, China, 2011.
- (2011) Proc. APSIPA ASC
- Huang, D.-Y.¹

15
- 85133202315
- Rapid computation of i-vector
- L. Xu, K. A. Lee, H. Li, and Z. Yang, "Rapid computation of i-vector," in Proc. Odyssey, 2016.
- (2016) Proc. Odyssey
- Xu, L.¹ Lee, K. A.² Li, H.³ Yang, Z.⁴

16
- 79951609039
- Front-end factor analysis for speaker verification
- N. Dehak, P. Kenny, R. Dehak, P. Dumouchel, and P. Ouellet, "Front-end factor analysis for speaker verification," IEEE Transactions on Audio, Speech, and Language Processing, vol. 19, no. 4, pp. 788-798, 2011.
- (2011) IEEE Transactions on Audio, Speech, and Language Processing , vol.19 , Issue.4 , pp. 788-798
- Dehak, N.¹ Kenny, P.² Dehak, R.³ Dumouchel, P.⁴ Ouellet, P.⁵

17
- 84910071971
- A comparative study of spectral transformation techniques for singing voice synthesis
- S. W. Lee, Z. Wu, M. Dong, X. Tian, and H. Li, "A comparative study of spectral transformation techniques for singing voice synthesis," in Proc. Interspeech, 2014, pp. 2499-2503.
- (2014) Proc. Interspeech , pp. 2499-2503
- Lee, S. W.¹ Wu, Z.² Dong, M.³ Tian, X.⁴ Li, H.⁵

18
- 84869779548
- Modular global variance enhancement for voice conversion systems
- Aug
- H. Benisty, D. Malah, and K. Crammer, "Modular global variance enhancement for voice conversion systems," in Signal Processing Conference (EUSIPCO), 2012 Proceedings of the 20th European, Aug 2012, pp. 370-374.
- (2012) Signal Processing Conference (EUSIPCO), 2012 Proceedings of the 20th European , pp. 370-374
- Benisty, H.¹ Malah, D.² Crammer, K.³

19
- 84929415540
- Voice conversion using conditional restricted Boltzmann machine
- July
- F. Zhu, Z. Fan, and X. Wu, "Voice conversion using conditional restricted Boltzmann machine," in Signal and Information Processing (ChinaSIP), 2014 IEEE China Summit International Conference on, July 2014, pp. 110-114.
- (2014) Signal and Information Processing (ChinaSIP), 2014 IEEE China Summit International Conference on , pp. 110-114
- Zhu, F.¹ Fan, Z.² Wu, X.³

20
- 80053068819
- Voice conversion using support vector regression
- September
- P. Song, Y. Q. Bao, L. Zhao, and C. R. Zou, "Voice conversion using support vector regression," Electronics Letters, vol. 47, no. 18, pp. 1045-1046, September 2011.
- (2011) Electronics Letters , vol.47 , Issue.18 , pp. 1045-1046
- Song, P.¹ Bao, Y. Q.² Zhao, L.³ Zou, C. R.⁴

21
- 33646900967
- Voice conversion based on piecewise linear conversion rules of formant frequency and spectrum tilt
- Apr vol.1
- H. Mizuno and M. Abe, "Voice conversion based on piecewise linear conversion rules of formant frequency and spectrum tilt," in Acoustics, Speech, and Signal Processing, 1994. ICASSP-94., 1994 IEEE International Conference on, vol. i, Apr 1994, pp. I/469-I/472 vol.1.
- (1994) Acoustics, Speech, and Signal Processing, 1994. ICASSP-94., 1994 IEEE International Conference on , vol.i , pp. I/469-I/472
- Mizuno, H.¹ Abe, M.²

22
- 77953727123
- Voice conversion based on weighted frequency warping
- July
- D. Erro, A. Moreno, and A. Bonafonte, "Voice conversion based on weighted frequency warping," IEEE Transactions on Audio, Speech, and Language Processing, vol. 18, no. 5, pp. 922-931, July 2010.
- (2010) IEEE Transactions on Audio, Speech, and Language Processing , vol.18 , Issue.5 , pp. 922-931
- Erro, D.¹ Moreno, A.² Bonafonte, A.³

23
- 84857498745
- Voice conversion using dynamic frequency warping with amplitude scaling, for parallel or nonparallel corpora
- May
- E. Godoy, O. Rosec, and T. Chonavel, "Voice conversion using dynamic frequency warping with amplitude scaling, for parallel or nonparallel corpora," IEEE Transactions on Audio, Speech, and Language Processing, vol. 20, no. 4, pp. 1313-1323, May 2012.
- (2012) IEEE Transactions on Audio, Speech, and Language Processing , vol.20 , Issue.4 , pp. 1313-1323
- Godoy, E.¹ Rosec, O.² Chonavel, T.³

24
- 5444243681
- Speaker-specific pitch contour modeling and modification
- May vol.2
- D. T. Chappell and J. H. L. Hansen, "Speaker-specific pitch contour modeling and modification," in Acoustics, Speech and Signal Processing, 1998. Proceedings of the 1998 IEEE International Conference on, vol. 2, May 1998, pp. 885-888 vol.2.
- (1998) Acoustics, Speech and Signal Processing, 1998. Proceedings of the 1998 IEEE International Conference on , vol.2 , pp. 885-888
- Chappell, D. T.¹ Hansen, J. H. L.²

25
- 33947693233
- Z. Inanoglu, "Transforming pitch in a voice conversion framework," 2003.
- (2003) Transforming pitch in a voice conversion framework
- Inanoglu, Z.¹

26
- 85009212516
- Transforming F0 contours
- Geneva, Switzerland, September 1-4, 2003
- B. Gillett and S. King, "Transforming F0 contours," in 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003 - INTERSPEECH 2003, Geneva, Switzerland, September 1-4, 2003.
- (2003) 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003 - INTERSPEECH 2003
- Gillett, B.¹ King, S.²

27
- 85133206107
- H. Ming, D.-Y. Huang, L. Xie, S. Zhang, M. Dong, and H. Li, "Exemplar-based sparse representation of timbre and prosody for voice conversion," 2016.
- (2016) Exemplar-based sparse representation of timbre and prosody for voice conversion
- Ming, H.¹ Huang, D.-Y.² Xie, L.³ Zhang, S.⁴ Dong, M.⁵ Li, H.⁶

28
- 84959163883
- System fusion for high-performance voice conversion
- Dresden, Germany, September 6-10, 2015
- X. Tian, Z. Wu, S. W. Lee, N. Q. Hy, M. Dong, and E. Chng, "System fusion for high-performance voice conversion," in INTERSPEECH 2015, 16th Annual Conference of the International Speech Communication Association, Dresden, Germany, September 6-10, 2015, pp. 2759-2763.
- (2015) INTERSPEECH 2015, 16th Annual Conference of the International Speech Communication Association , pp. 2759-2763
- Tian, X.¹ Wu, Z.² Lee, S. W.³ Hy, N. Q.⁴ Dong, M.⁵ Chng, E.⁶

29
- 84946020861
- Sparse representation for frequency warping based voice conversion
- X. Tian, Z. Wu, S. W. Lee, N. Q. Hy, E. Chng, and M. Dong, "Sparse representation for frequency warping based voice conversion," in ICASSP. IEEE, 2015, pp. 4235-4239.
- (2015) ICASSP. IEEE , pp. 4235-4239
- Tian, X.¹ Wu, Z.² Lee, S. W.³ Hy, N. Q.⁴ Chng, E.⁵ Dong, M.⁶

30
- 85133193533
- Transformation of vocal characteristics: A review of literature
- D.-Y. Huang, E. P. Ong, S. Rahardja, M. Dong, and H. Li, "Transformation of vocal characteristics: A review of literature," International Journal of Electrical, Computer, Energetic, Electronic and Communication Engineering, vol. 3, no. 12, pp. 77-85, 2009.
- (2009) International Journal of Electrical, Computer, Energetic, Electronic and Communication Engineering , vol.3 , Issue.12 , pp. 77-85
- Huang, D.-Y.¹ Ong, E. P.² Rahardja, S.³ Dong, M.⁴ Li, H.⁵

31
- 85118466198
- An effective quality evaluation protocol for speech enhancement algorithms
- Sydney, Australia
- J. Hansen and B. Pellom, "An effective quality evaluation protocol for speech enhancement algorithms," in Proc. ICSLP, Sydney, Australia, 1998, pp. 81-84.
- (1998) Proc. ICSLP , pp. 81-84
- Hansen, J.¹ Pellom, B.²

32
- 84989489267
- Prediction of perceived phonetic distance from critical-band spectra: A first step
- D. Klatt, "Prediction of perceived phonetic distance from critical-band spectra: A first step," in Proc. IEEE ICASSP, 1982, pp. 1278-1281.
- (1982) Proc. IEEE ICASSP , pp. 1278-1281
- Klatt, D.¹

33
- 79951654574
- Ph.D. dissertation, École de Technologie Supérieure, Montreal
- N. Dehak, "Discriminative and generative approaches for long- and short-term speaker characteristics modeling: Application to speaker verification," Ph.D. dissertation, École de Technologie Supérieure, Montreal, 2009.
- (2009) Discriminative and generative approaches for long- and short-term speaker characteristics modeling: Application to speaker verification
- Dehak, N.¹

34
- 84865733857
- Analysis of i-vector length normalization in speaker recognition systems
- Florence, Italy, August 27-31, 2011
- D. Garcia-Romero and C. Y. Espy-Wilson, "Analysis of i-vector length normalization in speaker recognition systems," in INTERSPEECH 2011, 12th Annual Conference of the International Speech Communication Association, Florence, Italy, August 27-31, 2011, pp. 249-252.
- (2011) INTERSPEECH 2011, 12th Annual Conference of the International Speech Communication Association , pp. 249-252
- Garcia-Romero, D.¹ Espy-Wilson, C. Y.²

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.