SCOPUS Information Search Platform

Speech Communication

Volume 60, 2014, Pages 30-43

A unit selection approach for voice transformation

(1) Lee, Ki Seung a

a Konkuk University (South Korea)

Author keywords

Hidden Markov model; Unit selection; Voice conversion

Indexed keywords

HIDDEN MARKOV MODELS; OPTIMIZATION;

CONVENTIONAL METHODS; LINEAR PREDICTION COEFFICIENTS; MAXIMUM LIKELIHOOD CRITERIA; STATISTICAL APPROACH; TRANSFORMATION METHODS; UNIT SELECTION; UNIT SELECTION APPROACH; VOICE CONVERSION;

SPEECH PROCESSING;

EID: 84896464538 PISSN: 01676393 EISSN: None Source Type: Journal
DOI: 10.1016/j.specom.2014.02.002 Document Type: Article

Times cited : (3)

References (45)

1
- 0023739214
- Voice conversion through vector quantization
- Abe, M.; Nakamura, S.; Shikano, K.; Kuwabara, H.; 1988. Voice conversion through vector quantization. In: Proc. ICASSP, pp. 565-568.
- (1988) Proc. ICASSP , pp. 565-568
- Abe, M.¹ Nakamura, S.² Shikano, K.³ Kuwabara, H.⁴

2
- 0033154052
- Speaker transformation algorithm using segmental codebooks (STASC)
- L.M. Arslan Speaker transformation algorithm using segmental codebooks (STASC) Speech Commun. 28 1999 211 226
- (1999) Speech Commun. , vol.28 , pp. 211-226
- Arslan, L.M.¹

3
- 0002425861
- The AT&T next-gen TTS system
- Beutnagel, M.; Conkie, A.; Schroeter, J.; Stylianou, Y.; Syrdal, A.; 1999. The AT&T next-gen TTS system. In: Proc. Joint Meeting of ASA, EAA, and DAGA.
- (1999) Proc. Joint Meeting of ASA, EAA, and DAGA
- Beutnagel, M.¹ Conkie, A.² Schroeter, J.³ Stylianou, Y.⁴ Syrdal, A.⁵

4
- 0031104132
- Application of Speech Conversion to Alaryngeal Speech Enhancement
- PII S1063667697018944
- N. Bi, and Y. Qi Application of speech conversion to alaryngeal speech enhancement IEEE Trans. Speech Audio Process. 5 2 1997 97 105 (Pubitemid 127746041)
- (1997) IEEE Transactions on Speech and Audio Processing , vol.5 , Issue.2 , pp. 97-105
- Bi, N.¹ Qi, Y.²

5
- 0028517647
- Statistical recovery of wideband speech from narrowband speech
- Y.M. Cheng, D. O'Shaughnessy, and P. Mermelstein Statistical recovery of wideband speech from narrowband speech IEEE Trans. Speech Audio Process. 2 4 1994 544 548
- (1994) IEEE Trans. Speech Audio Process. , vol.2 , Issue.4 , pp. 544-548
- Cheng, Y.M.¹ O'Shaughnessy, D.² Mermelstein, P.³

6
- 0028016265
- Measuring and modeling vocal source-tract interaction
- D.G. Childers, and C.F. Wong Measuring and modeling vocal source-tract interaction IEEE Trans. Biomed. Eng. 41 7 1994 663 671
- (1994) IEEE Trans. Biomed. Eng. , vol.41 , Issue.7 , pp. 663-671
- Childers, D.G.¹ Wong, C.F.²

7
- 0022203520
- Voice conversion: Factors responsible for quality
- Childers, D.G.; Yegnanarayana, B.; Wu, K.; 1985. Voice conversion: factors responsible for quality. In: Proc. ICASSP, pp. 748-751. (Pubitemid 16511455)
- (1985) ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings , pp. 748-751
- Childers, D.G.¹ Yegnanarayana, B.² Wu, K.³

8
- 0024940640
- Unsupervised speaker adaptation by probabilistic spectrum fitting
- Cox, S.J.; Bridle, J.S.; 1989. Unsupervised speaker adaptation by probabilistic spectrum fitting. In: Proc. ICASSP, pp. 294-297. (Pubitemid 20604113)
- (1989) ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings , vol.1 , pp. 294-297
- Cox, S.J.¹ Bridle, J.S.²

9
- 34547496196
- Towards a voice conversion system based on frame selection
- Dutoit, T.; Holzapfel, A.; Jottrand, M.; Moinet, A.; Perez, J.; Stylianou, Y.; 2007. Towards a voice conversion system based on frame selection. In: Proc. ICASSP, pp. 15-20.
- (2007) Proc. ICASSP , pp. 15-20
- Dutoit, T.¹ Holzapfel, A.² Jottrand, M.³ Moinet, A.⁴ Perez, J.⁵ Stylianou, Y.⁶

10
- 1542408811
- The interaction of formant frequency and pitch in the perception of voice category and jaw opening in female singers
- DOI 10.1016/j.jvoice.2003.08.001, PII S0892199703001243
- Erickson, M.L.; 2003. The interaction of formant frequency and pitch in the perception of voice category and jaw opening in female singers. In: The 31st Annual Symposium: Care of the Professional Voice, pp. 24-37. (Pubitemid 38333008)
- (2004) Journal of Voice , vol.18 , Issue.1 , pp. 24-37
- Erickson, M.L.¹

11
- 84872177757
- Parametric voice conversion based on bilinear frequency warping plus amplitude scaling
- D. Erro, E. Navas, and I. Hernáez Parametric voice conversion based on bilinear frequency warping plus amplitude scaling IEEE Trans. Audio Speech Lang. Process. 21 3 2013 556 566
- (2013) IEEE Trans. Audio Speech Lang. Process. , vol.21 , Issue.3 , pp. 556-566
- Erro, D.¹ Navas, E.² Hernáez, I.³

12
- 84856141218
- Voice conversion using dynamic kernel partial least squares regression
- E. Helander, H. Silén, and T. Virtanen Voice conversion using dynamic kernel partial least squares regression IEEE Trans. Audio Speech Lang. Process. 20 3 2012 806 817
- (2012) IEEE Trans. Audio Speech Lang. Process. , vol.20 , Issue.3 , pp. 806-817
- Helander, E.¹ Silén, H.² Virtanen, T.³

13
- 84867950508
- Personalized spectral and prosody conversion using frame-based codeword distribution and adaptive CRF
- Y.C. Huang, C.H. Wu, and Y.T. Chao Personalized spectral and prosody conversion using frame-based codeword distribution and adaptive CRF IEEE Trans. Audio Speech Lang. Process. 21 1 2013 51 52
- (2013) IEEE Trans. Audio Speech Lang. Process. , vol.21 , Issue.1 , pp. 51-52
- Huang, Y.C.¹ Wu, C.H.² Chao, Y.T.³

14
- 0029251946
- Speech spectrum conversion based on speaker interpolation and multi-functional representation with weighting by radial basis function networks
- N. Iwahashi, and Y. Sagisaka Speech spectrum conversion based on speaker interpolation and multi-functional representation with weighting by radial basis function networks Speech Commun. 16 2 1995 139 152
- (1995) Speech Commun. , vol.16 , Issue.2 , pp. 139-152
- Iwahashi, N.¹ Sagisaka, Y.²

15
- 43749111966
- Voice conversion using Viterbi algorithm based on Gaussian mixture model
- Jian, Z.H.; Zhen, Y.; 2007. Voice conversion using Viterbi algorithm based on Gaussian mixture model. In: Proc. Intelligent Signal Processing and Communication Systems, pp. 32-35.
- (2007) Proc. Intelligent Signal Processing and Communication Systems , pp. 32-35
- Jian, Z.H.¹ Zhen, Y.²

16
- 0031623661
- Spectral voice conversion for text-to-speech synthesis
- Kain, A.; Macon, M.W.; 1998. Spectral voice conversion for text-to-speech synthesis. In: Proc. ICASSP, Seattle, pp. 285-288.
- (1998) Proc. ICASSP, Seattle , pp. 285-288
- Kain, A.¹ Macon, M.W.²

17
- 0034841948
- Design and evaluation of a voice conversion algorithm based on spectral envelope mapping and residual prediction
- Kain, A.; Macon, M.W.; 2001. Design and evaluation of a voice conversion algorithm based on spectral envelope mapping and residual prediction. In: Proc. ICASSP, pp. 813-816. (Pubitemid 32839044)
- (2001) ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings , vol.2 , pp. 813-816
- Kain, A.¹ Macon, M.W.²

18
- 0032673049
- Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds
- H. Kawahara, I. Masuda-Katsuse, and A. de Cheveigne Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: possible role of a repetitive structure in sounds Speech Commun. 27 3-4 1999 187 207
- (1999) Speech Commun. , vol.27 , Issue.3-4 , pp. 187-207
- Kawahara, H.¹ Masuda-Katsuse, I.² De Cheveigne, A.³

19
- 85090475413
- The CMU ARCTIC speech databases
- Kominek, J.; Black, A.W.; 2004. The CMU ARCTIC speech databases. In: Proc. Fifth ISCA Speech Synthesis Workshop, pp. 223-224.
- (2004) Proc. Fifth ISCA Speech Synthesis Workshop , pp. 223-224
- Kominek, J.¹ Black, A.W.²

20
- 38149065136
- Statistical approach for voice personality transformation
- K.S. Lee Statistical approach for voice personality transformation IEEE Trans. Audio Speech Lang. Process. 15 2 2007 641 651
- (2007) IEEE Trans. Audio Speech Lang. Process. , vol.15 , Issue.2 , pp. 641-651
- Lee, K.S.¹

21
- 39749106069
- EMG-based speech recognition using hidden Markov models with global control variables
- K.S. Lee EMG-based speech recognition using hidden Markov models with global control variables IEEE Trans. Biomed. Eng. 55 3 2008 930 940
- (2008) IEEE Trans. Biomed. Eng. , vol.55 , Issue.3 , pp. 930-940
- Lee, K.S.¹

22
- 0030365550
- A new voice personality transformation based on both linear and nonlinear prediction analysis
- Lee, K.S.; Youn, D.H.; Cha, I.W.; 1996. A new voice personality transformation based on both linear and nonlinear prediction analysis. In: Proc. ICSLP, pp. 1401-1404.
- (1996) Proc. ICSLP , pp. 1401-1404
- Lee, K.S.¹ Youn, D.H.² Cha, I.W.³

23
- 0036670960
- Voice conversion using a low dimensional vector mapping
- K.S. Lee, D.H. Youn, and I.W. Cha Voice conversion using a low dimensional vector mapping IEICE Trans. Inf. Syst. E85-D 8 2002 1297 1305
- (2002) IEICE Trans. Inf. Syst. , vol.E85-D , Issue.8 , pp. 1297-1305
- Lee, K.S.¹ Youn, D.H.² Cha, I.W.³

24
- 0018918171
- An algorithm for vector quantizer design
- Y. Linde, A. Buzo, and R.M. Gray An algorithm for vector quantizer design IEEE Trans. Commun. 28 1980 84 95
- (1980) IEEE Trans. Commun. , vol.28 , pp. 84-95
- Linde, Y.¹ Buzo, A.² Gray, R.M.³

25
- 33847332065
- Voice conversion based on joint pitch and spectral transformation with component group-GMM
- Ma, J.; Liu, W.; 2005. Voice conversion based on joint pitch and spectral transformation with component group-GMM. In: Proc. IEEE NLP-KE, pp. 199-203.
- (2005) Proc. IEEE NLP-KE , pp. 199-203
- Ma, J.¹ Liu, W.²

26
- 11044234619
- Multi-stream HMM for EMG-based speech recognition
- Conference Proceedings - 26th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, EMBC 2004
- Manabe, H.; Zhang, Z.; 2004. Multi-stream HMM for EMG-based speech recognition. In: Proc. the 26th Annual International Conference of the IEEE EMBS, pp. 4389-4392. (Pubitemid 40043250)
- (2004) Annual International Conference of the IEEE Engineering in Medicine and Biology - Proceedings , vol.26 , pp. 4389-4392
- Manabe, H.¹ Zhang, Z.²

27
- 0029256372
- Voice conversion algorithm based on piecewise linear conversion rules of formant frequency and spectral tilt
- H. Mizuno, and M. Abe Voice conversion algorithm based on piecewise linear conversion rules of formant frequency and spectral tilt Speech Commun. 16 2 1995 153 164
- (1995) Speech Commun. , vol.16 , Issue.2 , pp. 153-164
- Mizuno, H.¹ Abe, M.²

28
- 0025543906
- Pitch synchronous waveform processing techniques for text-to-speech synthesis using diphones
- E. Moulines, and F. Charpentier Pitch synchronous waveform processing techniques for text-to-speech synthesis using diphones Speech Commun. 9 5/6 1990 453 467
- (1990) Speech Commun. , vol.9 , Issue.5-6 , pp. 453-467
- Moulines, E.¹ Charpentier, F.²

29
- 0029254176
- Transformation of formants of voice conversion using artificial neural networks
- M. Narendranath, H.A. Murthy, S. Rajendran, and B. Yegnanarayana Transformation of formants of voice conversion using artificial neural networks Speech Commun. 16 2 1995 207 216
- (1995) Speech Commun. , vol.16 , Issue.2 , pp. 207-216
- Narendranath, M.¹ Murthy, H.A.² Rajendran, S.³ Yegnanarayana, B.⁴

30
- 4544290191
- Recent advances in the automatic recognition of audiovisual speech
- G. Potamianos, C. Neti, G. Gravier, A. Garg, and A.W. Senior Recent advances in the automatic recognition of audiovisual speech Proc. IEEE 91 9 2003 1306 1326
- (2003) Proc. IEEE , vol.91 , Issue.9 , pp. 1306-1326
- Potamianos, G.¹ Neti, C.² Gravier, G.³ Garg, A.⁴ Senior, A.W.⁵

31
- 0004244302
- Fundamentals of Speech Recognition
- L.R. Rabiner, and B.H. Juang Fundamentals of Speech Recognition 1993 Englewood Cliffs
- (1993) Fundamentals of Speech Recognition
- Rabiner, L.R.¹ Juang, B.H.²

32
- 0003425258
- Digital Processing of Speech Signals
- Rabiner, L.R.; Schafer, R.W.; 1978. Digital Processing of Speech Signals, Englewood Cliffs.
- (1978) Digital Processing of Speech Signals
- Rabiner, L.R.¹ Schafer, R.W.²

33
- 77950029338
- Voice conversion by mapping the speaker-specific features using pitch synchronous approach
- K.S. Rao Voice conversion by mapping the speaker-specific features using pitch synchronous approach Comput. Speech Lang. 24 2010 474 494
- (2010) Comput. Speech Lang. , vol.24 , pp. 474-494
- Rao, K.S.¹

34
- 0029209272
- Robust text-independent speaker identification using Gaussian mixture speaker models
- D.A. Reynolds, and R.C. Rose Robust text-independent speaker identification using Gaussian mixture speaker models IEEE Trans. Speech Audio Process. 3 1 1995 72 83
- (1995) IEEE Trans. Speech Audio Process. , vol.3 , Issue.1 , pp. 72-83
- Reynolds, D.A.¹ Rose, R.C.²

35
- 84859768504
- Statistical voice conversion based on noisy channel model
- D. Saito, S. Watanabe, A. Nakamura, and N. Minematsu Statistical voice conversion based on noisy channel model IEEE Trans. Audio Speech Lang. Process. 20 6 2012 1784 1794
- (2012) IEEE Trans. Audio Speech Lang. Process. , vol.20 , Issue.6 , pp. 1784-1794
- Saito, D.¹ Watanabe, S.² Nakamura, A.³ Minematsu, N.⁴

36
- 0001503040
- Voice personality transformation
- M. Savic, and I.H. Nam Voice personality transformation Digital Signal Process. 4 1991 107 110
- (1991) Digital Signal Process. , vol.4 , pp. 107-110
- Savic, M.¹ Nam, I.H.²

37
- 51449112440
- Voice conversion by combining frequency warping with unit selection
- Shuang, Z.; Meng, F.; Qin, Y.; 2008. Voice conversion by combining frequency warping with unit selection. In: Proc. ICASSP, pp. 4661-4664.
- (2008) Proc. ICASSP , pp. 4661-4664
- Shuang, Z.¹ Meng, F.² Qin, Y.³

38
- 0032026483
- Continuous probabilistic transform for voice conversion
- PII S1063667698017386
- Y. Stylianou, O. Cappe, and E. Moulines Continuous probabilistic transform for voice conversion IEEE Trans. Speech Audio Process. 6 2 1998 131 142 (Pubitemid 128720639)
- (1998) IEEE Transactions on Speech and Audio Processing , vol.6 , Issue.2 , pp. 131-142
- Stylianou, Y.¹ Cappe, O.² Moulines, E.³

39
- 0027128576
- Lipreading and audio-visual speech perception
- A.Q. Summerfield Lipreading and audio-visual speech perception Philos. Trans. R. Soc. Lond. B 335 1992 71 78
- (1992) Philos. Trans. R. Soc. Lond. B , vol.335 , pp. 71-78
- Summerfield, A.Q.¹

40
- 33846195493
- Residual prediction based on unit selection
- Sundermann, D.; Hoge, H.; Bonafonte, A.; Ney, H.; Black, A.W.; 2005. Residual prediction based on unit selection. In: Proc. IEEE Workshop on Automatic Speech Recognition and Understanding, pp. 369-374.
- (2005) Proc. IEEE Workshop on Automatic Speech Recognition and Understanding , pp. 369-374
- Sundermann, D.¹ Hoge, H.² Bonafonte, A.³ Ney, H.⁴ Black, A.W.⁵

41
- 33947623206
- Text-independent voice conversion based on unit selection
- Sundermann, D.; Hoge, H.; Bonafonte, A.; Ney, H.; Black, A.W.; Narayanan, S.; 2006. Text-independent voice conversion based on unit selection. In: Proc. ICASSP, pp. 14-19.
- (2006) Proc. ICASSP , pp. 14-19
- Sundermann, D.¹ Hoge, H.² Bonafonte, A.³ Ney, H.⁴ Black, A.W.⁵ Narayanan, S.⁶

42
- 57749193836
- Voice conversion based on maximum likelihood estimation of spectral parameter trajectory
- T. Toda, A.W. Black, and K. Tokuda Voice conversion based on maximum likelihood estimation of spectral parameter trajectory IEEE Trans. Audio Speech Lang. Process. 15 8 2007 2222 2235
- (2007) IEEE Trans. Audio Speech Lang. Process. , vol.15 , Issue.8 , pp. 2222-2235
- Toda, T.¹ Black, A.W.² Tokuda, K.³

43
- 0026880275
- Voice transformation using PSOLA technique
- H. Valbret, E. Moulines, and J.P. Tubach Voice transformation using PSOLA technique Speech Commun. 11 1992 175 187 (Pubitemid 23572497)
- (1992) Speech Communication , vol.11 , Issue.2-3 , pp. 175-187
- Valbret, H.¹ Moulines, E.² Tubach, J.P.³

44
- 4143120860
- Speech recognition experiments with linear prediction, bandpass filtering, and dynamic programming
- G.M. White, and R.B. Neely Speech recognition experiments with linear prediction, bandpass filtering, and dynamic programming IEEE Trans. Acoust. Speech Signal Process. 24 2 1976 183 188
- (1976) IEEE Trans. Acoust. Speech Signal Process. , vol.24 , Issue.2 , pp. 183-188
- White, G.M.¹ Neely, R.B.²

45
- 34047254509
- Quality-enhanced voice morphing using maximum likelihood transformations
- DOI 10.1109/TSA.2005.860839
- H. Ye, and S. Young Quality-enhanced voice morphing using maximum likelihood transformations IEEE Trans. Audio Speech Lang. Process. 14 4 2006 1301 1312 (Pubitemid 46547625)
- (2006) IEEE Transactions on Audio, Speech and Language Processing , vol.14 , Issue.4 , pp. 1301-1312
- Ye, H.¹ Young, S.²

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.