SCOPUS Information Search Platform

Tien Tzu Hsueh Pao/Acta Electronica Sinica

Volume 32, Issue 7, 2004, Pages 1165-1172

Voice conversion technology and its development

Authors (3): Zuo, Guo Yu (a, b); Liu, Wen Ju (a); Ruan, Xiao Gang (b)

b BEIJING UNIVERSITY OF TECHNOLOGY (China)

Author keywords

Artificial neural network; Codebook mapping; Gaussian mixture model; Glottal excitation; Hidden Markov model; Pitch contour; Speech spectrum; Voice conversion

Indexed keywords

ALGORITHMS; MARKOV PROCESSES; MATHEMATICAL MODELS; NEURAL NETWORKS; SPEECH CODING;

CODEBOOK MAPPING; GAUSSIAN MIXTURE MODELS; GLOTTAL EXCITATION; HIDDEN MARKOV MODELS; PITCH CONTOUR; SPEECH SPECTRUM; VOICE CONVERSION;

SPEECH RECOGNITION;

EID: 5444260705 PISSN: 03722112 EISSN: None Source Type: Journal
DOI: None Document Type: Article

Times cited : (7)

References (56)

1
- 58149209073
- Voice conversion: State of the art and perspectives
- E Moulines and Y Sagisaka. Voice conversion: state of the art and perspectives [J]. Speech Communication. 1995, 16(2): 125-126.
- (1995) Speech Communication , vol.16 , Issue.2 , pp. 125-126
- Moulines, E.¹ Sagisaka, Y.²

2
- 0034842740
- Adaptation of pitch and spectrum for HMM-based speech synthesis using MLLR
- Salt Lake City, USA: IEEE
- M Tamura, et al. Adaptation of pitch and spectrum for HMM-based speech synthesis using MLLR [A]. Proc ICASSP [C]. Salt Lake City, USA: IEEE, May 2001. 805-808.
- (2001) Proc ICASSP , pp. 805-808
- Tamura, M.¹

3
- 84905560807
- Voice conversion with smoothed GMM and MAP adaptation
- Geneva, Switzerland: ISCA
- Y Chen, M Chu, et al. Voice Conversion with Smoothed GMM and MAP Adaptation [A]. Proc Eurospeech [C]. Geneva, Switzerland: ISCA, Sept. 2003: 2413-2416.
- (2003) Proc Eurospeech , pp. 2413-2416
- Chen, Y.¹ Chu, M.²

4
- 1842369784
- Glottal waveform parameters for different speakers types
- Edinburgh, Scotland: IOA
- I Karlsson. Glottal waveform parameters for different speakers types [A]. Proc Speech'88, 7th FASE Symp [C]. Edinburgh, Scotland: IOA, 1988. 225-231.
- (1988) Proc Speech'88, 7th FASE Symp , pp. 225-231
- Karlsson, I.¹

5
- 0016939145
- Automatic recognition of speakers from their voices
- B S Atal. Automatic recognition of speakers from their voices [J]. Proceedings of the IEEE, April 1976, 64(4): 460-475.
- (1976) Proceedings of the IEEE , vol.64 , Issue.4 , pp. 460-475
- Atal, B.S.¹

6
- 0015112070
- Speech analysis and synthesis by linear prediction of the speech wave
- B S Atal, S L Hanauer. Speech analysis and synthesis by linear prediction of the speech wave [J]. J Acoust Soc Am, 1971, 50(2): 637-655.
- (1971) J Acoust Soc Am , vol.50 , Issue.2 , pp. 637-655
- Atal, B.S.¹ Hanauer, S.L.²

7
- 0020166649
- System to independently modify excitation and/or spectrum of speech waveform without explicit pitch extraction
- S Seneff. System to independently modify excitation and/or spectrum of speech waveform without explicit pitch extraction [J]. IEEE Trans Acoust Speech Sig, 30(4), 1982: 566-578.
- (1982) IEEE Trans Acoust Speech Sig , vol.30 , Issue.4 , pp. 566-578
- Seneff, S.¹

8
- 0023246352
- Factors in voice quality: Acoustic features related to gender
- New York, USA: IEEE
- D G Childers, et al. Factors in voice quality: Acoustic features related to gender [A]. Proc ICASSP [C]. New York, USA: IEEE, 1987. 293-296.
- (1987) Proc ICASSP , pp. 293-296
- Childers, D.G.¹

9
- 0024680919
- Voice conversion
- D G Childers, et al. Voice Conversion [J]. Speech Communication, 1989, 8(2): 147-158.
- (1989) Speech Communication , vol.8 , Issue.2 , pp. 147-158
- Childers, D.G.¹

10
- 0023739214
- Voice conversion through vector quantization
- New York, USA: IEEE
- M Abe, et al. Voice Conversion through Vector Quantization [A]. Proc ICASSP [C]. New York, USA: IEEE, 1988(1). 655-658.
- (1988) Proc ICASSP , Issue.1 , pp. 655-658
- Abe, M.¹

11
- 0029251946
- Speech spectrum conversion based on speaker interpolation and multi-functional representation with weighting by radial basis function networks
- N Iwahashi, et al. Speech spectrum conversion based on speaker interpolation and multi-functional representation with weighting by radial basis function networks [J]. Speech Communication. 1995, 16(2): 139-151.
- (1995) Speech Communication , vol.16 , Issue.2 , pp. 139-151
- Iwahashi, N.¹

12
- 0030359624
- Voice conversion based on topological feature maps and time-variant filtering
- Philadelphia, USA: ESCA
- A Rinscheid. Voice conversion based on topological feature maps and time-variant filtering [A]. Proc ICSLP [C]. Philadelphia, USA: ESCA, Oct. 1996.1445-1448.
- (1996) Proc ICSLP , pp. 1445-1448
- Rinscheid, A.¹

13
- 0026880275
- Voice transformation using PSOLA technique
- H Valbret, et al. Voice transformation using PSOLA technique [J]. Speech Communication, 1992,11(2-3): 175-187.
- (1992) Speech Communication , vol.11 , Issue.2-3 , pp. 175-187
- Valbret, H.¹

14
- 0029254176
- Transformation of formants for voice conversion using artificial neural networks
- M Narendranath, et al. Transformation of formants for voice conversion using artificial neural networks [J]. Speech Communication, 1995, 16(2): 207-216.
- (1995) Speech Communication , vol.16 , Issue.2 , pp. 207-216
- Narendranath, M.¹

15
- 85009266993
- Transformation of spectral envelope for voice conversion based on radial basis function networks
- Denver, USA: ISCA
- T Watanabe, et al. Transformation of spectral envelope for voice conversion based on radial basis function networks [A]. Proc ICSLP'2002 [C]. Denver, USA: ISCA, Sept. 2002.285-288.
- (2002) Proc ICSLP'2002 , pp. 285-288
- Watanabe, T.¹

16
- 0032026483
- Continuous probabilistic transform for voice conversion
- Y Stylianou, et al. Continuous probabilistic transform for voice conversion [J]. IEEE Transactions on Speech and Audio Processing. March 1998, 6(2): 131-142.
- (1998) IEEE Transactions on Speech and Audio Processing , vol.6 , Issue.2 , pp. 131-142
- Stylianou, Y.¹

17
- 0031623661
- Spectral voice conversion for text-to-speech synthesis
- Seattle, USA: IEEE
- A Kain, M Macon. Spectral voice conversion for text-to-speech synthesis [A]. Proc ICASSP [C]. Seattle, USA: IEEE, May 1998 (1). 285-288.
- (1998) Proc ICASSP , Issue.1 , pp. 285-288
- Kain, A.¹ Macon, M.²

18
- 85009069262
- STRAIGHT-based voice conversion algorithm based on Gaussian mixture model
- Beijing, China: ESCA
- T Toda, et al. STRAIGHT-based voice conversion algorithm based on Gaussian mixture model [A]. Proc ICSLP [C]. Beijing, China: ESCA, Oct. 2000.279-282.
- (2000) Proc ICSLP , pp. 279-282
- Toda, T.¹

19
- 85135141647
- Hidden markov model based voice conversion using dynamic characteristics of speaker
- Rhodes, Greece: ESCA
- E K Kim, et al. Hidden markov model based voice conversion using dynamic characteristics of speaker [A]. Proc Eurospeech [C]. Rhodes, Greece: ESCA, 1997.2519-2522.
- (1997) Proc Eurospeech , pp. 2519-2522
- Kim, E.K.¹

20
- 5444231240
- Voice conversion based on acoustic feature transformation
- Shenzhen, China
- W Zhang, et al. Voice conversion based on acoustic feature transformation [A]. Proc NCMMSC [C]. Shenzhen, China, 2001.189-192.
- (2001) Proc NCMMSC , pp. 189-192
- Zhang, W.¹

21
- 0026372714
- Experiments with voice modeling in speech synthesis
- R Carlson, et al. Experiments with voice modeling in speech synthesis [J]. Speech Communication. 1991, 10(5-6): 481-489.
- (1991) Speech Communication , vol.10 , Issue.5-6 , pp. 481-489
- Carlson, R.¹

22
- 0026369941
- A segment-based approach to voice conversion
- Toronto, Canada: IEEE
- M Abe. A segment-based approach to voice conversion [A]. Proc ICASSP [C]. Toronto, Canada: IEEE, May 1991.765-768.
- (1991) Proc ICASSP , pp. 765-768
- Abe, M.¹

23
- 0029764985
- Speaker recognizability testing for voice coders
- Atlanta, USA: IEEE
- A Schmidt-Nielsen, D P Brock. Speaker recognizability testing for voice coders [A]. Proc ICASSP [C]. Atlanta, USA: IEEE, May 1996.1149-1152.
- (1996) Proc ICASSP , pp. 1149-1152
- Schmidt-Nielsen, A.¹ Brock, D.P.²

24
- 0029256373
- Acoustic characteristics of speaker individuality: Control and conversion
- H Kuwabara and Y Sagisaka. Acoustic characteristics of speaker individuality: control and conversion [J]. Speech Communication. 1995, 16(2): 165-173.
- (1995) Speech Communication , vol.16 , Issue.2 , pp. 165-173
- Kuwabara, H.¹ Sagisaka, Y.²

25
- 0025321354
- Analysis, synthesis, and perception of voice quality variations among female and male talkers
- D Klatt and L C Klatt. Analysis, synthesis, and perception of voice quality variations among female and male talkers [J]. J Acoust Soc Am, 1990, 87(2): 820-857.
- (1990) J Acoust Soc Am , vol.87 , Issue.2 , pp. 820-857
- Klatt, D.¹ Klatt, L.C.²

26
- 0027409390
- Voice source model for continuous control of pitch period
- P H Milenkovic. Voice source model for continuous control of pitch period [J]. J Acoust Soc Am, 1993, 93(2): 1087-1096.
- (1993) J Acoust Soc Am , vol.93 , Issue.2 , pp. 1087-1096
- Milenkovic, P.H.¹

27
- 0015677419
- Multidimensional representation of personal quality of vowels and its acoustical correlates
- H Matsumoto, et al. Multidimensional representation of personal quality of vowels and its acoustical correlates [J]. IEEE Trans Audio and Electroacoustics, 1973, 21(5): 428-436.
- (1973) IEEE Trans Audio and Electroacoustics , vol.21 , Issue.5 , pp. 428-436
- Matsumoto, H.¹

28
- 0001174086
- Research on individuality features in speech waves and automatic speaker recognition techniques
- S Furui. Research on individuality features in speech waves and automatic speaker recognition techniques [J]. Speech Communication, 1986, 5(2): 183-197.
- (1986) Speech Communication , vol.5 , Issue.2 , pp. 183-197
- Furui, S.¹

29
- 0030365550
- A new voice transformation based on both linear and nonlinear prediction
- Philadelphia, USA: ESCA
- K S Lee, et al. A new voice transformation based on both linear and nonlinear prediction [A]. Proc ICSLP [C]. Philadelphia, USA: ESCA, 1996.1401-1404.
- (1996) Proc ICSLP , pp. 1401-1404
- Lee, K.S.¹

30
- 0033154052
- Speaker transformation algorithm using segmental code-books (STASC)
- L M Arslan. Speaker transformation algorithm using segmental code-books (STASC) [J]. Speech Communication, 1999, 28(3): 211-226.
- (1999) Speech Communication , vol.28 , Issue.3 , pp. 211-226
- Arslan, L.M.¹

31
- 0029256372
- Voice conversion algorithm based on piecewise linear conversion rules of formant frequency and spectrum tilt
- H Mizuno and M Abe. Voice conversion algorithm based on piecewise linear conversion rules of formant frequency and spectrum tilt [J]. Speech Communication. 1995, 16(2): 165-173.
- (1995) Speech Communication , vol.16 , Issue.2 , pp. 165-173
- Mizuno, H.¹ Abe, M.²

32
- 85009133605
- A new multi-speaker formant synthesizer that applies voice conversion techniques
- Aalborg, Denmark: ESCA
- J Gutiérrez, et al. A new multi-speaker formant synthesizer that applies voice conversion techniques [A]. Proc Eurospeech [C]. Aalborg, Denmark: ESCA, 2001:357-360.
- (2001) Proc Eurospeech , pp. 357-360
- Gutiérrez, J.¹

33
- 85135145847
- Speaker interpolation in HMM-based speech synthesis system
- Rhodes, Greece: ESCA
- T Yoshimura, et al. Speaker interpolation in HMM-based speech synthesis system [A]. Proc. Eurospeech [C]. Rhodes, Greece: ESCA, 1997.2523-2526.
- (1997) Proc. Eurospeech , pp. 2523-2526
- Yoshimura, T.¹

34
- 0029253818
- Glottal source modeling for voice conversion
- D G Childers. Glottal source modeling for voice conversion [J]. Speech Communication. 1995, 16(2): 127-138.
- (1995) Speech Communication , vol.16 , Issue.2 , pp. 127-138
- Childers, D.G.¹

35
- 0034841948
- Design and evaluation of a voice conversion algorithm based on spectral envelope mapping and residual prediction
- Salt Lake City, USA: IEEE
- A Kain, M Macon. Design and evaluation of a voice conversion algorithm based on spectral envelope mapping and residual prediction [A]. Proc ICASSP [C]. Salt Lake City, USA: IEEE, June 2001.813-816.
- (2001) Proc ICASSP , pp. 813-816
- Kain, A.¹ Macon, M.²

36
- 4444285698
- High resolution voice transformation
- Portland, USA: Oregon Health and Science University
- A Kain. High resolution voice transformation [D]. Portland, USA: Oregon Health and Science University, Oct.2001.
- (2001) Portland
- Kain, A.¹

37
- 0344557332
- Voice conversion by codebook mapping of line spectral frequencies and excitation spectrum
- Rhodes, Greece: ESCA
- L M Arslan, D Talkin. Voice Conversion by Codebook Mapping of Line Spectral Frequencies and Excitation Spectrum [A]. Proc Eurospeech [C]. Rhodes, Greece: ESCA, 1997(3).481-489.
- (1997) Proc Eurospeech , Issue.3 , pp. 481-489
- Arslan, L.M.¹ Talkin, D.²

38
- 0022906142
- Speaker adaptation through vector quantization
- Tokyo, Japan: IEEE
- K Shikano, et al. Speaker adaptation through vector quantization [A]. Proc ICASSP [C]. Tokyo, Japan: IEEE, 1986.2643-2646.
- (1986) Proc ICASSP , pp. 2643-2646
- Shikano, K.¹

39
- 5444268579
- Spectrogram normalization using fuzzy vector quantization
- S Nakamura, K Shikano. Spectrogram normalization using fuzzy vector quantization [J]. J Acoust Soc Japan, 1989, 45(2): 107-114.
- (1989) J Acoust Soc Japan , vol.45 , Issue.2 , pp. 107-114
- Nakamura, S.¹ Shikano, K.²

40
- 0001503040
- Voice personality transformation
- M I Savic, I H Nam. Voice personality transformation [J]. Digital Signal Processing Journal. 1991, 1(2): 107-110.
- (1991) Digital Signal Processing Journal , vol.1 , Issue.2 , pp. 107-110
- Savic, M.I.¹ Nam, I.H.²

41
- 85010456787
- Spectral mapping for voice conversion using speaker selection and vector field smoothing
- Madrid, Spain: ESCA
- M Hashimoto, N Higuchi. Spectral mapping for voice conversion using speaker selection and vector field smoothing [A]. Proc Eurospeech [C]. Madrid, Spain: ESCA, 1995: 431-434.
- (1995) Proc Eurospeech , pp. 431-434
- Hashimoto, M.¹ Higuchi, N.²

42
- 0003795363
- Local models and Gaussian mixture models for statistical data processing
- Portland, USA: Oregon Graduate Institute of Science and Technology
- N Kambhatla. Local Models and Gaussian Mixture Models for Statistical Data Processing [D]. Portland, USA: Oregon Graduate Institute of Science and Technology, 1996.
- (1996)
- Kambhatla, N.¹

43
- 0029288633
- Maximum likelihood linear regression for speaker adaptation of continuous density hidden Markov models
- C J Leggetter, P C Woodland. Maximum likelihood linear regression for speaker adaptation of continuous density hidden Markov models [J]. Computer Speech and Language, 1995, 9(2): 171-185.
- (1995) Computer Speech and Language , vol.9 , Issue.2 , pp. 171-185
- Leggetter, C.J.¹ Woodland, P.C.²

44
- 0026142334
- A study on speaker adaptation of the parameters of continuous density hidden markov models
- Lee Chin-Hui, et al. A study on speaker adaptation of the parameters of continuous density hidden markov models [J]. IEEE Trans on Signal Processing, 1991, 39(4): 806-814.
- (1991) IEEE Trans on Signal Processing , vol.39 , Issue.4 , pp. 806-814
- Lee, C.-H.¹

45
- 85135109228
- Speaker adaptation based on transfer vector field smoothing with continuous mixture density HMMs
- Banff, Canada: ESCA
- K Ohkura, et al. Speaker adaptation based on transfer vector field smoothing with continuous mixture density HMMs [A]. Proc ICSLP [C]. Banff, Canada: ESCA, Oct. 1992.369-372.
- (1992) Proc ICSLP , pp. 369-372
- Ohkura, K.¹

46
- 5444247189
- Voice personality transformation using an orthogonal vector space conversion
- Madrid, Spain: ESCA
- K S Lee, et al. Voice personality transformation using an orthogonal vector space conversion [A]. Proc Eurospeech [C]. Madrid, Spain: ESCA, 1995 (1). 427-430.
- (1995) Proc Eurospeech , Issue.1 , pp. 427-430
- Lee, K.S.¹

47
- 85135175982
- Statistical methods for voice quality transformation
- Madrid, Spain: ESCA
- Y Stylianou, et al. Statistical methods for voice quality transformation [A]. Proc Eurospeech [C]. Madrid, Spain: ESCA, 1995.447-450.
- (1995) Proc Eurospeech , pp. 447-450
- Stylianou, Y.¹

48
- 5444259197
- On the construction of a pitch conversion system
- Toulouse, France: EUSIP
- T Ceyssens, et al. On the Construction of a Pitch Conversion System [A]. Proc EUSIPCO [C]. Toulouse, France: EUSIP, 2002. 1301-1304.
- (2002) Proc EUSIPCO , pp. 1301-1304
- Ceyssens, T.¹

49
- 84866347413
- Transforming voice quality
- Geneva, Switzerland: ISCA
- B Gillett, S King. Transforming voice quality [A]. Proc Eurospeech [C]. Geneva, Switzerland: ISCA, 2003.1713-1716.
- (2003) Proc Eurospeech , pp. 1713-1716
- Gillett, B.¹ King, S.²

50
- 5444243681
- Speaker-specific pitch contour modeling and modification
- Seattle, USA: IEEE
- D Chappell, J Hansen. Speaker-specific pitch contour modeling and modification [A]. Proc ICASSP [C]. Seattle, USA: IEEE, May 1998. 885-888.
- (1998) Proc ICASSP , pp. 885-888
- Chappell, D.¹ Hansen, J.²

51
- 5444225334
- New methods for voice conversion
- Istanbul, Turkey: Boğaziçi University
- O Turk. New Methods for Voice Conversion [D]. Istanbul, Turkey: Boğaziçi University, 2003.
- (2003)
- Turk, O.¹

52
- 0033693289
- Stochastic modeling of spectral adjustment for high quality pitch modification
- Istanbul, Turkey: IEEE
- A Kain, Y Stylianou. Stochastic modeling of spectral adjustment for high quality pitch modification [A]. Proc ICASSP [C]. Istanbul, Turkey: IEEE, June 2000. 949-952.
- (2000) Proc ICASSP , pp. 949-952
- Kain, A.¹ Stylianou, Y.²

53
- 0031643805
- Speaker transformation using sentence HMM based alignments and detailed prosody modification
- Seattle, USA: IEEE
- L M Arslan, et al. Speaker transformation using sentence HMM based alignments and detailed prosody modification [A]. Proc ICASSP [C]. Seattle, USA: IEEE, May 1998.289-292.
- (1998) Proc ICASSP , pp. 289-292
- Arslan, L.M.¹

54
- 0023407575
- Review of text-to-speech conversion for English
- D Klatt. Review of text-to-speech conversion for English [J]. J Acoust Soc Am. 1987, 82(3): 737-793.
- (1987) J Acoust Soc Am. , vol.82 , Issue.3 , pp. 737-793
- Klatt, D.¹

55
- 5444223461
- A global framework for the assessment of synthetic speech without subjects
- Berlin, Germany: ESCA
- A Mariniak. A global framework for the assessment of synthetic speech without subjects [A]. Proc Eurospeech [C]. Berlin, Germany: ESCA, 1993(3): 1683-1686.
- (1993) Proc Eurospeech , Issue.3 , pp. 1683-1686
- Mariniak, A.¹

56
- 84971539709
- Emotional speech synthesis - A review
- Aalborg, Denmark: ISCA
- M Schröder. Emotional speech synthesis - A review [A]. Proc Eurospeech [C]. Aalborg, Denmark: ISCA, 2001(1): 561-564.
- (2001) Proc Eurospeech , Issue.1 , pp. 561-564
- Schröder, M.¹

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.