Volume 56, Issue 9, 2007, Pages 1245-1254

Conversion function clustering and selection using linguistic and spectral information for emotional voice conversion

Author keywords

Emotional text to speech synthesis; Emotional voice conversion; Function clustering and selection; Gaussian mixture bigram model; Linguistic feature

Indexed keywords

DATABASE SYSTEMS; LINGUISTICS; SPEECH ANALYSIS; STATISTICAL TESTS;

EID: 34548216761     PISSN: 00189340     EISSN: None     Source Type: Journal    
DOI: 10.1109/TC.2007.1079     Document Type: Article
Times cited: 24

References (26)
  • 1
    • 0027447292
    • Towards the Simulation of Emotion in Synthetic Speech: A Review of the Literature on Human Vocal Emotion
    • I.R. Murray and J.L. Arnott, "Towards the Simulation of Emotion in Synthetic Speech: A Review of the Literature on Human Vocal Emotion," J. Acoustic Soc. Am., vol. 93, no. 2, pp. 1097-1108, 1993.
    • (1993) J. Acoustic Soc. Am , vol.93 , Issue.2 , pp. 1097-1108
    • Murray, I.R.1    Arnott, J.L.2
  • 3
    • 0037380318
    • A Corpus-Based Speech Synthesis System with Emotion
    • A. Iida, F. Higuchi, N. Campbell, and M. Yasumura, "A Corpus-Based Speech Synthesis System with Emotion," Speech Comm., vol. 40, nos. 1-2, pp. 161-187, 2003.
    • (2003) Speech Comm , vol.40 , Issue.1-2 , pp. 161-187
    • Iida, A.1    Higuchi, F.2    Campbell, N.3    Yasumura, M.4
  • 9
    • 33646779506
    • Spectral Conversion Based on Maximum Likelihood Estimation Considering Global Variance of Converted Parameter
    • T. Toda, A.W. Black, and K. Tokuda, "Spectral Conversion Based on Maximum Likelihood Estimation Considering Global Variance of Converted Parameter," Proc. Int'l Conf. Acoustics, Speech, and Signal Processing (ICASSP '05), vol. 1, pp. 9-12, Mar. 2005.
    • (2005) Proc. Int'l Conf. Acoustics, Speech, and Signal Processing (ICASSP '05) , vol.1 , pp. 9-12
    • Toda, T.1    Black, A.W.2    Tokuda, K.3
  • 12
    • 34047247202
    • Voice Conversion Using Duration-Embedded Bi-HMMs for Expressive Speech Synthesis
    • C.H. Wu, C.C. Hsia, T.H. Liu, and J.F. Wang, "Voice Conversion Using Duration-Embedded Bi-HMMs for Expressive Speech Synthesis," IEEE Trans. Audio, Speech, and Language Processing, vol. 14, no. 4, pp. 1109-1116, 2006.
    • (2006) IEEE Trans. Audio, Speech, and Language Processing , vol.14 , Issue.4 , pp. 1109-1116
    • Wu, C.H.1    Hsia, C.C.2    Liu, T.H.3    Wang, J.F.4
  • 13
    • 0036497598
    • Discriminative Training of Gaussian Mixture Bigram Models with Application to Chinese Dialect Identification
    • W.H. Tsai and W.W. Chang, "Discriminative Training of Gaussian Mixture Bigram Models with Application to Chinese Dialect Identification," Speech Comm., vol. 36, nos. 3-4, pp. 317-326, 2002.
    • (2002) Speech Comm , vol.36 , Issue.3-4 , pp. 317-326
    • Tsai, W.H.1    Chang, W.W.2
  • 14
    • 0001927585
    • On Information and Sufficiency
    • S. Kullback and R.A. Leibler, "On Information and Sufficiency," Annals of Math. Statistics, vol. 22, no. 1, pp. 79-86, Mar. 1951.
    • (1951) Annals of Math. Statistics , vol.22 , Issue.1 , pp. 79-86
    • Kullback, S.1    Leibler, R.A.2
  • 15
    • 0035478985
    • Automatic Generation of Synthesis Units and Prosodic Information for Chinese Concatenative Synthesis
    • C.H. Wu and J.H. Chen, "Automatic Generation of Synthesis Units and Prosodic Information for Chinese Concatenative Synthesis," Speech Comm., vol. 35, nos. 3-4, pp. 219-237, 2001.
    • (2001) Speech Comm , vol.35 , Issue.3-4 , pp. 219-237
    • Wu, C.H.1    Chen, J.H.2
  • 16
    • 0030677481
    • Speech Representation and Transformation Using Adaptive Interpolation of Weighted Spectrum: Vocoder Revisited
    • H. Kawahara, "Speech Representation and Transformation Using Adaptive Interpolation of Weighted Spectrum: Vocoder Revisited," Proc. Int'l Conf. Acoustics, Speech, and Signal Processing (ICASSP '97), vol. 2, pp. 1303-1306, 1997.
    • (1997) Proc. Int'l Conf. Acoustics, Speech, and Signal Processing (ICASSP '97) , vol.2 , pp. 1303-1306
    • Kawahara, H.1
  • 17
    • 0032673049
    • Restructuring Speech Representations Using a Pitch Adaptive Time-Frequency-Based F0 Extraction: Possible Role of a Repetitive Structure in Sounds
    • H. Kawahara, I. Masuda-Katsuse, and A. de Cheveigné, "Restructuring Speech Representations Using a Pitch Adaptive Time-Frequency-Based F0 Extraction: Possible Role of a Repetitive Structure in Sounds," Speech Comm., vol. 27, nos. 3-4, pp. 187-207, 1999.
    • (1999) Speech Comm , vol.27 , Issue.3-4 , pp. 187-207
    • Kawahara, H.1    Masuda-Katsuse, I.2    de Cheveigné, A.3
  • 18
    • 0002629270
    • Maximum Likelihood from Incomplete Data via the EM Algorithm
    • A.P. Dempster, N.M. Laird, and D.B. Rubin, "Maximum Likelihood from Incomplete Data via the EM Algorithm," J. Royal Statistical Soc. B, vol. 39, pp. 1-38, 1977.
    • (1977) J. Royal Statistical Soc. B , vol.39 , pp. 1-38
    • Dempster, A.P.1    Laird, N.M.2    Rubin, D.B.3
  • 20
    • 0032073761
    • An RNN-Based Prosodic Information Synthesis for Mandarin Text-to-Speech
    • S.H. Chen, S.H. Hwang, and Y.R. Wang, "An RNN-Based Prosodic Information Synthesis for Mandarin Text-to-Speech," IEEE Trans. Speech and Audio Processing, vol. 6, no. 3, pp. 226-239, 1998.
    • (1998) IEEE Trans. Speech and Audio Processing , vol.6 , Issue.3 , pp. 226-239
    • Chen, S.H.1    Hwang, S.H.2    Wang, Y.R.3
  • 21
    • 29144503308
    • Part-of-Speech (POS) Analysis on Chinese Language
    • Inst. of Information Science Academia Sinica
    • L.L. Chang et al., "Part-of-Speech (POS) Analysis on Chinese Language," technical report, Inst. of Information Science Academia Sinica, 1989.
    • (1989) technical report
    • Chang, L.L.1
  • 22
    • 2942538615
    • Recovery of False Rejection Using Statistical Partial Pattern Trees for Sentence Verification
    • C.H. Wu and Y.J. Chen, "Recovery of False Rejection Using Statistical Partial Pattern Trees for Sentence Verification," Speech Comm., vol. 43, pp. 71-88, 2004.
    • (2004) Speech Comm , vol.43 , pp. 71-88
    • Wu, C.H.1    Chen, Y.J.2
  • 23
    • 34047263010
    • Prosody Conversion from Neutral Speech to Emotional Speech
    • J. Tao, Y. Kang, and A. Li, "Prosody Conversion from Neutral Speech to Emotional Speech," IEEE Trans. Audio, Speech, and Language Processing, vol. 14, no. 4, pp. 1145-1154, 2006.
    • (2006) IEEE Trans. Audio, Speech, and Language Processing , vol.14 , Issue.4 , pp. 1145-1154
    • Tao, J.1    Kang, Y.2    Li, A.3
  • 24
    • 21844454654
    • The Determination, Analysis, and Synthesis of Fundamental Frequency
    • PhD dissertation, Northwestern Univ
    • X. Sun, "The Determination, Analysis, and Synthesis of Fundamental Frequency," PhD dissertation, Northwestern Univ., 2002.
    • (2002)
    • Sun, X.1
  • 25
    • 0000873069
    • A Method for the Solution of Certain Problems in Least Squares
    • K. Levenberg, "A Method for the Solution of Certain Problems in Least Squares," Quarterly Applied Math., vol. 2, pp. 164-168, 1944.
    • (1944) Quarterly Applied Math , vol.2 , pp. 164-168
    • Levenberg, K.1


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.