SCOPUS Information Retrieval Platform

IEEE Transactions on Audio, Speech and Language Processing

Volume 18, Issue 5, 2010, Pages 965-973

Evaluation of expressive speech synthesis with voice conversion and copy resynthesis techniques

(2) Türk, Oytun a Schröder, Marc b

a Sensory Inc (United States)

b GERMAN RESEARCH CENTER FOR ARTIFICIAL INTELLIGENCE DFKI (Germany)

Author keywords

Expressive speech synthesis; Prosody; Voice conversion; Voice quality transformation

Indexed keywords

APPROXIMATE MODEL; COMBINED MODELING; EXPRESSIVE SPEECH; EXPRESSIVE SPEECH SYNTHESIS; FACTORIAL DESIGN; LISTENING TESTS; OPEN SOURCES; RELATIVE CONTRIBUTION; SIGNAL MANIPULATION; SYNTHETIC SPEECH; TEXT TO SPEECH; TRANSFORMATION ALGORITHM; UNIT SELECTION; VOCAL-TRACTS; VOICE CONVERSION; VOICE QUALITY; WAVELET TRANSFORMS; SPEECH SYNTHESIS

EID: 77953699443 PISSN: 15587916 EISSN: None Source Type: Journal
DOI: 10.1109/TASL.2010.2041113 Document Type: Article

Times cited : (61)

References (47)

1
- 52149119160
- IDEAS4Games: Building expressive virtual characters for computer games
- Tokyo, Japan
- P. Gebhard, M. Schröder, M. Charfuelan, C. Endres, M. Kipp, S. Pammi, M. Rumpler, and O. Türk, "IDEAS4Games: Building expressive virtual characters for computer games," in Proc. IVA 2008, Tokyo, Japan, pp. 426-440.
- Proc. IVA 2008 , pp. 426-440
- Gebhard, P.¹ Schröder, M.² Charfuelan, M.³ Endres, C.⁴ Kipp, M.⁵ Pammi, S.⁶ Rumpler, M.⁷ Türk, O.⁸

2
- 1142294500
- Limited domain synthesis of expressive military speech for animated characters
- Denver, CO
- W. L. Johnson, S. S. Narayanan, R. Whitney, R. Das, M. L. Bulut, and C. LaBore, "Limited domain synthesis of expressive military speech for animated characters," in Proc. ICSLP 2002, Denver, CO.
- Proc. ICSLP 2002
- Johnson, W.L.¹ Narayanan, S.S.² Whitney, R.³ Das, R.⁴ Bulut, M.L.⁵ LaBore, C.⁶

3
- 84966398940
- Optimising selection of units from speech databases for concatenative synthesis
- Madrid, Spain
- A. W. Black and N. Campbell, "Optimising selection of units from speech databases for concatenative synthesis," in Proc. Eurospeech, Madrid, Spain, 1995, pp. 581-584.
- (1995) Proc. Eurospeech , pp. 581-584
- Black, A.W.¹ Campbell, N.²

4
- 85006631929
- Unit selection and emotional speech
- A. W. Black, "Unit selection and emotional speech," in Proc. Eurospeech, Geneva, Switzerland, 2003.
- (2003) Proc. Eurospeech, Geneva, Switzerland
- Black, A.W.¹

5
- 0142153901
- Speech database design for a concatenative text-to-speech synthesis system for individuals with communication disorders
- A. Iida and N. Campbell, "Speech database design for a concatenative text-to-speech synthesis system for individuals with communication disorders," Int. J. Speech Technol., vol.6, pp. 379-392, 2003.
- (2003) Int. J. Speech Technol. , vol.6 , pp. 379-392
- Iida, A.¹ Campbell, N.²

6
- 85009139544
- Simultaneous modeling of spectrum, pitch and duration in HMM-based speech synthesis
- Budapest, Hungary
- T. Yoshimura, K. Tokuda, T. Masuko, T. Kobayashi, and T. Kitamura, "Simultaneous modeling of spectrum, pitch and duration in HMM-based speech synthesis," in Proc. Eurospeech, Budapest, Hungary, 1999.
- (1999) Proc. Eurospeech
- Yoshimura, T.¹ Tokuda, K.² Masuko, T.³ Kobayashi, T.⁴ Kitamura, T.⁵

7
- 84982961818
- Constructing emotional speech synthesizers with limited speech database
- Jeju, Korea
- R. Tsuzuki, H. Zen, K. Tokuda, T. Kitamura, M. Bulut, and S. S. Narayanan, "Constructing emotional speech synthesizers with limited speech database," in Proc. ICSLP, Jeju, Korea, 2004.
- (2004) Proc. ICSLP
- Tsuzuki, R.¹ Zen, H.² Tokuda, K.³ Kitamura, T.⁴ Bulut, M.⁵ Narayanan, S.S.⁶

8
- 34547529978
- Model adaptation approach to speech synthesis with diverse voices and styles
- Honolulu, Hawaii
- J. Yamagishi, T. Kobayashi, M. Tachibana, K. Ogata, and Y. Nakano, "Model adaptation approach to speech synthesis with diverse voices and styles," in Proc. ICASSP, Honolulu, Hawaii, 2007, pp. 1233-1236.
- (2007) Proc. ICASSP , pp. 1233-1236
- Yamagishi, J.¹ Kobayashi, T.² Tachibana, M.³ Ogata, K.⁴ Nakano, Y.⁵

9
- 51449098017
- Speaker and style adaptation using average voice model for style control in HMM-based speech synthesis
- Las Vegas, NV
- M. Tachibana, S. Izawa, T. Nose, and T. Kobayashi, "Speaker and style adaptation using average voice model for style control in HMM-based speech synthesis," in Proc. ICASSP, Las Vegas, NV, pp. 4633-4636.
- Proc. ICASSP , pp. 4633-4636
- Tachibana, M.¹ Izawa, S.² Nose, T.³ Kobayashi, T.⁴

10
- 85009069226
- A style control technique for HMM-based speech synthesis
- Jeju, Korea
- K. Miyanaga, T. Masuko, and T. Kobayashi, "A style control technique for HMM-based speech synthesis," in Proc. ICSLP, Jeju, Korea, 2004.
- (2004) Proc. ICSLP
- Miyanaga, K.¹ Masuko, T.² Kobayashi, T.³

11
- 34547529063
- A style control technique for speech synthesis using multiple regression HSMM
- Pittsburgh, PA, USA
- T. Nose, J. Yamagishi, and T. Kobayashi, "A style control technique for speech synthesis using multiple regression HSMM," in Proc. INTERSPEECH 2006, Pittsburgh, PA, USA.
- Proc. INTERSPEECH 2006
- Nose, T.¹ Yamagishi, J.² Kobayashi, T.³

12
- 67650790758
- The Blizzard Challenge 2008
- Brisbane, Australia
- V. Karaiskos, S. King, R. A. J. Clark, and C. Mayo, "The Blizzard Challenge 2008," in Proc. Blizzard Challenge 2008, Brisbane, Australia.
- Proc. Blizzard Challenge 2008
- Karaiskos, V.¹ King, S.² Clark, R.A.J.³ Mayo, C.⁴

13
- 0023739214
- Voice conversion through vector quantization
- M. Abe, S. Nakamura, K. Shikano, and H. Kuwabara, "Voice conversion through vector quantization," in Proc. IEEE ICASSP, 1988, pp. 565-568.
- (1988) Proc. IEEE ICASSP , pp. 565-568
- Abe, M.¹ Nakamura, S.² Shikano, K.³ Kuwabara, H.⁴

14
- 0033154052
- Speaker transformation algorithm using segmental codebooks
- L. M. Arslan, "Speaker transformation algorithm using segmental codebooks," Speech Commun., vol.28, pp. 211-226, 1999.
- (1999) Speech Commun. , vol.28 , pp. 211-226
- Arslan, L.M.¹

15
- 0032026483
- Continuous probabilistic transform for voice conversion
- Mar.
- Y. Stylianou, O. Cappe, and E. Moulines, "Continuous probabilistic transform for voice conversion," IEEE Trans. Speech Audio Process., vol.6, no.2, pp. 131-142, Mar. 1998.
- (1998) IEEE Trans. Speech Audio Process. , vol.6 , Issue.2 , pp. 131-142
- Stylianou, Y.¹ Cappe, O.² Moulines, E.³

16
- 4444285698
- Ph.D. dissertation, OGI School of Sci. and Eng., Oregon Health and Sci. Univ., Beaverton
- A. B. Kain, "High resolution voice transformation," Ph.D. dissertation, OGI School of Sci. and Eng., Oregon Health and Sci. Univ., Beaverton, 2001.
- (2001) High Resolution Voice Transformation
- Kain, A.B.¹

17
- 77950029784
- Ph.D. dissertation, Boğaziçi Univ., Istanbul, Turkey
- O. Türk, "Cross-lingual voice conversion," Ph.D. dissertation, Boğaziçi Univ., Istanbul, Turkey, 2007.
- (2007) Cross-lingual Voice Conversion
- Türk, O.¹

18
- 70349197691
- Voice conversion using artificial neural networks
- Taipei, Taiwan, Apr.
- S. Desai, E. V. Raghavendra, B. Yegnanarayana, A. W. Black, and K. Prahallad, "Voice conversion using artificial neural networks," in Proc. IEEE ICASSP, Taipei, Taiwan, Apr. 2009.
- (2009) Proc. IEEE ICASSP
- Desai, S.¹ Raghavendra, E.V.² Yegnanarayana, B.³ Black, A.W.⁴ Prahallad, K.⁵

19
- 85135141647
- Hidden Markov model based voice conversion using dynamic characteristics of speaker
- E.-K. Kim, S. Lee, and Y.-H. Oh, "Hidden Markov model based voice conversion using dynamic characteristics of speaker," in Proc. Eurospeech, 1997, pp. 2519-2522.
- (1997) Proc. Eurospeech , pp. 2519-2522
- Kim, E.-K.¹ Lee, S.² Oh, Y.-H.³

20
- 85009250849
- Subband based voice conversion
- Denver, CO, Sep.
- O. Türk and L. M. Arslan, "Subband based voice conversion," in Proc. ICSLP, Denver, CO, Sep. 2002, vol.1, pp. 289-292.
- (2002) Proc. ICSLP, Denver , vol.1 , pp. 289-292
- Türk, O.¹ Arslan, L.M.²

21
- 70349207267
- Application of voice conversion for cross-language rap singing transformation
- Taipei, Taiwan, Apr.
- O. Türk, O. Büyük, A. Haznedaroglu, and L. M. Arslan, "Application of voice conversion for cross-language rap singing transformation," in Proc. IEEE ICASSP, Taipei, Taiwan, Apr. 2009.
- (2009) Proc. IEEE ICASSP
- Türk, O.¹ Büyük, O.² Haznedaroglu, A.³ Arslan, L.M.⁴

22
- 84938935270
- A system for transforming the emotion in speech: Combining data-driven conversion techniques for prosody and voice quality
- Antwerp, Belgium, Aug. 27-31
- Z. Inanoglu and S. J. Young, "A system for transforming the emotion in speech: Combining data-driven conversion techniques for prosody and voice quality," in Proc. Interspeech, Antwerp, Belgium, Aug. 27-31, 2007.
- (2007) Proc. Interspeech
- Inanoglu, Z.¹ Young, S.J.²

23
- 70349200844
- Voice conversion for various types of body transmitted speech
- Taipei, Taiwan, Apr.
- T. Toda, K. Nakamura, H. Sekimoto, and K. Shikano, "Voice conversion for various types of body transmitted speech," in Proc. IEEE ICASSP, Taipei, Taiwan, Apr. 2009.
- (2009) Proc. IEEE ICASSP
- Toda, T.¹ Nakamura, K.² Sekimoto, H.³ Shikano, K.⁴

24
- 84869508926
- A voice conversion method based on joint pitch and spectral envelope transformation
- Jeju, Korea
- T. En-Najjary, O. Rosec, and T. Chonavel, "A voice conversion method based on joint pitch and spectral envelope transformation," in Proc. 8th Int. Conf. Spoken Lang. Process., Jeju, Korea, 2004.
- (2004) Proc. 8th Int. Conf. Spoken Lang. Process.
- En-Najjary, T.¹ Rosec, O.² Chonavel, T.³

25
- 84867219635
- A comparison of voice conversion methods for transforming voice quality in emotional speech synthesis
- Brisbane, Australia
- O. Türk and M. Schröder, "A comparison of voice conversion methods for transforming voice quality in emotional speech synthesis," in Proc. Interspeech, Brisbane, Australia, 2008, pp. 2282-2285.
- (2008) Proc. Interspeech , pp. 2282-2285
- Türk, O.¹ Schröder, M.²

26
- 0032629673
- Assessment and correction of voice quality variabilities in large speech databases for concatenative speech synthesis
- Phoenix, AZ
- Y. Stylianou, "Assessment and correction of voice quality variabilities in large speech databases for concatenative speech synthesis," in Proc. IEEE ICASSP, Phoenix, AZ, 1999.
- (1999) Proc. IEEE ICASSP
- Stylianou, Y.¹

27
- 33646769932
- Polyglot synthesis using a mixture of monolingual corpora
- J. Latorre, K. Iwano, and S. Furui, "Polyglot synthesis using a mixture of monolingual corpora," in Proc. IEEE ICASSP, 2005, vol.1, pp. 1-4.
- (2005) Proc. IEEE ICASSP , vol.1 , pp. 1-4
- Latorre, J.¹ Iwano, K.² Furui, S.³

28
- 33745191648
- Emotional festival-Mbrola TTS synthesis
- Lisbon, Portugal
- F. Tesser, P. Cosi, C. Drioli, and G. Tisato, "Emotional festival-Mbrola TTS synthesis," in Proc. Interspeech, Lisbon, Portugal, 2005.
- (2005) Proc. Interspeech
- Tesser, F.¹ Cosi, P.² Drioli, C.³ Tisato, G.⁴

29
- 0027839344
- Text-to-speech synthesis based on a MBE re-synthesis of segments database
- T. Dutoit and H. Leich, "Text-to-speech synthesis based on a MBE re-synthesis of segments database," Speech Commun., vol.13, pp. 435-440.
- Speech Commun. , vol.13 , pp. 435-440
- Dutoit, T.¹ Leich, H.²

30
- 85009141811
- Improvement in corpus-based generation of f0 contours using generation process model for emotional speech synthesis
- K. Hirose, "Improvement in corpus-based generation of f0 contours using generation process model for emotional speech synthesis," in Proc. Interspeech, 2004, pp. 1349-1352.
- (2004) Proc. Interspeech , pp. 1349-1352
- Hirose, K.¹

31
- 85133428264
- Towards emotional speech synthesis: A rule based approach
- Jun.
- E. Zovato, A. Pacchiotti, S. Quazza, and S. Sandri, "Towards emotional speech synthesis: A rule based approach," in Proc. 5th ISCA Speech Synth. Workshop, Jun. 2004.
- (2004) Proc. 5th ISCA Speech Synth. Workshop
- Zovato, E.¹ Pacchiotti, A.² Quazza, S.³ Sandri, S.⁴

32
- 39649107657
- Content-based transformation of the expressivity in speech
- Saarbrücken, Germany, Aug.
- G. Beller and X. Rodet, "Content-based transformation of the expressivity in speech," in Proc. 16th Int. Congr. Phonetic Sci., Saarbrücken, Germany, Aug. 2007, pp. 2157-2160.
- (2007) Proc. 16th Int. Congr. Phonetic Sci. , pp. 2157-2160
- Beller, G.¹ Rodet, X.²

33
- 33646791479
- Prosody analysis and modeling for emotional speech synthesis
- Mar.
- D. Jiang, W. Zhang, L. Shen, and L. Cai, "Prosody analysis and modeling for emotional speech synthesis," in Proc. IEEE ICASSP, Mar. 2005, vol.1, pp. 281-284.
- (2005) Proc. IEEE ICASSP , vol.1 , pp. 281-284
- Jiang, D.¹ Zhang, W.² Shen, L.³ Cai, L.⁴

34
- 34547537460
- Investigating the role of phoneme-level modifications in emotional speech resynthesis
- Lisbon, Portugal
- M. Bulut, C. Busso, S. Yildirim, A. Kazemzadeh, M. C. Lee, S. Lee, and S. Narayanan, "Investigating the role of phoneme-level modifications in emotional speech resynthesis," in Proc. Interspeech, Lisbon, Portugal, 2005.
- (2005) Proc. Interspeech
- Bulut, M.¹ Busso, C.² Yildirim, S.³ Kazemzadeh, A.⁴ Lee, M.C.⁵ Lee, S.⁶ Narayanan, S.⁷

35
- 34547519038
- A statistical approach for modeling prosody features using POS tags for emotional speech synthesis
- Honolulu, HI, Apr.
- M. Bulut, S. Lee, and S. Narayanan, "A statistical approach for modeling prosody features using POS tags for emotional speech synthesis," in Proc. IEEE ICASSP, Honolulu, HI, Apr. 2007, vol.4, pp. 1237-1240.
- (2007) Proc. IEEE ICASSP , vol.4 , pp. 1237-1240
- Bulut, M.¹ Lee, S.² Narayanan, S.³

36
- 33746653351
- Robust processing techniques for voice conversion
- O. Türk and L. M. Arslan, "Robust processing techniques for voice conversion," Comput. Speech Lang., vol.20, pp. 441-467, 2006.
- (2006) Comput. Speech Lang. , vol.20 , pp. 441-467
- Türk, O.¹ Arslan, L.M.²

37
- 0009151070
- Time-domain and frequency-domain techniques for prosodic modification of speech
- Kleijn and Paliwal, Eds. Amsterdam, The Netherlands: Elsevier
- E. Moulines and W. Verhelst, "Time-domain and frequency-domain techniques for prosodic modification of speech," in Speech Coding and Synthesis, Kleijn and Paliwal, Eds. Amsterdam, The Netherlands: Elsevier, 1995, pp. 519-555.
- (1995) Speech Coding and Synthesis , pp. 519-555
- Moulines, E.¹ Verhelst, W.²

38
- 84876497245
- GMM-based voice conversion applied to emotional speech synthesis
- H. Kawanami, Y. Iwami, T. Toda, H. Saruwatari, and K. Shikano, "GMM-based voice conversion applied to emotional speech synthesis," in Proc. Eurospeech, 2003, pp. 2401-2404.
- (2003) Proc. Eurospeech , pp. 2401-2404
- Kawanami, H.¹ Iwami, Y.² Toda, T.³ Saruwatari, H.⁴ Shikano, K.⁵

39
- 58149203393
- Data-driven emotion conversion in spoken English
- Mar.
- Z. Inanoglu and S. Young, "Data-driven emotion conversion in spoken English," Speech Commun., vol.51, no.3, pp. 268-283, Mar. 2009.
- (2009) Speech Commun. , vol.51 , Issue.3 , pp. 268-283
- Inanoglu, Z.¹ Young, S.²

40
- 52149110268
- Diploma thesis, Univ. des Saarlandes, Saarbrücken, Germany
- A. Hunecke, "Optimal design of a speech database for unit selection synthesis," Diploma thesis, Univ. des Saarlandes, Saarbrücken, Germany.
- Optimal Design of A Speech Database for Unit Selection Synthesis
- Hunecke, A.¹

41
- 70349228443
- The MARY TTS entry in the Blizzard Challenge 2008
- Brisbane, Australia
- M. Schröder, M. Charfuelan, S. Pammi, and O. Türk, "The MARY TTS entry in the Blizzard Challenge 2008," in Proc. Blizzard Challenge 2008, Brisbane, Australia.
- Proc. Blizzard Challenge 2008
- Schröder, M.¹ Charfuelan, M.² Pammi, S.³ Türk, O.⁴

42
- 0031623661
- Spectral voice conversion for text-to-speech synthesis
- A. Kain and M. Macon, "Spectral voice conversion for text-to-speech synthesis," in Proc. IEEE ICASSP, 1998, vol.1, pp. 285-288.
- (1998) Proc. IEEE ICASSP , vol.1 , pp. 285-288
- Kain, A.¹ Macon, M.²

43
- 0026400231
- Robust and efficient quantization of speech LSP parameters using structured vector quantizers
- R. Laroia, N. Phamdo, and N. Farvardin, "Robust and efficient quantization of speech LSP parameters using structured vector quantizers," in Proc. IEEE ICASSP, 1991, pp. 641-644.
- (1991) Proc. IEEE ICASSP , pp. 641-644
- Laroia, R.¹ Phamdo, N.² Farvardin, N.³

44
- 34247610490
- A database of German emotional speech
- Lisbon, Portugal
- F. Burkhardt, A. Paeschke, M. Rolfes, W. F. Sendlmeier, and B. Weiss, "A database of German emotional speech," in Proc. Interspeech, Lisbon, Portugal, 2005.
- (2005) Proc. Interspeech
- Burkhardt, F.¹ Paeschke, A.² Rolfes, M.³ Sendlmeier, W.F.⁴ Weiss, B.⁵

45
- 0001884644
- Individual comparisons by ranking methods
- F. Wilcoxon, "Individual comparisons by ranking methods," Biometrics Bull., vol.1, pp. 80-83, 1945.
- (1945) Biometrics Bull. , vol.1 , pp. 80-83
- Wilcoxon, F.¹

46
- 0003447548
- Ph.D. dissertation, Dept. Signal, Ecole Nationale Superieure des Telecomm. , ENST-Telecom Paris, Paris, France
- Y. Stylianou, "Harmonic plus noise models for speech, combined with statistical methods for speech and speaker modification," Ph.D. dissertation, Dept. Signal, Ecole Nationale Superieure des Telecomm. , ENST-Telecom Paris, Paris, France, 1996.
- (1996) Harmonic Plus Noise Models for Speech, Combined with Statistical Methods for Speech and Speaker Modification
- Stylianou, Y.¹

47
- 0031624617
- TDPSOLA versus harmonic plus noise model in diphone based speech synthesis
- Seattle, WA
- A. Syrdal, Y. Stylianou, L. Garrison, A. Conkie, and J. Schroeter, "TDPSOLA versus harmonic plus noise model in diphone based speech synthesis," in Proc. IEEE ICASSP, Seattle, WA, 1998, pp. 273-276.
- (1998) Proc. IEEE ICASSP , pp. 273-276
- Syrdal, A.¹ Stylianou, Y.² Garrison, L.³ Conkie, A.⁴ Schroeter, J.⁵

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.