SCOPUS Information Retrieval Platform

IEICE Transactions on Information and Systems

Volume E92-D, Issue 3, 2009, Pages 489-497

HMM-based style control for expressive speech synthesis with arbitrary speaker's voice using model adaptation

Authors (3): Nose, Takashi; Tachibana, Makoto; Kobayashi, Takao

Author keywords

Average voice model; Expressive speech; HMM based speech synthesis; Model adaptation; Multiple regression HSMM (MRHSMM); Style control

Indexed keywords

MARKOV PROCESSES; SPEECH PROCESSING; SPEECH RECOGNITION; SPEECH SYNTHESIS;

AVERAGE VOICE MODELS; EXPRESSIVE SPEECH; HMM-BASED SPEECH SYNTHESIS; MODEL ADAPTATION; MULTIPLE REGRESSIONS;

SPEECH;

EID: 67650793657 PISSN: 09168532 EISSN: 17451361 Source Type: Journal
DOI: 10.1587/transinf.E92.D.489 Document Type: Article

Times cited: 41

References (22)

1
- 84971539709
- Emotional speech synthesis: A review
- M. Schröder, "Emotional speech synthesis: A review," Proc. EUROSPEECH 2001, pp. 561-564, Sept. 2001.
- (2001) Proc. EUROSPEECH 2001 , pp. 561-564
- Schröder, M.¹

2
- 23144458652
- Expressive speech: Production, perception and application to speech synthesis
- D. Erickson, "Expressive speech: Production, perception and application to speech synthesis," Acoustical Science and Technology, vol. 26, no. 4, pp. 317-325, July 2005.
- (2005) Acoustical Science and Technology , vol.26 , Issue.4 , pp. 317-325
- Erickson, D.¹

3
- 0037380318
- A corpus-based speech synthesis system with emotion
- A. Iida, N. Campbell, F. Higuchi, and M. Yasumura, "A corpus-based speech synthesis system with emotion," Speech Commun., vol. 40, no. 1-2, pp. 161-187, 2003.
- (2003) Speech Commun. , vol.40 , Issue.1-2 , pp. 161-187
- Iida, A.¹ Campbell, N.² Higuchi, F.³ Yasumura, M.⁴

4
- 24144497811
- Acoustic modeling of speaking styles and emotional expressions in HMM-based speech synthesis
- J. Yamagishi, K. Onishi, T. Masuko, and T. Kobayashi, "Acoustic modeling of speaking styles and emotional expressions in HMM-based speech synthesis," IEICE Trans. Inf. & Syst., vol. E88-D, no. 3, pp. 503-509, March 2005.
- (2005) IEICE Trans. Inf. & Syst. , vol.E88-D , Issue.3 , pp. 503-509
- Yamagishi, J.¹ Onishi, K.² Masuko, T.³ Kobayashi, T.⁴

5
- 0037382510
- Describing the emotional states that are expressed in speech
- R. Cowie and R. R. Cornelius, "Describing the emotional states that are expressed in speech," Speech Commun., vol. 40, no. 1-2, pp. 5-32, April 2003.
- (2003) Speech Commun. , vol.40 , Issue.1-2 , pp. 5-32
- Cowie, R.¹ Cornelius, R.R.²

6
- 85009067746
- Analytical and perceptual study on the role of acoustic features in realizing emotional speech
- K. Hirose, N. Minematsu, and H. Kawanami, "Analytical and perceptual study on the role of acoustic features in realizing emotional speech," Proc. ICSLP 2000, pp. 369-372, Oct. 2000.
- (2000) Proc. ICSLP 2000 , pp. 369-372
- Hirose, K.¹ Minematsu, N.² Kawanami, H.³

7
- 51449114529
- A style control technique for HMM-based expressive speech synthesis
- T. Nose, J. Yamagishi, T. Masuko, and T. Kobayashi, "A style control technique for HMM-based expressive speech synthesis," IEICE Trans. Inf. & Syst., vol. E90-D, no. 9, pp. 1406-1413, Sept. 2007.
- (2007) IEICE Trans. Inf. & Syst. , vol.E90-D , Issue.9 , pp. 1406-1413
- Nose, T.¹ Yamagishi, J.² Masuko, T.³ Kobayashi, T.⁴

8
- 33646791479
- Prosody analysis and modeling for emotional speech synthesis
- D. Jiang, W. Zhang, L. Shen, and L. Cai, "Prosody analysis and modeling for emotional speech synthesis," Proc. ICASSP 2005, pp. 281-284, March 2005.
- (2005) Proc. ICASSP 2005 , pp. 281-284
- Jiang, D.¹ Zhang, W.² Shen, L.³ Cai, L.⁴

9
- 0037380186
- The role of voice quality in communicating emotion, mood and attitude
- C. Gobl and A. N. Chasaide, "The role of voice quality in communicating emotion, mood and attitude," Speech Commun., vol. 40, no. 1-2, pp. 189-212, 2003.
- (2003) Speech Commun. , vol.40 , Issue.1-2 , pp. 189-212
- Gobl, C.¹ Chasaide, A.N.²

10
- 34047247202
- Voice conversion using duration-embedded bi-HMMs for expressive speech synthesis
- C. H. Wu, C. C. Hsia, T. H. Liu, and J. F. Wang, "Voice conversion using duration-embedded bi-HMMs for expressive speech synthesis," IEEE Trans. Audio, Speech, and Language Process., vol. 14, no. 4, pp. 1109-1116, July 2006.
- (2006) IEEE Trans. Audio, Speech, and Language Process. , vol.14 , Issue.4 , pp. 1109-1116
- Wu, C.H.¹ Hsia, C.C.² Liu, T.H.³ Wang, J.F.⁴

11
- 33645768204
- A style adaptation technique for speech synthesis using HSMM and suprasegmental features
- M. Tachibana, J. Yamagishi, T. Masuko, and T. Kobayashi, "A style adaptation technique for speech synthesis using HSMM and suprasegmental features," IEICE Trans. Inf. & Syst., vol. E89-D, no. 3, pp. 1092-1099, March 2006.
- (2006) IEICE Trans. Inf. & Syst. , vol.E89-D , Issue.3 , pp. 1092-1099
- Tachibana, M.¹ Yamagishi, J.² Masuko, T.³ Kobayashi, T.⁴

12
- 0029288633
- Maximum likelihood linear regression for speaker adaptation of continuous density hidden Markov models
- C. J. Leggetter and P. C. Woodland, "Maximum likelihood linear regression for speaker adaptation of continuous density hidden Markov models," Comput. Speech Lang., vol. 9, no. 2, pp. 171-185, 1995.
- (1995) Comput. Speech Lang. , vol.9 , Issue.2 , pp. 171-185
- Leggetter, C.J.¹ Woodland, P.C.²

13
- 33847129573
- Average-voice-based speech synthesis using HSMM-based speaker adaptation and adaptive training
- J. Yamagishi and T. Kobayashi, "Average-voice-based speech synthesis using HSMM-based speaker adaptation and adaptive training," IEICE Trans. Inf. & Syst., vol. E90-D, no. 2, pp. 533-543, Feb. 2007.
- (2007) IEICE Trans. Inf. & Syst. , vol.E90-D , Issue.2 , pp. 533-543
- Yamagishi, J.¹ Kobayashi, T.²

14
- 44449177634
- A hidden semi-Markov model-based speech synthesis system
- H. Zen, K. Tokuda, T. Masuko, T. Kobayashi, and T. Kitamura, "A hidden semi-Markov model-based speech synthesis system," IEICE Trans. Inf. & Syst., vol. E90-D, no. 5, pp. 825-834, May 2007.
- (2007) IEICE Trans. Inf. & Syst. , vol.E90-D , Issue.5 , pp. 825-834
- Zen, H.¹ Tokuda, K.² Masuko, T.³ Kobayashi, T.⁴ Kitamura, T.⁵

15
- 29144493408
- Human walking motion synthesis with desired pace and stride length based on HSMM
- N. Niwase, J. Yamagishi, and T. Kobayashi, "Human walking motion synthesis with desired pace and stride length based on HSMM," IEICE Trans. Inf. & Syst., vol. E88-D, no. 11, pp. 2492-2499, Nov. 2005.
- (2005) IEICE Trans. Inf. & Syst. , vol.E88-D , Issue.11 , pp. 2492-2499
- Niwase, N.¹ Yamagishi, J.² Kobayashi, T.³

16
- 0142007308
- A training method of average voice model for HMM-based speech synthesis
- J. Yamagishi, M. Tamura, T. Masuko, K. Tokuda, and T. Kobayashi, "A training method of average voice model for HMM-based speech synthesis," IEICE Trans. Fundamentals, vol. E86-A, no. 8, pp. 1956-1963, Aug. 2003.
- (2003) IEICE Trans. Fundamentals , vol.E86-A , Issue.8 , pp. 1956-1963
- Yamagishi, J.¹ Tamura, M.² Masuko, T.³ Tokuda, K.⁴ Kobayashi, T.⁵

17
- 85009080581
- MLLR adaptation for hidden semi-Markov model based speech synthesis
- J. Yamagishi, T. Masuko, and T. Kobayashi, "MLLR adaptation for hidden semi-Markov model based speech synthesis," Proc. INTERSPEECH 2004-ICSLP, pp. 1213-1216, Oct. 2004.
- (2004) Proc. INTERSPEECH 2004-ICSLP , pp. 1213-1216
- Yamagishi, J.¹ Masuko, T.² Kobayashi, T.³

18
- 34547496746
- Constrained structural maximum a posteriori linear regression for average-voice-based speech synthesis
- Y. Nakano, M. Tachibana, J. Yamagishi, and T. Kobayashi, "Constrained structural maximum a posteriori linear regression for average-voice-based speech synthesis," Proc. INTERSPEECH 2006-ICSLP, pp. 2286-2289, Sept. 2006.
- (2006) Proc. INTERSPEECH 2006-ICSLP , pp. 2286-2289
- Nakano, Y.¹ Tachibana, M.² Yamagishi, J.³ Kobayashi, T.⁴

19
- 0035279111
- A structural Bayes approach to speaker adaptation
- K. Shinoda and C. Lee, "A structural Bayes approach to speaker adaptation," IEEE Trans. Speech Audio Process., vol. 9, no. 3, pp. 276-287, 2001.
- (2001) IEEE Trans. Speech Audio Process. , vol.9 , Issue.3 , pp. 276-287
- Shinoda, K.¹ Lee, C.²

20
- 0036461005
- Structural maximum a posteriori linear regression for fast HMM adaptation
- O. Shiohan, Y. Myrvoll, and C. Lee, "Structural maximum a posteriori linear regression for fast HMM adaptation," Comput. Speech Lang., vol. 16, no. 3, pp. 5-24, 2002.
- (2002) Comput. Speech Lang. , vol.16 , Issue.3 , pp. 5-24
- Shiohan, O.¹ Myrvoll, Y.² Lee, C.³

21
- 0030189744
- Speaker adaptation using combined transformation and Bayesian methods
- V. Digalakis and L. Neumeyer, "Speaker adaptation using combined transformation and Bayesian methods," IEEE Trans. Speech Audio Process., vol. 4, no. 4, pp. 294-300, July 1996.
- (1996) IEEE Trans. Speech Audio Process. , vol.4 , Issue.4 , pp. 294-300
- Digalakis, V.¹ Neumeyer, L.²

22
- 34547525896
- Acoustic model training based on linear transformation and MAP modification for HSMM-based speech synthesis
- K. Ogata, M. Tachibana, J. Yamagishi, and T. Kobayashi, "Acoustic model training based on linear transformation and MAP modification for HSMM-based speech synthesis," Proc. INTERSPEECH 2006-ICSLP, pp. 1328-1331, Sept. 2006.
- (2006) Proc. INTERSPEECH 2006-ICSLP , pp. 1328-1331
- Ogata, K.¹ Tachibana, M.² Yamagishi, J.³ Kobayashi, T.⁴

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.