SCOPUS Information Retrieval Platform

6th ISCA Workshop on Speech Synthesis, SSW 2007

Volume , Issue , 2007, Pages 294-299

The HMM-based Speech Synthesis System (HTS) Version 2.0

(7) Zen, Heiga a Nose, Takashi b Yamagishi, Junichi b,c Sako, Shinji a,d Masuko, Takashi b Black, Alan W e Tokuda, Keiichi a

a NAGOYA INSTITUTE OF TECHNOLOGY (Japan)

b TOKYO INSTITUTE OF TECHNOLOGY (Japan)

c UNIVERSITY OF EDINBURGH (United Kingdom)

d UNIVERSITY OF TOKYO (Japan)

e CARNEGIE MELLON UNIVERSITY (United States)

Author keywords

[No Author keywords available]

Indexed keywords

OPEN SOURCE SOFTWARE; OPEN SYSTEMS; SPEECH SYNTHESIS;

CONTEXT DEPENDENT; HIDDEN MARKOV MODEL-BASED SPEECH SYNTHESIS; HIDDEN-MARKOV MODELS; MODELING SPECTRA; OPEN-SOURCE SOFTWARES; RESEARCH PLATFORMS; SOFTWARE TOOLKITS; SPEECH SYNTHESIS SYSTEM; SPEECH WAVEFORMS; STATISTICAL PARAMETRIC SPEECH SYNTHESIS;

HIDDEN MARKOV MODELS;

EID: 85133720638 PISSN: None EISSN: None Source Type: Conference Proceeding
DOI: None Document Type: Conference Paper

Times cited : (372)

References (83)

1
- 0342918775
- CHATR: a generic speech synthesis system
- A.W. Black and P. Taylor, “CHATR: a generic speech synthesis system,” in Proc. COLING94, 1994.
- (1994) Proc. COLING94
- Black, A.W.¹ Taylor, P.²

2
- 0029765811
- Unit selection in a concatenative speech synthesis system using a large speech database
- A. Hunt and A.W. Black, “Unit selection in a concatenative speech synthesis system using a large speech database,” in Proc. ICASSP, 1996, pp. 373–376.
- (1996) Proc. ICASSP , pp. 373-376
- Hunt, A.¹ Black, A.W.²

3
- 0028996983
- Automatic speech synthesizer parameter estimation using HMMs
- R.E. Donovan and P.C. Woodland, “Automatic speech synthesizer parameter estimation using HMMs,” in Proc. ICASSP, 1995, pp. 640–643.
- (1995) Proc. ICASSP , pp. 640-643
- Donovan, R.E.¹ Woodland, P.C.²

4
- 78649277093
- A corpus-based approach to expressive speech synthesis
- E. Eide, A. Aaron, R. Bakis, W. Hamza, M. Picheny, and J. Pitrelli, “A corpus-based approach to expressive speech synthesis,” in Proc. ISCA SSW5, 2004.
- (2004) Proc. ISCA SSW5
- Eide, E.¹ Aaron, A.² Bakis, R.³ Hamza, W.⁴ Picheny, M.⁵ Pitrelli, J.⁶

5
- 85006631929
- Unit selection and emotional speech
- A.W. Black, “Unit selection and emotional speech,” in Proc. Eurospeech, 2003, pp. 1649–1652.
- (2003) Proc. Eurospeech , pp. 1649-1652
- Black, A.W.¹

6
- 85009139544
- Simultaneous modeling of spectrum, pitch and duration in HMM-based speech synthesis
- T. Yoshimura, K. Tokuda, T. Masuko, T. Kobayashi, and T. Kitamura, “Simultaneous modeling of spectrum, pitch and duration in HMM-based speech synthesis,” in Proc. Eurospeech, 1999, pp. 2347–2350.
- (1999) Proc. Eurospeech , pp. 2347-2350
- Yoshimura, T.¹ Tokuda, K.² Masuko, T.³ Kobayashi, T.⁴ Kitamura, T.⁵

7
- 33846405723
- Details of Nitech HMM-based speech synthesis system for the Blizzard Challenge 2005
- Jan
- H. Zen, T. Toda, M. Nakamura, and K. Tokuda, “Details of Nitech HMM-based speech synthesis system for the Blizzard Challenge 2005,” IEICE Trans. Inf. & Syst., vol. E90-D, no. 1, pp. 325–333, Jan. 2007.
- (2007) IEICE Trans. Inf. & Syst , vol.E90-D , Issue.1 , pp. 325-333
- Zen, H.¹ Toda, T.² Nakamura, M.³ Tokuda, K.⁴

8
- 67650851754
- USTC system for Blizzard Challenge 2006 an improved HMM-based speech synthesis method
- Z.-H. Ling, Y.-J. Wu, Y.-P. Wang, L. Qin, and R.-H. Wang, “USTC system for Blizzard Challenge 2006 an improved HMM-based speech synthesis method,” in Blizzard Challenge Workshop, 2006.
- (2006) Blizzard Challenge Workshop
- Ling, Z.-H.¹ Wu, Y.-J.² Wang, Y.-P.³ Qin, L.⁴ Wang, R.-H.⁵

9
- 34547526960
- Statistical parametric speech synthesis
- A.W. Black, H. Zen, and K. Tokuda, “Statistical parametric speech synthesis,” in Proc. ICASSP, 2007, pp. 1229–1232.
- (2007) Proc. ICASSP , pp. 1229-1232
- Black, A.W.¹ Zen, H.² Tokuda, K.³

10
- 34547514452
- A novel HMM-based TTS system using both continuous HMMs and discrete HMMs
- J. Yu, M. Zhang, J. Tao, and X. Wang, “A novel HMM-based TTS system using both continuous HMMs and discrete HMMs,” in Proc. ICASSP, 2007, pp. 709–712.
- (2007) Proc. ICASSP , pp. 709-712
- Yu, J.¹ Zhang, M.² Tao, J.³ Wang, X.⁴

11
- 85016140477
- An adaptive algorithm for mel-cepstral analysis of speech
- T. Fukada, K. Tokuda, T. Kobayashi, and S. Imai, “An adaptive algorithm for mel-cepstral analysis of speech,” in Proc. ICASSP, 1992, pp. 137–140.
- (1992) Proc. ICASSP , pp. 137-140
- Fukada, T.¹ Tokuda, K.² Kobayashi, T.³ Imai, S.⁴

12
- 0032678076
- Hidden Markov models based on multi-space probability distribution for pitch pattern modeling
- K. Tokuda, T. Masuko, N. Miyazaki, and T. Kobayashi, “Hidden Markov models based on multi-space probability distribution for pitch pattern modeling,” in Proc. ICASSP, 1999, pp. 229–232.
- (1999) Proc. ICASSP , pp. 229-232
- Tokuda, K.¹ Masuko, T.² Miyazaki, N.³ Kobayashi, T.⁴

13
- 85093445139
- Duration modeling for HMM-based speech synthesis
- T. Yoshimura, K. Tokuda, T. Masuko, T. Kobayashi, and T. Kitamura, “Duration modeling for HMM-based speech synthesis,” in Proc. ICSLP, 1998, pp. 29–32.
- (1998) Proc. ICSLP , pp. 29-32
- Yoshimura, T.¹ Tokuda, K.² Masuko, T.³ Kobayashi, T.⁴ Kitamura, T.⁵

14
- 33947110905
- State duration modeling for HMM-based speech synthesis
- H. Zen, T. Masuko, T. Yoshimura, K. Tokuda, T. Kobayashi, and T. Kitamura, “State duration modeling for HMM-based speech synthesis,” IEICE Trans. on Inf. & Syst., vol. E90-D, no. 3, pp. 692–693, 2007.
- (2007) IEICE Trans. on Inf. & Syst , vol.E90-D , Issue.3 , pp. 692-693
- Zen, H.¹ Masuko, T.² Yoshimura, T.³ Tokuda, K.⁴ Kobayashi, T.⁵ Kitamura, T.⁶

15
- 0033708106
- Speech parameter generation algorithms for HMM-based speech synthesis
- K. Tokuda, T. Yoshimura, T. Masuko, T. Kobayashi, and T. Kitamura, “Speech parameter generation algorithms for HMM-based speech synthesis,” in Proc. ICASSP, 2000, pp. 1315–1318.
- (2000) Proc. ICASSP , pp. 1315-1318
- Tokuda, K.¹ Yoshimura, T.² Masuko, T.³ Kobayashi, T.⁴ Kitamura, T.⁵

16
- 0020596154
- Cepstral analysis synthesis on the mel frequency scale
- S. Imai, “Cepstral analysis synthesis on the mel frequency scale,” in Proc. ICASSP, 1983, pp. 93–96.
- (1983) Proc. ICASSP , pp. 93-96
- Imai, S.¹

17
- 0030696416
- Voice characteristics conversion for HMM-based speech synthesis system
- T. Masuko, K. Tokuda, T. Kobayashi, and S. Imai, “Voice characteristics conversion for HMM-based speech synthesis system,” in Proc. ICASSP, 1997, pp. 1611–1614.
- (1997) Proc. ICASSP , pp. 1611-1614
- Masuko, T.¹ Tokuda, K.² Kobayashi, T.³ Imai, S.⁴

18
- 0034842740
- Adaptation of pitch and spectrum for HMM-based speech synthesis using MLLR
- M. Tamura, T. Masuko, K. Tokuda, and T. Kobayashi, “Adaptation of pitch and spectrum for HMM-based speech synthesis using MLLR,” in Proc. ICASSP, 2001, pp. 805–808.
- (2001) Proc. ICASSP , pp. 805-808
- Tamura, M.¹ Masuko, T.² Tokuda, K.³ Kobayashi, T.⁴

19
- 85135145847
- Speaker interpolation in HMM-based speech synthesis system
- T. Yoshimura, K. Tokuda, T. Masuko, T. Kobayashi, and T. Kitamura, “Speaker interpolation in HMM-based speech synthesis system,” in Proc. Eurospeech, 1997, pp. 2523–2526.
- (1997) Proc. Eurospeech , pp. 2523-2526
- Yoshimura, T.¹ Tokuda, K.² Masuko, T.³ Kobayashi, T.⁴ Kitamura, T.⁵

20
- 29144475179
- Speech synthesis with various emotional expressions and speaking styles by style interpolation and morphing
- M. Tachibana, J. Yamagishi, T. Masuko, and T. Kobayashi, “Speech synthesis with various emotional expressions and speaking styles by style interpolation and morphing,” IEICE Trans. Inf. & Syst., vol. E88-D, no. 11, pp. 2484–2491, 2005.
- (2005) IEICE Trans. Inf. & Syst , vol.E88-D , Issue.11 , pp. 2484-2491
- Tachibana, M.¹ Yamagishi, J.² Masuko, T.³ Kobayashi, T.⁴

21
- 85009257840
- Eigenvoices for HMM-based speech synthesis
- K. Shichiri, A. Sawabe, K. Tokuda, T. Masuko, T. Kobayashi, and T. Kitamura, “Eigenvoices for HMM-based speech synthesis,” in Proc. ICSLP, 2002, pp. 1269–1272.
- (2002) Proc. ICSLP , pp. 1269-1272
- Shichiri, K.¹ Sawabe, A.² Tokuda, K.³ Masuko, T.⁴ Kobayashi, T.⁵ Kitamura, T.⁶

22
- 34547529063
- A style control technique for speech synthesis using multiple regression HSMM
- T. Nose, J. Yamagishi, and T. Kobayashi, “A style control technique for speech synthesis using multiple regression HSMM,” in Proc. Interspeech, 2006, pp. 1324–1327.
- (2006) Proc. Interspeech , pp. 1324-1327
- Nose, T.¹ Yamagishi, J.² Kobayashi, T.³

23
- 79952258981
- K. Tokuda, H. Zen, J. Yamagishi, T. Masuko, S. Sako, A.W. Black, and T. Nose, “The HMM-based speech synthesis system (HTS),” http://hts.sp.nitech.ac.jp/.
- The HMM-based speech synthesis system (HTS)
- Tokuda, K.¹ Zen, H.² Yamagishi, J.³ Masuko, T.⁴ Sako, S.⁵ Black, A.W.⁶ Nose, T.⁷

24
- 0003483593
- S. Young, G. Evermann, M. Gales, T. Hain, D. Kershaw, X.-Y. Liu, G. Moore, J. Odell, D. Ollason, D. Povey, V. Valtchev, and P. Woodland, The Hidden Markov Model Toolkit (HTK) version 3.4, 2006, http://htk.eng.cam.ac.uk/.
- (2006) The Hidden Markov Model Toolkit (HTK) version 3.4
- Young, S.¹ Evermann, G.² Gales, M.³ Hain, T.⁴ Kershaw, D.⁵ Liu, X.-Y.⁶ Moore, G.⁷ Odell, J.⁸ Ollason, D.⁹ Povey, D.¹⁰ Valtchev, V.¹¹ Woodland, P.¹²

25
- 85135145174
- Acoustic modeling based on the MDL criterion for speech recognition
- K. Shinoda and T. Watanabe, “Acoustic modeling based on the MDL criterion for speech recognition,” in Proc. Eurospeech, 1997, pp. 99–102.
- (1997) Proc. Eurospeech , pp. 99-102
- Shinoda, K.¹ Watanabe, T.²

26
- 0003571407
- A.W. Black, P. Taylor, and R. Caley, “The festival speech synthesis system,” http://www.festvox.org/festival/.
- The festival speech synthesis system
- Black, A.W.¹ Taylor, P.² Caley, R.³

27
- 33646773080
- Tech. Rep. CMU-LTI-03-177, Carnegie Mellon University
- J. Kominek and A.W. Black, “CMU ARCTIC databases for speech synthesis,” Tech. Rep. CMU-LTI-03-177, Carnegie Mellon University, 2003.
- (2003) CMU ARCTIC databases for speech synthesis
- Kominek, J.¹ Black, A.W.²

28
- 78049361102
- Incorporation of mixed excitation model and postfilter into HMM-based text-to-speech synthesis
- Aug
- T. Yoshimura, K. Tokuda, T. Masuko, T. Kobayashi, and T. Kitamura, “Incorporation of mixed excitation model and postfilter into HMM-based text-to-speech synthesis,” IEICE Trans. Inf. & Syst. (Japanese Edition), vol. J87-D-II, no. 8, pp. 1563–1571, Aug. 2004.
- (2004) IEICE Trans. Inf. & Syst. (Japanese Edition) , vol.J87-D-II , Issue.8 , pp. 1563-1571
- Yoshimura, T.¹ Tokuda, K.² Masuko, T.³ Kobayashi, T.⁴ Kitamura, T.⁵

29
- 77950581127
- Galatea Project, “Galatea – An open-source toolkit for anthropomorphic spoken dialogue agent,” http://hil.t.u-tokyo.ac.jp/galatea/.
- Galatea – An open-source toolkit for anthropomorphic spoken dialogue agent

30
- 60649102582
- XIMERA: A concatenative speech synthesis system with large scale corpora
- Dec
- H. Kawai, T. Toda, J. Yamagishi, T. Hirai, J. Ni, T. Nishizawa, M. Tsuzaki, and K. Tokuda, “XIMERA: A concatenative speech synthesis system with large scale corpora,” IEICE Trans. Inf. & Syst. (Japanese Edition), vol. J89-D, no. 12, pp. 2688–2698, Dec. 2006.
- (2006) IEICE Trans. Inf. & Syst. (Japanese Edition) , vol.J89-D , Issue.12 , pp. 2688-2698
- Kawai, H.¹ Toda, T.² Yamagishi, J.³ Hirai, T.⁴ Ni, J.⁵ Nishizawa, T.⁶ Tsuzaki, M.⁷ Tokuda, K.⁸

31
- 53049084992
- An HMM-based speech synthesis system applied to German and its adaptation to a limited set of expressive football announcements
- S. Krstulovic, A. Hunecke, and M. Schroeder, “An HMM-based speech synthesis system applied to German and its adaptation to a limited set of expressive football announcements,” in Proc. of Interspeech, 2007.
- (2007) Proc. of Interspeech
- Krstulovic, S.¹ Hunecke, A.² Schroeder, M.³

32
- 0142247093
- The German text-to-speech synthesis system MARY: A tool for research, development and teaching
- M. Schröder and J. Trouvain, “The German text-to-speech synthesis system MARY: A tool for research, development and teaching,” International Journal of Speech Technology, vol. 6, pp. 365–377, 2003.
- (2003) International Journal of Speech Technology , vol.6 , pp. 365-377
- Schröder, M.¹ Trouvain, J.²

33
- 48549095974
- HMM-based trainable speech synthesis for Chinese
- Y.-J. Wu and R.H. Wang, “HMM-based trainable speech synthesis for Chinese,” Journal of Chinese Information Processing, vol. 20, no. 4, pp. 75–81, 2006.
- (2006) Journal of Chinese Information Processing , vol.20 , Issue.4 , pp. 75-81
- Wu, Y.-J.¹ Wang, R.H.²

34
- 38149113842
- An HMM-based Mandarin Chinese text-to-speech system
- Y. Qian, F. Soong, Y. Chen, and M. Chu, “An HMM-based Mandarin Chinese text-to-speech system,” in Proc. of ISCSLP, 2006.
- (2006) Proc. of ISCSLP
- Qian, Y.¹ Soong, F.² Chen, Y.³ Chu, M.⁴

35
- 33645755910
- Implementation and evaluation of an HMM-based Korean speech synthesis system
- S.-J. Kim, J.-J. Kim, and M.-S. Hahn, “Implementation and evaluation of an HMM-based Korean speech synthesis system,” IEICE Trans. Inf. & Syst., vol. E89-D, pp. 1116–1119, 2006.
- (2006) IEICE Trans. Inf. & Syst , vol.E89-D , pp. 1116-1119
- Kim, S.-J.¹ Kim, J.-J.² Hahn, M.-S.³

36
- 56149086860
- Low resource HMM-based speech synthesis applied to German
- C. Weiss, R. Maia, K. Tokuda, and W. Hess, “Low resource HMM-based speech synthesis applied to German,” in ESSP, 2005.
- (2005) ESSP
- Weiss, C.¹ Maia, R.² Tokuda, K.³ Hess, W.⁴

37
- 33745185536
- HMM-based European Portuguese speech synthesis
- M. Barros, R. Maia, K. Tokuda, D. Freitas, and F. Resende Jr., “HMM-based European Portuguese speech synthesis,” in Interspeech, 2005, pp. 2581–2584.
- (2005) Interspeech , pp. 2581-2584
- Barros, M.¹ Maia, R.² Tokuda, K.³ Freitas, D.⁴ Resende, F.⁵

38
- 34547508131
- Master thesis, Royal Institute of Technology (KTH)
- A. Lundgren, An HMM-based text-to-speech system applied to Swedish, Master thesis, Royal Institute of Technology (KTH), 2005.
- (2005) An HMM-based text-to-speech system applied to Swedish
- Lundgren, A.¹

39
- 34547545735
- Master thesis, Helsinki University of Technology
- T. Ojala, Auditory quality evaluation of present Finnish text-to-speech systems, Master thesis, Helsinki University of Technology, 2006.
- (2006) Auditory quality evaluation of present Finnish text-to-speech systems
- Ojala, T.¹

40
- 34547536808
- Developing a Finnish concept-to-speech system
- M. Vainio, A. Suni, and P. Sirjola, “Developing a Finnish concept-to-speech system,” in 2nd Baltic conference on HLT, 2005, pp. 201–206.
- (2005) 2nd Baltic conference on HLT , pp. 201-206
- Vainio, M.¹ Suni, A.² Sirjola, P.³

41
- 22944466413
- Evaluation of the Slovenian HMM-based speech synthesis system
- B. Vesnicer and F. Mihelic, “Evaluation of the Slovenian HMM-based speech synthesis system,” in TSD, 2004, pp. 513–520.
- (2004) TSD , pp. 513-520
- Vesnicer, B.¹ Mihelic, F.²

42
- 34547505602
- Croatian HMM-based speech synthesis
- S. Martincic-Ipsic and I. Ipsic, “Croatian HMM-based speech synthesis,” Journal of Computing and Information Technology, vol. 14, no. 4, pp. 307–313, 2006.
- (2006) Journal of Computing and Information Technology , vol.14 , Issue.4 , pp. 307-313
- Martincic-Ipsic, S.¹ Ipsic, I.²

43
- 34547542349
- Improving Arabic HMM based speech synthesis quality
- O. Abdel-Hamid, S. Abdou, and M. Rashwan, “Improving Arabic HMM based speech synthesis quality,” in Interspeech, 2006, pp. 1332–1335.
- (2006) Interspeech , pp. 1332-1335
- Abdel-Hamid, O.¹ Abdou, S.² Rashwan, M.³

44
- 33646769932
- Polyglot synthesis using a mixture of monolingual corpora
- J. Latorre, K. Iwano, and S. Furui, “Polyglot synthesis using a mixture of monolingual corpora,” in ICASSP, 2005, vol. 1, pp. 1–4.
- (2005) ICASSP , vol.1 , pp. 1-4
- Latorre, J.¹ Iwano, K.² Furui, S.³

45
- 85133704121
- HMM-based Spanish speech synthesis using CBR as F0 estimator
- X. Gonzalvo, I. Iriondo, J. Socoró, F. Alías, and C. Monzo, “HMM-based Spanish speech synthesis using CBR as F0 estimator,” in ITRW on NOLISP, 2007.
- (2007) ITRW on NOLISP
- Gonzalvo, X.¹ Iriondo, I.² Socoró, J.³ Alías, F.⁴ Monzo, C.⁵

46
- 51449114333
- Implementation and evaluation of an HMM-based Thai speech synthesis system
- S. Chomphan and T. Kobayashi, “Implementation and evaluation of an HMM-based Thai speech synthesis system,” in Proc. of Interspeech, 2007.
- (2007) Proc. of Interspeech
- Chomphan, S.¹ Kobayashi, T.²

47
- 44949126431
- A constrained Baum-Welch algorithm for improved phoneme segmentation and efficient training
- D. Huggins-Daines and A. Rudnicky, “A constrained Baum-Welch algorithm for improved phoneme segmentation and efficient training,” in Proc. of Interspeech, 2006, pp. 1205–1208.
- (2006) Proc. of Interspeech , pp. 1205-1208
- Huggins-Daines, D.¹ Rudnicky, A.²

48
- 77949388548
- Reformulating the HMM as a trajectory model
- K. Tokuda, H. Zen, and T. Kitamura, “Reformulating the HMM as a trajectory model,” in Proc. Beyond HMM – Workshop on statistical modeling approach for speech recognition, 2004.
- (2004) Proc. Beyond HMM – Workshop on statistical modeling approach for speech recognition
- Tokuda, K.¹ Zen, H.² Kitamura, T.³

49
- 0032050110
- Maximum likelihood linear transformations for HMM-based speech recognition
- M.J.F. Gales, “Maximum likelihood linear transformations for HMM-based speech recognition,” Computer Speech & Language, vol. 12, no. 2, pp. 75–98, 1998.
- (1998) Computer Speech & Language , vol.12 , Issue.2 , pp. 75-98
- Gales, M.J.F.¹

50
- 0032638856
- Semi-tied covariance matrices for hidden Markov models
- M.J.F. Gales, “Semi-tied covariance matrices for hidden Markov models,” IEEE Transactions on Speech and Audio Processing, vol. 7, no. 3, pp. 272–281, 1999.
- (1999) IEEE Transactions on Speech and Audio Processing , vol.7 , Issue.3 , pp. 272-281
- Gales, M.J.F.¹

51
- 0036475982
- Maximum likelihood multiple projection schemes for hidden Markov models
- M.J.F. Gales, “Maximum likelihood multiple projection schemes for hidden Markov models,” IEEE Trans. Speech & Audio Process., vol. 10, no. 2, pp. 37–47, 2002.
- (2002) IEEE Trans. Speech & Audio Process , vol.10 , Issue.2 , pp. 37-47
- Gales, M.J.F.¹

52
- 4544291748
- Speaking style adaptation using context clustering decision tree for HMM-based speech synthesis
- J. Yamagishi, M. Tachibana, T. Masuko, and T. Kobayashi, “Speaking style adaptation using context clustering decision tree for HMM-based speech synthesis,” in Proc. ICASSP, 2004, pp. 5–8.
- (2004) Proc. ICASSP , pp. 5-8
- Yamagishi, J.¹ Tachibana, M.² Masuko, T.³ Kobayashi, T.⁴

53
- 33645768204
- A style adaptation technique for speech synthesis using HSMM and suprasegmental features
- M. Tachibana, J. Yamagishi, T. Masuko, and T. Kobayashi, “A style adaptation technique for speech synthesis using HSMM and suprasegmental features,” IEICE Trans. Inf. & Syst., vol. E89-D, no. 3, pp. 1092–1099, 2006.
- (2006) IEICE Trans. Inf. & Syst , vol.E89-D , Issue.3 , pp. 1092-1099
- Tachibana, M.¹ Yamagishi, J.² Masuko, T.³ Kobayashi, T.⁴

54
- 0028419019
- Maximum a posteriori estimation for multivariate Gaussian mixture observations of Markov chains
- J.L. Gauvain and C.-H. Lee, “Maximum a posteriori estimation for multivariate Gaussian mixture observations of Markov chains,” IEEE Trans. on Speech & Audio Process., vol. 2, no. 2, pp. 291–298, 1994.
- (1994) IEEE Trans. on Speech & Audio Process , vol.2 , Issue.2 , pp. 291-298
- Gauvain, J.L.¹ Lee, C.-H.²

55
- 33847129573
- Average-voice-based speech synthesis using HSMM-based speaker adaptation and adaptive training
- J. Yamagishi and T. Kobayashi, “Average-voice-based speech synthesis using HSMM-based speaker adaptation and adaptive training,” IEICE Trans. Inf. & Syst., vol. E90-D, no. 2, pp. 533–543, 2007.
- (2007) IEICE Trans. Inf. & Syst , vol.E90-D , Issue.2 , pp. 533-543
- Yamagishi, J.¹ Kobayashi, T.²

56
- 33947669452
- HSMM-based model adaptation algorithms for average-voice-based speech synthesis
- J. Yamagishi, K. Ogata, Y. Nakano, J. Isogai, and T. Kobayashi, “HSMM-based model adaptation algorithms for average-voice-based speech synthesis,” in Proc. ICASSP, 2006, pp. 77–80.
- (2006) Proc. ICASSP , pp. 77-80
- Yamagishi, J.¹ Ogata, K.² Nakano, Y.³ Isogai, J.⁴ Kobayashi, T.⁵

57
- 33846463597
- Ph.D. thesis, Tokyo Institute of Technology
- J. Yamagishi, Average-Voice-Based Speech Synthesis, Ph.D. thesis, Tokyo Institute of Technology, 2006.
- (2006) Average-Voice-Based Speech Synthesis
- Yamagishi, J.¹

58
- 38549096029
- A speech parameter generation algorithm considering global variance for HMM-based speech synthesis
- T. Toda and K. Tokuda, “A speech parameter generation algorithm considering global variance for HMM-based speech synthesis,” IEICE Trans. Inf. & Syst., vol. E90-D, no. 5, pp. 816–824, 2007.
- (2007) IEICE Trans. Inf. & Syst , vol.E90-D , Issue.5 , pp. 816-824
- Toda, T.¹ Tokuda, K.²

59
- 0003940203
- Tech. Rep. CUED/F-INFENG/TR263, Cambridge University
- M.J.F. Gales, “The generation and use of regression class trees for MLLR adaptation,” Tech. Rep. CUED/F-INFENG/TR263, Cambridge University, 1996.
- (1996) The generation and use of regression class trees for MLLR adaptation
- Gales, M.J.F.¹

60
- 84966341178
- The impact of speech recognition on speech synthesis
- CD-ROM proceeding
- M. Ostendorf and I. Bulyko, “The impact of speech recognition on speech synthesis,” in Proc. the IEEE Workshop on Speech Synthesis, 2002, CD-ROM proceeding.
- (2002) Proc. the IEEE Workshop on Speech Synthesis
- Ostendorf, M.¹ Bulyko, I.²

61
- 77950550320
- Motion generation for Japanese finger language based on hidden Markov models
- (in Japanese)
- K. Mori, Y. Nankaku, C. Miyajima, K. Tokuda, and T. Kitamura, “Motion generation for Japanese finger language based on hidden Markov models,” in Proc. FIT, 2005, vol. 3, pp. 569–570, (in Japanese).
- (2005) Proc. FIT , vol.3 , pp. 569-570
- Mori, K.¹ Nankaku, Y.² Miyajima, C.³ Tokuda, K.⁴ Kitamura, T.⁵

62
- 29144493408
- Human walking motion synthesis with desired pace and stride length based on HSMM
- N. Niwase, J. Yamagishi, and T. Kobayashi, “Human walking motion synthesis with desired pace and stride length based on HSMM,” IEICE Trans. Inf. & Syst., vol. E88-D, no. 11, pp. 2492–2499, 2005.
- (2005) IEICE Trans. Inf. & Syst , vol.E88-D , Issue.11 , pp. 2492-2499
- Niwase, N.¹ Yamagishi, J.² Kobayashi, T.³

63
- 77950584066
- Speech driven head motion synthesis based on a trajectory model
- (submitted)
- G. Hofer, H. Shimodaira, and J. Yamagishi, “Speech driven head motion synthesis based on a trajectory model,” in Proc. SIGGRAPH, 2007, (submitted).
- (2007) Proc. SIGGRAPH
- Hofer, G.¹ Shimodaira, H.² Yamagishi, J.³

64
- 85133661753
- TDA: a new trainable trajectory formation system for facial animation
- O. Govokhina, G. Bailly, G. Breton, and P. Bagshaw, “TDA: a new trainable trajectory formation system for facial animation,” in Proc. Interspeech, 2006, pp. 1274–1247.
- (2006) Proc. Interspeech , pp. 1274-1247
- Govokhina, O.¹ Bailly, G.² Breton, G.³ Bagshaw, P.⁴

65
- 84919370414
- Text-to-audio-visual speech synthesis based on parameter generation from HMM
- M. Tamura, S. Kondo, T. Masuko, and T. Kobayashi, “Text-to-audio-visual speech synthesis based on parameter generation from HMM,” in Proc. Eurospeech, 1999, pp. 959–962.
- (1999) Proc. Eurospeech , pp. 959-962
- Tamura, M.¹ Kondo, S.² Masuko, T.³ Kobayashi, T.⁴

66
- 85009089413
- HMM-based text-to-audio-visual speech synthesis
- S. Sako, K. Tokuda, T. Masuko, T. Kobayashi, and T. Kitamura, “HMM-based text-to-audio-visual speech synthesis,” in Proc. ICSLP, 2000, pp. 25–28.
- (2000) Proc. ICSLP , pp. 25-28
- Sako, S.¹ Tokuda, K.² Masuko, T.³ Kobayashi, T.⁴ Kitamura, T.⁵

67
- 77950561562
- Audio-visual large vocabulary continuous speech recognition based on early integration
- (in Japanese)
- T. Ishikawa, Y. Sawada, H. Zen, Y. Nankaku, C. Miyajima, K. Tokuda, and T. Kitamura, “Audio-visual large vocabulary continuous speech recognition based on early integration,” in Proc. FIT, 2002, pp. 203–204, (in Japanese).
- (2002) Proc. FIT , pp. 203-204
- Ishikawa, T.¹ Sawada, Y.² Zen, H.³ Nankaku, Y.⁴ Miyajima, C.⁵ Tokuda, K.⁶ Kitamura, T.⁷

68
- 44949185845
- A trajectory mixture density network for the acoustic-articulatory inversion mapping
- K. Richmond, “A trajectory mixture density network for the acoustic-articulatory inversion mapping,” in Proc. of Interspeech, 2006, pp. 577–580.
- (2006) Proc. of Interspeech , pp. 577-580
- Richmond, K.¹

69
- 68349112115
- Accent type recognition for automatic prosodic labeling
- (in Japanese)
- K. Emoto, H. Zen, K. Tokuda, and T. Kitamura, “Accent type recognition for automatic prosodic labeling,” in Proc. Autumn Meeting of ASJ, 2003, vol. I, pp. 225–226, (in Japanese).
- (2003) Proc. Autumn Meeting of ASJ , vol.I , pp. 225-226
- Emoto, K.¹ Zen, H.² Tokuda, K.³ Kitamura, T.⁴

70
- 44949211222
- A multi-space distribution (MSD) approach to speech recognition of tonal languages
- H.-L. Wang, Y. Qian, F.K. Soong, J.-L. Zhou, and J.-Q. Han, “A multi-space distribution (MSD) approach to speech recognition of tonal languages,” in Proc. of Interspeech, 2006, pp. 125–128.
- (2006) Proc. of Interspeech , pp. 125-128
- Wang, H.-L.¹ Qian, Y.² Soong, F.K.³ Zhou, J.-L.⁴ Han, J.-Q.⁵

71
- 0141702206
- Improving the performance of HMM-based very low bitrate speech coding
- T. Hoshiya, S. Sako, H. Zen, K. Tokuda, T. Masuko, T. Kobayashi, and T. Kitamura, “Improving the performance of HMM-based very low bitrate speech coding,” in Proc. ICASSP, 2003, vol. 1, pp. 800–803.
- (2003) Proc. ICASSP , vol.1 , pp. 800-803
- Hoshiya, T.¹ Sako, S.² Zen, H.³ Tokuda, K.⁴ Masuko, T.⁵ Kobayashi, T.⁶ Kitamura, T.⁷

72
- 77950562268
- An acoustic model adaptation using HMM-based speech synthesis
- K. Tanaka, S. Kuroiwa, S. Tsuge, and F. Ren, “An acoustic model adaptation using HMM-based speech synthesis,” in Proc. NLPKE, 2003, vol. 1, pp. 368–373.
- (2003) Proc. NLPKE , vol.1 , pp. 368-373
- Tanaka, K.¹ Kuroiwa, S.² Tsuge, S.³ Ren, F.⁴

73
- 77950587589
- An approach for training acoustic models based on the vocabulary of the target speech recognition task
- (in Japanese)
- M. Ishihara, C. Miyajima, N. Kitaoka, K. Itou, and K. Takeda, “An approach for training acoustic models based on the vocabulary of the target speech recognition task,” in Proc. Spring Meeting of ASJ, 2007, pp. 153–154, (in Japanese).
- (2007) Proc. Spring Meeting of ASJ , pp. 153-154
- Ishihara, M.¹ Miyajima, C.² Kitaoka, N.³ Itou, K.⁴ Takeda, K.⁵

74
- 77950587361
- An evaluation method of ASR performance by HMM-based speech synthesis
- (in Japanese)
- R. Terashima, T. Yoshimura, T. Wakita, K. Tokuda, and T. Kitamura, “An evaluation method of ASR performance by HMM-based speech synthesis,” in Proc. Spring Meeting of ASJ, 2003, pp. 159–160, (in Japanese).
- (2003) Proc. Spring Meeting of ASJ , pp. 159-160
- Terashima, R.¹ Yoshimura, T.² Wakita, T.³ Tokuda, K.⁴ Kitamura, T.⁵

75
- 51149114615
- A MSD-HMM approach to pen trajectory modeling for online handwriting recognition
- L. Ma, Y.-J. Wu, P. Liu, and F. Soong, “A MSD-HMM approach to pen trajectory modeling for online handwriting recognition,” in Proc. ICDAR, 2007.
- (2007) Proc. ICDAR
- Ma, L.¹ Wu, Y.-J.² Liu, P.³ Soong, F.⁴

76
- 44449177634
- A hidden semi-Markov model-based speech synthesis system
- H. Zen, K. Tokuda, T. Masuko, T. Kobayashi, and T. Kitamura, “A hidden semi-Markov model-based speech synthesis system,” IEICE Trans. Inf. & Syst., vol. E90-D, no. 5, pp. 825–834, 2007.
- (2007) IEICE Trans. Inf. & Syst , vol.E90-D , Issue.5 , pp. 825-834
- Zen, H.¹ Tokuda, K.² Masuko, T.³ Kobayashi, T.⁴ Kitamura, T.⁵

77
- 67650787485
- A Bayesian approach to HMM-based speech synthesis
- (in Japanese)
- Y. Nankaku, H. Zen, K. Tokuda, T. Kitamura, and T. Masuko, “A Bayesian approach to HMM-based speech synthesis,” in Tech. rep. of IEICE, 2003, vol. 103, pp. 19–24, (in Japanese).
- (2003) Tech. rep. of IEICE , vol.103 , pp. 19-24
- Nankaku, Y.¹ Zen, H.² Tokuda, K.³ Kitamura, T.⁴ Masuko, T.⁵

78
- 33749573927
- Reformulating the HMM as a trajectory model by imposing explicit relationships between static and dynamic feature vector sequences
- H. Zen, K. Tokuda, and T. Kitamura, “Reformulating the HMM as a trajectory model by imposing explicit relationships between static and dynamic feature vector sequences,” Computer Speech & Language, vol. 21, no. 1, pp. 153–173, 2006.
- (2006) Computer Speech & Language , vol.21 , Issue.1 , pp. 153-173
- Zen, H.¹ Tokuda, K.² Kitamura, T.³

79
- 0038042801
- A context clustering technique for average voice models
- J. Yamagishi, M. Tamura, T. Masuko, K. Tokuda, and T. Kobayashi, “A context clustering technique for average voice models,” IEICE Trans. Inf. & Syst., vol. E86-D, no. 3, pp. 534–542, 2003.
- (2003) IEICE Trans. Inf. & Syst , vol.E86-D , Issue.3 , pp. 534-542
- Yamagishi, J.¹ Tamura, M.² Masuko, T.³ Tokuda, K.⁴ Kobayashi, T.⁵

80
- 33745214429
- Model adaptation and adaptive training using ESAT algorithm for HMM-based speech synthesis
- J. Isogai, J. Yamagishi, and T. Kobayashi, “Model adaptation and adaptive training using ESAT algorithm for HMM-based speech synthesis,” in Proc. Interspeech, 2005, pp. 2597–2600.
- (2005) Proc. Interspeech , pp. 2597-2600
- Isogai, J.¹ Yamagishi, J.² Kobayashi, T.³

81
- 34547496746
- Constrained structural maximum a posteriori linear regression for average-voice-based speech synthesis
- Y. Nakano, M. Tachibana, J. Yamagishi, and T. Kobayashi, “Constrained structural maximum a posteriori linear regression for average-voice-based speech synthesis,” in Proc. Interspeech, 2006, pp. 2286–2289.
- (2006) Proc. Interspeech , pp. 2286-2289
- Nakano, Y.¹ Tachibana, M.² Yamagishi, J.³ Kobayashi, T.⁴

82
- 0032673049
- Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: possible role of a repetitive structure in sounds
- H. Kawahara, I. Masuda-Katsuse, and A. de Cheveigné, “Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: possible role of a repetitive structure in sounds,” Speech Communication, vol. 27, pp. 187–207, 1999.
- (1999) Speech Communication , vol.27 , pp. 187-207
- Kawahara, H.¹ Masuda-Katsuse, I.² de Cheveigné, A.³

83
- 34547552746
- The Nitech-NAIST HMM-based speech synthesis system for the Blizzard Challenge 2006
- H. Zen, T. Toda, and K. Tokuda, “The Nitech-NAIST HMM-based speech synthesis system for the Blizzard Challenge 2006,” in Blizzard Challenge Workshop, 2006.
- (2006) Blizzard Challenge Workshop
- Zen, H.¹ Toda, T.² Tokuda, K.³

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.