SCOPUS Information Search Platform

IEEE Transactions on Audio, Speech and Language Processing

Volume 17, Issue 1, 2009, Pages 66-83

Analysis of speaker adaptation algorithms for HMM-based speech synthesis and a constrained SMAPLR adaptation algorithm

Authors (5): Yamagishi, Junichi (a); Kobayashi, Takao (b); Nakano, Yuji (b); Ogata, Katsumi (b); Isogai, Juri (b)

a UNIVERSITY OF EDINBURGH (United Kingdom)

b TOKYO INSTITUTE OF TECHNOLOGY (Japan)

Author keywords

Average voice; Hidden Markov model (HMM) based speech synthesis; Speaker adaptation; Speech synthesis; Voice conversion

Indexed keywords

ADAPTATION ALGORITHMS; AVERAGE VOICE; COVARIANCE MATRICES; HIDDEN MARKOV MODEL (HMM)-BASED SPEECH SYNTHESIS; HMM-BASED SPEECH SYNTHESIS; INDEPENDENT MODEL; LINEAR REGRESSION ALGORITHMS; MAP ADAPTATION; MAXIMUM A POSTERIORI; MEAN VECTOR; MODEL CONSTRUCTION; OBJECTIVE EVALUATION; PIECEWISE LINEAR REGRESSION; ROBUST ESTIMATION; SIMULTANEOUS USE; SPEAKER ADAPTATION; SPEECH SYNTHESIS SYSTEM; TRAINING DATA; TRANSFORM FUNCTION; VOICE CONVERSION;

COVARIANCE MATRIX; ESTIMATION; HIDDEN MARKOV MODELS; LINEAR REGRESSION; MATHEMATICAL TRANSFORMATIONS; MOBILE TELECOMMUNICATION SYSTEMS; PIECEWISE LINEAR TECHNIQUES; SPEECH RECOGNITION; SPEECH SYNTHESIS;

ALGORITHMS;

EID: 67650854725 PISSN: 15587916 EISSN: None Source Type: Journal
DOI: 10.1109/TASL.2008.2006647 Document Type: Article

Times cited: 311

References (67)

1
- 84966398940
- Optimising selection of units from speech database for concatenative synthesis
- Sep.
- A. Black and N. Campbell, "Optimising selection of units from speech database for concatenative synthesis," in Proc. EUROSPEECH'95, Sep. 1995, pp. 581-584.
- (1995) Proc. EUROSPEECH'95 , pp. 581-584
- Black, A.¹ Campbell, N.²

2
- 0029765811
- Unit selection in a concatenative speech synthesis system using a large speech database
- May
- A. Hunt and A. Black, "Unit selection in a concatenative speech synthesis system using a large speech database," in Proc. ICASSP'96, May 1996, pp. 373-376.
- (1996) Proc. ICASSP'96 , pp. 373-376
- Hunt, A.¹ Black, A.²

3
- 0032651722
- A hidden Markov-model-based trainable speech synthesizer
- R. Donovan and P. Woodland, "A hidden Markov-model-based trainable speech synthesizer," Comput. Speech Lang., vol.13, no.3, pp. 223-241, 1999.
- (1999) Comput. Speech Lang. , vol.13 , Issue.3 , pp. 223-241
- Donovan, R.¹ Woodland, P.²

4
- 85001632375
- Corpus-based techniques in the AT&T NEXTGEN synthesis system
- Oct.
- A. Syrdal, C. Wightman, A. Conkie, Y. Stylianou, M. Beutnagel, J. Schroeter, V. Storm, K. Lee, and M. Makashay, "Corpus-based techniques in the AT&T NEXTGEN synthesis system," in Proc. ICSLP'00, Oct. 2000, pp. 411-416.
- (2000) Proc. ICSLP'00 , pp. 411-416
- Syrdal, A.¹ Wightman, C.² Conkie, A.³ Stylianou, Y.⁴ Beutnagel, M.⁵ Schroeter, J.⁶ Storm, V.⁷ Lee, K.⁸ Makashay, M.⁹

5
- 85006631929
- Unit selection and emotional speech
- Sep.
- A. Black, "Unit selection and emotional speech," in Proc. Eurospeech'03, Sep. 2003, pp. 1649-1652.
- (2003) Proc. Eurospeech'03 , pp. 1649-1652
- Black, A.¹

6
- 0028996993
- Speech parameter generation from HMM using dynamic features
- May
- K. Tokuda, T. Kobayashi, and S. Imai, "Speech parameter generation from HMM using dynamic features," in Proc. ICASSP'95, May 1995, pp. 660-663.
- (1995) Proc. ICASSP'95 , pp. 660-663
- Tokuda, K.¹ Kobayashi, T.² Imai, S.³

7
- 0038582234
- An algorithm for speech parameter generation from HMM using dynamic features
- Mar.
- K. Tokuda, T. Masuko, T. Kobayashi, and S. Imai, "An algorithm for speech parameter generation from HMM using dynamic features," (in Japanese) J. Acoust. Soc. Jpn., vol.53, no.3, pp. 192-200, Mar. 1997.
- (1997) J. Acoust. Soc. Jpn. (in Japanese) , vol.53 , Issue.3 , pp. 192-200
- Tokuda, K.¹ Masuko, T.² Kobayashi, T.³ Imai, S.⁴

8
- 0029725605
- Speech synthesis using HMMs with dynamic features
- May
- T. Masuko, K. Tokuda, T. Kobayashi, and S. Imai, "Speech synthesis using HMMs with dynamic features," in Proc. ICASSP'96, May 1996, pp. 389-392.
- (1996) Proc. ICASSP'96 , pp. 389-392
- Masuko, T.¹ Tokuda, K.² Kobayashi, T.³ Imai, S.⁴

9
- 0002025578
- HMM-based speech synthesis using dynamic features
- Dec.
- T. Masuko, K. Tokuda, T. Kobayashi, and S. Imai, "HMM-based speech synthesis using dynamic features," (in Japanese) IEICE Trans., vol.J79-D-II, no.12, pp. 2184-2190, Dec. 1996.
- (1996) IEICE Trans. (in Japanese) , vol.J79-D-II , Issue.12 , pp. 2184-2190
- Masuko, T.¹ Tokuda, K.² Kobayashi, T.³ Imai, S.⁴

10
- 85009139544
- Simultaneous modeling of spectrum, pitch and duration in HMM-based speech synthesis
- Sep.
- T. Yoshimura, K. Tokuda, T. Masuko, T. Kobayashi, and T. Kitamura, "Simultaneous modeling of spectrum, pitch and duration in HMM-based speech synthesis," in Proc. Eurospeech'99, Sep. 1999, pp. 2347-2350.
- (1999) Proc. Eurospeech'99 , pp. 2347-2350
- Yoshimura, T.¹ Tokuda, K.² Masuko, T.³ Kobayashi, T.⁴ Kitamura, T.⁵

11
- 7044242284
- Simultaneous modeling of spectrum, pitch and duration in HMM-based speech synthesis
- Nov.
- T. Yoshimura, K. Tokuda, T. Masuko, T. Kobayashi, and T. Kitamura, "Simultaneous modeling of spectrum, pitch and duration in HMM-based speech synthesis," (in Japanese) IEICE Trans., vol.J83-D-II, no.11, pp. 2099-2107, Nov. 2000.
- (2000) IEICE Trans. (in Japanese) , vol.J83-D-II , Issue.11 , pp. 2099-2107
- Yoshimura, T.¹ Tokuda, K.² Masuko, T.³ Kobayashi, T.⁴ Kitamura, T.⁵

12
- 0033708106
- Speech parameter generation algorithms for HMM-based speech synthesis
- K. Tokuda, T. Yoshimura, T. Masuko, T. Kobayashi, and T. Kitamura, "Speech parameter generation algorithms for HMM-based speech synthesis," in Proc. ICASSP'00, Jun. 2000, pp. 1315-1318. (Pubitemid 30956411)
- (2000) ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings , vol.3 , pp. 1315-1318
- Tokuda Keiichi¹ Yoshimura Takayoshi² Masuko Takashi³ Kobayashi Takao⁴ Kitamura Tadashi⁵

13
- 24144497811
- Acoustic modeling of speaking styles and emotional expressions in HMM-based speech synthesis
- Mar.
- J. Yamagishi, K. Onishi, T. Masuko, and T. Kobayashi, "Acoustic modeling of speaking styles and emotional expressions in HMM-based speech synthesis," IEICE Trans. Inf. Syst., vol.E88-D, no.3, pp. 503-509, Mar. 2005.
- (2005) IEICE Trans. Inf. Syst. , vol.E88-D , Issue.3 , pp. 503-509
- Yamagishi, J.¹ Onishi, K.² Masuko, T.³ Kobayashi, T.⁴

14
- 33645768204
- A style adaptation technique for speech synthesis using HSMM and suprasegmental features
- Mar.
- M. Tachibana, J. Yamagishi, T. Masuko, and T. Kobayashi, "A style adaptation technique for speech synthesis using HSMM and suprasegmental features," IEICE Trans. Inf. Syst., vol.E89-D, no.3, pp. 1092-1099, Mar. 2006.
- (2006) IEICE Trans. Inf. Syst. , vol.E89-D , Issue.3 , pp. 1092-1099
- Tachibana, M.¹ Yamagishi, J.² Masuko, T.³ Kobayashi, T.⁴

15
- 29144475179
- Speech synthesis with various emotional expressions and speaking styles by style interpolation and morphing
- Nov.
- M. Tachibana, J. Yamagishi, T. Masuko, and T. Kobayashi, "Speech synthesis with various emotional expressions and speaking styles by style interpolation and morphing," IEICE Trans. Inf. Syst., vol.E88-D, no.11, pp. 2484-2491, Nov. 2005.
- (2005) IEICE Trans. Inf. Syst. , vol.E88-D , Issue.11 , pp. 2484-2491
- Tachibana, M.¹ Yamagishi, J.² Masuko, T.³ Kobayashi, T.⁴

16
- 51449114529
- A style control technique for HMM-based expressive speech synthesis
- Sep.
- T. Nose, J. Yamagishi, and T. Kobayashi, "A style control technique for HMM-based expressive speech synthesis," IEICE Trans. Inf. Syst., vol.E90-D, no.9, pp. 1406-1413, Sep. 2007.
- (2007) IEICE Trans. Inf. Syst. , vol.E90-D , Issue.9 , pp. 1406-1413
- Nose, T.¹ Yamagishi, J.² Kobayashi, T.³

17
- 0030696416
- Voice characteristics conversion for HMM-based speech synthesis system
- Apr.
- T. Masuko, K. Tokuda, T. Kobayashi, and S. Imai, "Voice characteristics conversion for HMM-based speech synthesis system," in Proc. ICASSP'97, Apr. 1997, pp. 1611-1614.
- (1997) Proc. ICASSP'97 , pp. 1611-1614
- Masuko, T.¹ Tokuda, K.² Kobayashi, T.³ Imai, S.⁴

18
- 0034842740
- Adaptation of pitch and spectrum for HMM-based speech synthesis using MLLR
- May
- M. Tamura, T. Masuko, K. Tokuda, and T. Kobayashi, "Adaptation of pitch and spectrum for HMM-based speech synthesis using MLLR," in Proc. ICASSP'01, May 2001, pp. 805-808.
- (2001) Proc. ICASSP'01 , pp. 805-808
- Tamura, M.¹ Masuko, T.² Tokuda, K.³ Kobayashi, T.⁴

19
- 0142007308
- A training method of average voice model for HMM-based speech synthesis
- Aug.
- J. Yamagishi, M. Tamura, T. Masuko, K. Tokuda, and T. Kobayashi, "A training method of average voice model for HMM-based speech synthesis," IEICE Trans. Fundamentals, vol.E86-A, no.8, pp. 1956-1963, Aug. 2003.
- (2003) IEICE Trans. Fundamentals , vol.E86-A , Issue.8 , pp. 1956-1963
- Yamagishi, J.¹ Tamura, M.² Masuko, T.³ Tokuda, K.⁴ Kobayashi, T.⁵

20
- 33847129573
- Average-voice-based speech synthesis using HSMM-based speaker adaptation and adaptive training
- Feb.
- J. Yamagishi and T. Kobayashi, "Average-voice-based speech synthesis using HSMM-based speaker adaptation and adaptive training," IEICE Trans. Inf. Syst., vol.E90-D, no.2, pp. 533-543, Feb. 2007.
- (2007) IEICE Trans. Inf. Syst. , vol.E90-D , Issue.2 , pp. 533-543
- Yamagishi, J.¹ Kobayashi, T.²

21
- 0007985533
- Speaker adaptation for HMM-based speech synthesis system using MLLR
- Nov.
- M. Tamura, T. Masuko, K. Tokuda, and T. Kobayashi, "Speaker adaptation for HMM-based speech synthesis system using MLLR," in Proc. 3rd ESCA/COCOSDA Workshop Speech Synth., Nov. 1998, pp. 273-276.
- (1998) Proc. 3rd ESCA/COCOSDA Workshop Speech Synth. , pp. 273-276
- Tamura, M.¹ Masuko, T.² Tokuda, K.³ Kobayashi, T.⁴

22
- 1842604575
- Voice characteristics conversion for HMM-based speech synthesis system using MAP-VFS
- Dec.
- T. Masuko, K. Tokuda, T. Kobayashi, and S. Imai, "Voice characteristics conversion for HMM-based speech synthesis system using MAP-VFS," (in Japanese) IEICE Trans., vol.J83-D-II, no.12, pp. 2509-2516, Dec. 2000.
- (2000) IEICE Trans. (in Japanese) , vol.J83-D-II , Issue.12 , pp. 2509-2516
- Masuko, T.¹ Tokuda, K.² Kobayashi, T.³ Imai, S.⁴

23
- 0029288633
- Maximum likelihood linear regression for speaker adaptation of continuous density hidden Markov models
- C. Leggetter and P. Woodland, "Maximum likelihood linear regression for speaker adaptation of continuous density hidden Markov models," Comput. Speech Lang., vol.9, no.2, pp. 171-185, 1995.
- (1995) Comput. Speech Lang. , vol.9 , Issue.2 , pp. 171-185
- Leggetter, C.¹ Woodland, P.²

24
- 0028419019
- Maximum a posteriori estimation for multivariate Gaussian mixture observations of Markov chains
- Apr.
- J. Gauvain and C. Lee, "Maximum a posteriori estimation for multivariate Gaussian mixture observations of Markov chains," IEEE Trans. Speech Audio Process., vol.2, no.2, pp. 291-298, Apr. 1994.
- (1994) IEEE Trans. Speech Audio Process , vol.2 , Issue.2 , pp. 291-298
- Gauvain, J.¹ Lee, C.²

25
- 0030124675
- Speaker adaptation based on transfer vector field smoothing using maximum a posteriori probability estimation
- M. Tonomura, T. Kosaka, and S. Matsunaga, "Speaker adaptation based on transfer vector field smoothing using maximum a posteriori probability estimation," Comput. Speech Lang., vol.10, no.2, pp. 117-132, 1995.
- (1995) Comput. Speech Lang. , vol.10 , Issue.2 , pp. 117-132
- Tonomura, M.¹ Kosaka, T.² Matsunaga, S.³

26
- 0031118076
- Vector-field-smoothed bayesian learning for fast and incremental speaker/telephone-channel adaptation
- J. Takahashi and S. Sagayama, "Vector-field-smoothed bayesian learning for fast and incremental speaker/telephone-channel adaptation," Comput. Speech Lang., vol.11, no.2, pp. 127-146, 1997.
- (1997) Comput. Speech Lang. , vol.11 , Issue.2 , pp. 127-146
- Takahashi, J.¹ Sagayama, S.²

27
- 0036522887
- Multi-space probability distribution HMM
- Mar.
- K. Tokuda, T. Masuko, N. Miyazaki, and T. Kobayashi, "Multi-space probability distribution HMM," IEICE Trans. Inf. Syst., vol.E85-D, no.3, pp. 455-464, Mar. 2002.
- (2002) IEICE Trans. Inf. Syst. , vol.E85-D , Issue.3 , pp. 455-464
- Tokuda, K.¹ Masuko, T.² Miyazaki, N.³ Kobayashi, T.⁴

28
- 85008066911
- Speaker adaptation of pitch and spectrum for HMM-based speech synthesis
- Apr.
- M. Tamura, T. Masuko, K. Tokuda, and T. Kobayashi, "Speaker adaptation of pitch and spectrum for HMM-based speech synthesis," (in Japanese) IEICE Trans., vol.J85-D-II, no.4, pp. 545-553, Apr. 2002.
- (2002) IEICE Trans. (in Japanese) , vol.J85-D-II , Issue.4 , pp. 545-553
- Tamura, M.¹ Masuko, T.² Tokuda, K.³ Kobayashi, T.⁴

29
- 44449177634
- A hidden semi-Markov model-based speech synthesis system
- May
- H. Zen, K. Tokuda, T. Masuko, T. Kobayashi, and T. Kitamura, "A hidden semi-Markov model-based speech synthesis system," IEICE Trans. Inf. & Syst., vol.E90-D, no.5, pp. 825-834, May 2007.
- (2007) IEICE Trans. Inf. & Syst. , vol.E90-D , Issue.5 , pp. 825-834
- Zen, H.¹ Tokuda, K.² Masuko, T.³ Kobayashi, T.⁴ Kitamura, T.⁵

30
- 0002585974
- Variable duration models for speech
- J. Ferguson, "Variable duration models for speech," in Proc. Symp. Applicat. Hidden Markov Models to Text and Speech, 1980, pp. 143-179.
- (1980) Proc. Symp. Applicat. Hidden Markov Models to Text and Speech , pp. 143-179
- Ferguson, J.¹

31
- 0022234383
- Explicit modelling of state occupancy in hidden Markov models for automatic speech recognition
- Mar.
- M. Russell and R. Moore, "Explicit modelling of state occupancy in hidden Markov models for automatic speech recognition," in Proc. ICASSP'85, Mar. 1985, pp. 5-8.
- (1985) Proc. ICASSP'85 , pp. 5-8
- Russell, M.¹ Moore, R.²

32
- 0022685753
- Continuously variable duration hidden Markov models for automatic speech recognition
- S. Levinson, "Continuously variable duration hidden Markov models for automatic speech recognition," Comput. Speech Lang., vol.1, no.1, pp. 29-45, 1986. (Pubitemid 17552445)
- (1986) Computer Speech and Language , vol.1 , Issue.1 , pp. 29-45
- Levinson, S.E.¹

33
- 0030362995
- A compact model for speaker-adaptive training
- Oct.
- T. Anastasakos, J. McDonough, R. Schwartz, and J. Makhoul, "A compact model for speaker-adaptive training," in Proc. ICSLP'96, Oct. 1996, pp. 1137-1140.
- (1996) Proc. ICSLP'96 , pp. 1137-1140
- Anastasakos, T.¹ McDonough, J.² Schwartz, R.³ Makhoul, J.⁴

34
- 0038042801
- A context clustering technique for average voice models
- Mar.
- J. Yamagishi, M. Tamura, T. Masuko, K. Tokuda, and T. Kobayashi, "A context clustering technique for average voice models," IEICE Trans. Inf. Syst., vol.E86-D, no.3, pp. 534-542, Mar. 2003.
- (2003) IEICE Trans. Inf. Syst. , vol.E86-D , Issue.3 , pp. 534-542
- Yamagishi, J.¹ Tamura, M.² Masuko, T.³ Tokuda, K.⁴ Kobayashi, T.⁵

35
- 70350485779
- HMM-based emotional speech synthesis using average emotion model
- Dec.
- L. Qin, Z. Ling, Y. Wu, B. Zhang, and R. Wang, "HMM-based emotional speech synthesis using average emotion model," in Proc. ISCSLP'06 (Springer LNAI Book), Dec. 2006, pp. 233-240.
- (2006) Proc. ISCSLP'06 (Springer LNAI Book) , pp. 233-240
- Qin, L.¹ Ling, Z.² Wu, Y.³ Zhang, B.⁴ Wang, R.⁵

36
- 33748468338
- New approach to the polyglot speech generation by means of an HMM-based speaker adaptable synthesizer
- J. Latorre, K. Iwano, and S. Furui, "New approach to the polyglot speech generation by means of an HMM-based speaker adaptable synthesizer," Speech Commun., vol.48, no.10, pp. 1227-1242, 2006.
- (2006) Speech Commun. , vol.48 , Issue.10 , pp. 1227-1242
- Latorre, J.¹ Iwano, K.² Furui, S.³

37
- 0029375590
- Speaker adaptation using constrained reestimation of Gaussian mixtures
- Sep.
- V. Digalakis, D. Rtischev, and L. Neumeyer, "Speaker adaptation using constrained reestimation of Gaussian mixtures," IEEE Trans. Speech Audio Process., vol.3, no.5, pp. 357-366, Sep. 1995.
- (1995) IEEE Trans. Speech Audio Process , vol.3 , Issue.5 , pp. 357-366
- Digalakis, V.¹ Rtischev, D.² Neumeyer, L.³

38
- 0032050110
- Maximum likelihood linear transformations for HMM-based speech recognition
- M. Gales, "Maximum likelihood linear transformations for HMM-based speech recognition," Comput. Speech Lang., vol.12, no.2, pp. 75-98, 1998.
- (1998) Comput. Speech Lang. , vol.12 , Issue.2 , pp. 75-98
- Gales, M.¹

39
- 0035279111
- A structural Bayes approach to speaker adaptation
- Mar.
- K. Shinoda and C. Lee, "A structural Bayes approach to speaker adaptation," IEEE Trans. Speech Audio Process., vol.9, pp. 276-287, Mar. 2001.
- (2001) IEEE Trans. Speech Audio Process , vol.9 , pp. 276-287
- Shinoda, K.¹ Lee, C.²

40
- 0036461005
- Structural maximum a posteriori linear regression for fast HMM adaptation
- O. Siohan, T. Myrvoll, and C. Lee, "Structural maximum a posteriori linear regression for fast HMM adaptation," Comput. Speech Lang., vol.16, no.3, pp. 5-24, 2002.
- (2002) Comput. Speech Lang. , vol.16 , Issue.3 , pp. 5-24
- Siohan, O.¹ Myrvoll, T.² Lee, C.³

41
- 0030189744
- Speaker adaptation using combined transformation and Bayesian methods
- Jul.
- V. Digalakis and L. Neumeyer, "Speaker adaptation using combined transformation and Bayesian methods," IEEE Trans. Speech Audio Process., vol.4, no.3, pp. 294-300, Jul. 1996.
- (1996) IEEE Trans. Speech Audio Process , vol.4 , Issue.3 , pp. 294-300
- Digalakis, V.¹ Neumeyer, L.²

42
- 33745214429
- Model adaptation and adaptive training using ESAT algorithm for HMM-based speech synthesis
- Sep.
- J. Isogai, J. Yamagishi, and T. Kobayashi, "Model adaptation and adaptive training using ESAT algorithm for HMM-based speech synthesis," in Proc. Eurospeech'05, Sep. 2005, pp. 2597-2600.
- (2005) Proc. Eurospeech'05 , pp. 2597-2600
- Isogai, J.¹ Yamagishi, J.² Kobayashi, T.³

43
- 33947669452
- HSMM-based model adaptation algorithms for average-voice-based speech synthesis
- May
- J. Yamagishi, K. Ogata, Y. Nakano, J. Isogai, and T. Kobayashi, "HSMM-based model adaptation algorithms for average-voice-based speech synthesis," in Proc. ICASSP'06, May 2006, pp. 77-80.
- (2006) Proc. ICASSP'06 , pp. 77-80
- Yamagishi, J.¹ Ogata, K.² Nakano, Y.³ Isogai, J.⁴ Kobayashi, T.⁵

44
- 34547496746
- Constrained structural maximum a posteriori linear regression for average-voice-based speech synthesis
- Sep.
- Y. Nakano, M. Tachibana, J. Yamagishi, and T. Kobayashi, "Constrained structural maximum a posteriori linear regression for average-voice-based speech synthesis," in Proc. ICSLP'06, Sep. 2006, pp. 2286-2289.
- (2006) Proc. ICSLP'06 , pp. 2286-2289
- Nakano, Y.¹ Tachibana, M.² Yamagishi, J.³ Kobayashi, T.⁴

45
- 34547525896
- Acoustic model training based on linear transformation and MAP modification for HSMM-based speech synthesis
- Sep.
- K. Ogata, M. Tachibana, J. Yamagishi, and T. Kobayashi, "Acoustic model training based on linear transformation and MAP modification for HSMM-based speech synthesis," in Proc. ICSLP'06, Sep. 2006, pp. 1328-1331.
- (2006) Proc. ICSLP'06 , pp. 1328-1331
- Ogata, K.¹ Tachibana, M.² Yamagishi, J.³ Kobayashi, T.⁴

46
- 34547529978
- Model adaptation approach to speech synthesis with diverse voices and styles
- Apr.
- J. Yamagishi, T. Kobayashi, M. Tachibana, K. Ogata, and Y. Nakano, "Model adaptation approach to speech synthesis with diverse voices and styles," in Proc. ICASSP'07, Apr. 2007, pp. 1233-1236.
- (2007) Proc. ICASSP'07 , pp. 1233-1236
- Yamagishi, J.¹ Kobayashi, T.² Tachibana, M.³ Ogata, K.⁴ Nakano, Y.⁵

47
- 0020596154
- Cepstral analysis synthesis on the Mel frequency scale
- Apr.
- S. Imai, "Cepstral analysis synthesis on the Mel frequency scale," in Proc. ICASSP'83, Apr. 1983, pp. 93-96.
- (1983) Proc. ICASSP'83 , pp. 93-96
- Imai, S.¹

48
- 0001310760
- Spectral estimation of speech based on Mel-cepstral representation
- Aug.
- K. Tokuda, T. Kobayashi, T. Fukada, H. Saito, and S. Imai, "Spectral estimation of speech based on Mel-cepstral representation," (in Japanese) IEICE Trans. Fundamentals, vol.J74-A, no.8, pp. 1240-1248, Aug. 1991.
- (1991) IEICE Trans. Fundamentals (in Japanese) , vol.J74-A , Issue.8 , pp. 1240-1248
- Tokuda, K.¹ Kobayashi, T.² Fukada, T.³ Saito, H.⁴ Imai, S.⁵

49
- 70350480130
- [Online]. Available
- Speech Signal Processing Toolkit (SPTK) Version 3.1. 2007 [Online]. Available: http://www.sp-tk.sourceforge.net/
- (2007) Speech Signal Processing Toolkit (SPTK) Version 3.1

50
- 11144317887
- Robust F0 estimation of speech signal using harmonicity measure based on instantaneous frequency
- Dec.
- D. Arifianto, T. Tanaka, T. Masuko, and T. Kobayashi, "Robust F0 estimation of speech signal using harmonicity measure based on instantaneous frequency," IEICE Trans. Inf. Syst., vol.E87-D, no.12, pp. 2812-2820, Dec. 2004.
- (2004) IEICE Trans. Inf. Syst. , vol.E87-D , Issue.12 , pp. 2812-2820
- Arifianto, D.¹ Tanaka, T.² Masuko, T.³ Kobayashi, T.⁴

51
- 38549096029
- A speech parameter generation algorithm considering global variance for HMM-based speech synthesis
- May
- T. Toda and K. Tokuda, "A speech parameter generation algorithm considering global variance for HMM-based speech synthesis," IEICE Trans. Inf. Syst., vol.E90-D, no.5, pp. 816-824, May 2007.
- (2007) IEICE Trans. Inf. Syst. , vol.E90-D , Issue.5 , pp. 816-824
- Toda, T.¹ Tokuda, K.²

52
- 85133674021
- Improved average-voice-based speech synthesis using gender-mixed modeling and a parameter generation algorithm considering GV
- Aug.
- J. Yamagishi, T. Kobayashi, S. Renals, S. King, H. Zen, T. Toda, and K. Tokuda, "Improved average-voice-based speech synthesis using gender-mixed modeling and a parameter generation algorithm considering GV," in Proc. 6th ISCA Workshop Speech Synth., Aug. 2007, pp. 125-130.
- (2007) Proc. 6th ISCA Workshop Speech Synth. , pp. 125-130
- Yamagishi, J.¹ Kobayashi, T.² Renals, S.³ King, S.⁴ Zen, H.⁵ Toda, T.⁶ Tokuda, K.⁷

53
- 70350480131
- A speaker-adaptive HMM-based speech synthesis for the Blizzard Challenge 2007
- submitted for publication
- J. Yamagishi, T. Nose, H. Zen, T. Toda, K. Tokuda, S. King, and S. Renals, "A speaker-adaptive HMM-based speech synthesis for the Blizzard Challenge 2007," IEEE Audio, Speech, Lang. Process., 2008, submitted for publication.
- (2008) IEEE Audio, Speech, Lang. Process
- Yamagishi, J.¹ Nose, T.² Zen, H.³ Toda, T.⁴ Tokuda, K.⁵ King, S.⁶ Renals, S.⁷

54
- 0030263447
- Mean and variance adaptation within the MLLR framework
- M. Gales and P. Woodland, "Mean and variance adaptation within the MLLR framework," Comput. Speech Lang., vol.10, no.4, pp. 249-264, 1996.
- (1996) Comput. Speech Lang. , vol.10 , Issue.4 , pp. 249-264
- Gales, M.¹ Woodland, P.²

55
- 0002629270
- Maximum likelihood from incomplete data via the EM algorithm
- Series B
- A. Dempster, N. Laird, and D. Rubin, "Maximum likelihood from incomplete data via the EM algorithm," J. R. Statist. Soc., Series B, vol.39, no.1, pp. 1-38, 1977.
- (1977) J. R. Statist. Soc. , vol.39 , Issue.1 , pp. 1-38
- Dempster, A.¹ Laird, N.² Rubin, D.³

56
- 68249104241
- The Nitech-NAIST HMM-based speech synthesis system for the Blizzard Challenge 2006
- Jun.
- H. Zen, T. Toda, and K. Tokuda, "The Nitech-NAIST HMM-based speech synthesis system for the Blizzard Challenge 2006," IEICE Trans. Inf. Syst., vol.E91-D, no.6, pp. 1764-1773, Jun. 2008.
- (2008) IEICE Trans. Inf. Syst. , vol.E91-D , Issue.6 , pp. 1764-1773
- Zen, H.¹ Toda, T.² Tokuda, K.³

57
- 0032638856
- Semi-tied covariance matrices for hidden Markov models
- Mar.
- M. Gales, "Semi-tied covariance matrices for hidden Markov models," IEEE Trans. Speech Audio Process., vol.7, no.2, pp. 272-281, Mar. 1999.
- (1999) IEEE Trans. Speech Audio Process , vol.7 , Issue.2 , pp. 272-281
- Gales, M.¹

58
- 84892187452
- Maximum likelihood modeling with Gaussian distributions for classification
- May
- R. Gopinath, "Maximum likelihood modeling with Gaussian distributions for classification," in Proc. ICASSP'98, May 1998, pp. 661-664.
- (1998) Proc. ICASSP'98 , pp. 661-664
- Gopinath, R.¹

59
- 0029769867
- Signal bias removal by maximum likelihood estimation for robust telephone speech recognition
- Jan.
- M. Rahim and B. Juang, "Signal bias removal by maximum likelihood estimation for robust telephone speech recognition," IEEE Trans. Speech Audio Process., vol.4, no.1, pp. 19-30, Jan. 1996.
- (1996) IEEE Trans. Speech Audio Process , vol.4 , Issue.1 , pp. 19-30
- Rahim, M.¹ Juang, B.²

60
- 0034853390
- Multiple-cluster adaptive training schemes
- May
- M. Gales, "Multiple-cluster adaptive training schemes," in Proc. ICASSP'01, May 2001, pp. 361-364.
- (2001) Proc. ICASSP'01 , pp. 361-364
- Gales, M.¹

61
- 0004043195
- Norwell, MA: Kluwer
- A. Gupta and T. Varga, Elliptically Contoured Models in Statistics. Norwell, MA: Kluwer, 1993.
- (1993) Elliptically Contoured Models in Statistics
- Gupta, A.¹ Varga, T.²

62
- 0030643678
- Improved Bayesian learning of hidden Markov models for speaker adaptation
- Apr.
- J. Chien, H. Wang, and C. Lee, "Improved Bayesian learning of hidden Markov models for speaker adaptation," in Proc. ICASSP'97, Apr. 1997, pp. 1027-1030.
- (1997) Proc. ICASSP'97 , pp. 1027-1030
- Chien, J.¹ Wang, H.² Lee, C.³

63
- 85016140477
- An adaptive algorithm for Mel-cepstral analysis of speech
- Mar.
- T. Fukada, K. Tokuda, T. Kobayashi, and S. Imai, "An adaptive algorithm for Mel-cepstral analysis of speech," in Proc. ICASSP'92, Mar. 1992, pp. 137-140.
- (1992) Proc. ICASSP'92 , pp. 137-140
- Fukada, T.¹ Tokuda, K.² Kobayashi, T.³ Imai, S.⁴

64
- 0033906251
- MDL-based context-dependent subword modeling for speech recognition
- Mar.
- K. Shinoda and T. Watanabe, "MDL-based context-dependent subword modeling for speech recognition," J. Acoust. Soc. Japan (E), vol.21, pp. 79-86, Mar. 2000.
- (2000) J. Acoust. Soc. Japan (E) , vol.21 , pp. 79-86
- Shinoda, K.¹ Watanabe, T.²

65
- 6644226630
- A large-scale Japanese speech database
- Nov.
- Y. Sagisaka, K. Takeda, M. Abe, S. Katagiri, T. Umeda, and H. Kuwabara, "A large-scale Japanese speech database," in Proc. ICSLP'90, Nov. 1990, pp. 1089-1092.
- (1990) Proc. ICSLP'90 , pp. 1089-1092
- Sagisaka, Y.¹ Takeda, K.² Abe, M.³ Katagiri, S.⁴ Umeda, T.⁵ Kuwabara, H.⁶

66
- 79952258981
- [Online]. Available:
- K. Tokuda, H. Zen, J. Yamagishi, T. Masuko, S. Sako, A. Black, and T. Nose, The HMM-Based Speech Synthesis System (HTS). [Online]. Available: http://www.hts.sp.nitech.ac.jp/
- The HMM-Based Speech Synthesis System (HTS)
- Tokuda, K.¹ Zen, H.² Yamagishi, J.³ Masuko, T.⁴ Sako, S.⁵ Black, A.⁶ Nose, T.⁷

67
- 85133720638
- The HMM-based speech synthesis system (HTS) version 2.0
- Aug.
- H. Zen, T. Nose, J. Yamagishi, S. Sako, and K. Tokuda, "The HMM-based speech synthesis system (HTS) version 2.0," in Proc. 6th ISCA Workshop Speech Synth., Aug. 2007, pp. 294-299.
- (2007) Proc. 6th ISCA Workshop Speech Synth. , pp. 294-299
- Zen, H.¹ Nose, T.² Yamagishi, J.³ Sako, S.⁴ Tokuda, K.⁵

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.