Volume 17, Issue 1, 2009, Pages 66-83

Analysis of speaker adaptation algorithms for HMM-based speech synthesis and a constrained SMAPLR adaptation algorithm

Author keywords

Average voice; Hidden Markov model (HMM) based speech synthesis; Speaker adaptation; Speech synthesis; Voice conversion

Indexed keywords

ADAPTATION ALGORITHMS; AVERAGE VOICE; COVARIANCE MATRICES; HIDDEN MARKOV MODEL (HMM)-BASED SPEECH SYNTHESIS; HMM-BASED SPEECH SYNTHESIS; INDEPENDENT MODEL; LINEAR REGRESSION ALGORITHMS; MAP ADAPTATION; MAXIMUM A POSTERIORI; MEAN VECTOR; MODEL CONSTRUCTION; OBJECTIVE EVALUATION; PIECEWISE LINEAR REGRESSION; ROBUST ESTIMATION; SIMULTANEOUS USE; SPEAKER ADAPTATION; SPEECH SYNTHESIS SYSTEM; TRAINING DATA; TRANSFORM FUNCTION; VOICE CONVERSION;

EID: 67650854725     PISSN: 15587916     EISSN: None     Source Type: Journal    
DOI: 10.1109/TASL.2008.2006647     Document Type: Article
Times cited : (311)

References (67)
  • 1
    • 84966398940
    • Optimising selection of units from speech database for concatenative synthesis
    • Sep.
    • A. Black and N. Campbell, "Optimising selection of units from speech database for concatenative synthesis," in Proc. EUROSPEECH'95, Sep. 1995, pp. 581-584.
    • (1995) Proc. EUROSPEECH'95 , pp. 581-584
    • Black, A.1    Campbell, N.2
  • 2
    • 0029765811
    • Unit selection in a concatenative speech synthesis system using a large speech database
    • May
    • A. Hunt and A. Black, "Unit selection in a concatenative speech synthesis system using a large speech database," in Proc. ICASSP'96, May 1996, pp. 373-376.
    • (1996) Proc. ICASSP'96 , pp. 373-376
    • Hunt, A.1    Black, A.2
  • 3
    • 0032651722
    • A hidden Markov-model-based trainable speech synthesizer
    • R. Donovan and P. Woodland, "A hidden Markov-model-based trainable speech synthesizer," Comput. Speech Lang., vol.13, no.3, pp. 223-241, 1999.
    • (1999) Comput. Speech Lang. , vol.13 , Issue.3 , pp. 223-241
    • Donovan, R.1    Woodland, P.2
  • 5
    • 85006631929
    • Unit selection and emotional speech
    • Sep.
    • A. Black, "Unit selection and emotional speech," in Proc. Eurospeech'03, Sep. 2003, pp. 1649-1652.
    • (2003) Proc. Eurospeech'03 , pp. 1649-1652
    • Black, A.1
  • 6
    • 0028996993
    • Speech parameter generation from HMM using dynamic features
    • May
    • K. Tokuda, T. Kobayashi, and S. Imai, "Speech parameter generation from HMM using dynamic features," in Proc. ICASSP'95, May 1995, pp. 660-663.
    • (1995) Proc. ICASSP'95 , pp. 660-663
    • Tokuda, K.1    Kobayashi, T.2    Imai, S.3
  • 7
    • 0038582234
    • An algorithm for speech parameter generation from HMM using dynamic features
    • Mar.
    • K. Tokuda, T. Masuko, T. Kobayashi, and S. Imai, "An algorithm for speech parameter generation from HMM using dynamic features," (in Japanese) J. Acoust. Soc. Jpn., vol.53, no.3, pp. 192-200, Mar. 1997.
    • (1997) J. Acoust. Soc. Jpn. (in Japanese) , vol.53 , Issue.3 , pp. 192-200
    • Tokuda, K.1    Masuko, T.2    Kobayashi, T.3    Imai, S.4
  • 8
    • 0029725605
    • Speech synthesis using HMMs with dynamic features
    • May
    • T. Masuko, K. Tokuda, T. Kobayashi, and S. Imai, "Speech synthesis using HMMs with dynamic features," in Proc. ICASSP'96, May 1996, pp. 389-392.
    • (1996) Proc. ICASSP'96 , pp. 389-392
    • Masuko, T.1    Tokuda, K.2    Kobayashi, T.3    Imai, S.4
  • 9
    • 0002025578
    • HMM-based speech synthesis using dynamic features
    • Dec.
    • T. Masuko, K. Tokuda, T. Kobayashi, and S. Imai, "HMM-based speech synthesis using dynamic features," (in Japanese) IEICE Trans., vol.J79-D-II, no.12, pp. 2184-2190, Dec. 1996.
    • (1996) IEICE Trans. (in Japanese) , vol.J79-D-II , Issue.12 , pp. 2184-2190
    • Masuko, T.1    Tokuda, K.2    Kobayashi, T.3    Imai, S.4
  • 10
    • 85009139544
    • Simultaneous modeling of spectrum, pitch and duration in HMM-based speech synthesis
    • Sep.
    • T. Yoshimura, K. Tokuda, T. Masuko, T. Kobayashi, and T. Kitamura, "Simultaneous modeling of spectrum, pitch and duration in HMM-based speech synthesis," in Proc. Eurospeech'99, Sep. 1999, pp. 2347-2350.
    • (1999) Proc. Eurospeech'99 , pp. 2347-2350
    • Yoshimura, T.1    Tokuda, K.2    Masuko, T.3    Kobayashi, T.4    Kitamura, T.5
  • 11
    • 7044242284
    • Simultaneous modeling of spectrum, pitch and duration in HMM-based speech synthesis
    • Nov.
    • T. Yoshimura, K. Tokuda, T. Masuko, T. Kobayashi, and T. Kitamura, "Simultaneous modeling of spectrum, pitch and duration in HMM-based speech synthesis," (in Japanese) IEICE Trans., vol.J83-D-II, no.11, pp. 2099-2107, Nov. 2000.
    • (2000) IEICE Trans. (in Japanese) , vol.J83-D-II , Issue.11 , pp. 2099-2107
    • Yoshimura, T.1    Tokuda, K.2    Masuko, T.3    Kobayashi, T.4    Kitamura, T.5
  • 13
    • 24144497811
    • Acoustic modeling of speaking styles and emotional expressions in HMM-based speech synthesis
    • Mar.
    • J. Yamagishi, K. Onishi, T. Masuko, and T. Kobayashi, "Acoustic modeling of speaking styles and emotional expressions in HMM-based speech synthesis," IEICE Trans. Inf. Syst., vol.E88-D, no.3, pp. 503-509, Mar. 2005.
    • (2005) IEICE Trans. Inf. Syst. , vol.E88-D , Issue.3 , pp. 503-509
    • Yamagishi, J.1    Onishi, K.2    Masuko, T.3    Kobayashi, T.4
  • 14
    • 33645768204
    • A style adaptation technique for speech synthesis using HSMM and suprasegmental features
    • Mar.
    • M. Tachibana, J. Yamagishi, T. Masuko, and T. Kobayashi, "A style adaptation technique for speech synthesis using HSMM and suprasegmental features," IEICE Trans. Inf. Syst., vol.E89-D, no.3, pp. 1092-1099, Mar. 2006.
    • (2006) IEICE Trans. Inf. Syst. , vol.E89-D , Issue.3 , pp. 1092-1099
    • Tachibana, M.1    Yamagishi, J.2    Masuko, T.3    Kobayashi, T.4
  • 15
    • 29144475179
    • Speech synthesis with various emotional expressions and speaking styles by style interpolation and morphing
    • Nov.
    • M. Tachibana, J. Yamagishi, T. Masuko, and T. Kobayashi, "Speech synthesis with various emotional expressions and speaking styles by style interpolation and morphing," IEICE Trans. Inf. Syst., vol.E88-D, no.11, pp. 2484-2491, Nov. 2005.
    • (2005) IEICE Trans. Inf. Syst. , vol.E88-D , Issue.11 , pp. 2484-2491
    • Tachibana, M.1    Yamagishi, J.2    Masuko, T.3    Kobayashi, T.4
  • 16
    • 51449114529
    • A style control technique for HMM-based expressive speech synthesis
    • Sep.
    • T. Nose, J. Yamagishi, and T. Kobayashi, "A style control technique for HMM-based expressive speech synthesis," IEICE Trans. Inf. Syst., vol.E90-D, no.9, pp. 1406-1413, Sep. 2007.
    • (2007) IEICE Trans. Inf. Syst. , vol.E90-D , Issue.9 , pp. 1406-1413
    • Nose, T.1    Yamagishi, J.2    Kobayashi, T.3
  • 17
    • 0030696416
    • Voice characteristics conversion for HMM-based speech synthesis system
    • Apr.
    • T. Masuko, K. Tokuda, T. Kobayashi, and S. Imai, "Voice characteristics conversion for HMM-based speech synthesis system," in Proc. ICASSP'97, Apr. 1997, pp. 1611-1614.
    • (1997) Proc. ICASSP'97 , pp. 1611-1614
    • Masuko, T.1    Tokuda, K.2    Kobayashi, T.3    Imai, S.4
  • 18
    • 0034842740
    • Adaptation of pitch and spectrum for HMM-based speech synthesis using MLLR
    • May
    • M. Tamura, T. Masuko, K. Tokuda, and T. Kobayashi, "Adaptation of pitch and spectrum for HMM-based speech synthesis using MLLR," in Proc. ICASSP'01, May 2001, pp. 805-808.
    • (2001) Proc. ICASSP'01 , pp. 805-808
    • Tamura, M.1    Masuko, T.2    Tokuda, K.3    Kobayashi, T.4
  • 19
    • 0142007308
    • A training method of average voice model for HMM-based speech synthesis
    • Aug.
    • J. Yamagishi, M. Tamura, T. Masuko, K. Tokuda, and T. Kobayashi, "A training method of average voice model for HMM-based speech synthesis," IEICE Trans. Fundamentals, vol.E86-A, no.8, pp. 1956-1963, Aug. 2003.
    • (2003) IEICE Trans. Fundamentals , vol.E86-A , Issue.8 , pp. 1956-1963
    • Yamagishi, J.1    Tamura, M.2    Masuko, T.3    Tokuda, K.4    Kobayashi, T.5
  • 20
    • 33847129573
    • Average-voice-based speech synthesis using HSMM-based speaker adaptation and adaptive training
    • Feb.
    • J. Yamagishi and T. Kobayashi, "Average-voice-based speech synthesis using HSMM-based speaker adaptation and adaptive training," IEICE Trans. Inf. Syst., vol.E90-D, no.2, pp. 533-543, Feb. 2007.
    • (2007) IEICE Trans. Inf. Syst. , vol.E90-D , Issue.2 , pp. 533-543
    • Yamagishi, J.1    Kobayashi, T.2
  • 22
    • 1842604575
    • Voice characteristics conversion for HMM-based speech synthesis system using MAP-VFS
    • Dec.
    • T. Masuko, K. Tokuda, T. Kobayashi, and S. Imai, "Voice characteristics conversion for HMM-based speech synthesis system using MAP-VFS," (in Japanese) IEICE Trans., vol.J83-D-II, no.12, pp. 2509-2516, Dec. 2000.
    • (2000) IEICE Trans. (in Japanese) , vol.J83-D-II , Issue.12 , pp. 2509-2516
    • Masuko, T.1    Tokuda, K.2    Kobayashi, T.3    Imai, S.4
  • 23
    • 0029288633
    • Maximum likelihood linear regression for speaker adaptation of continuous density hidden Markov models
    • C. Leggetter and P. Woodland, "Maximum likelihood linear regression for speaker adaptation of continuous density hidden Markov models," Comput. Speech Lang., vol.9, no.2, pp. 171-185, 1995.
    • (1995) Comput. Speech Lang. , vol.9 , Issue.2 , pp. 171-185
    • Leggetter, C.1    Woodland, P.2
  • 24
    • 0028419019
    • Maximum a posteriori estimation for multivariate Gaussian mixture observations of Markov chains
    • Apr.
    • J. Gauvain and C. Lee, "Maximum a posteriori estimation for multivariate Gaussian mixture observations of Markov chains," IEEE Trans. Speech Audio Process., vol.2, no.2, pp. 291-298, Apr. 1994.
    • (1994) IEEE Trans. Speech Audio Process , vol.2 , Issue.2 , pp. 291-298
    • Gauvain, J.1    Lee, C.2
  • 25
    • 0030124675
    • Speaker adaptation based on transfer vector field smoothing using maximum a posteriori probability estimation
    • M. Tonomura, T. Kosaka, and S. Matsunaga, "Speaker adaptation based on transfer vector field smoothing using maximum a posteriori probability estimation," Comput. Speech Lang., vol.10, no.2, pp. 117-132, 1995.
    • (1995) Comput. Speech Lang. , vol.10 , Issue.2 , pp. 117-132
    • Tonomura, M.1    Kosaka, T.2    Matsunaga, S.3
  • 26
    • 0031118076
    • Vector-field-smoothed bayesian learning for fast and incremental speaker/telephone-channel adaptation
    • J. Takahashi and S. Sagayama, "Vector-field-smoothed bayesian learning for fast and incremental speaker/telephone-channel adaptation," Comput. Speech Lang., vol.11, no.2, pp. 127-146, 1997.
    • (1997) Comput. Speech Lang. , vol.11 , Issue.2 , pp. 127-146
    • Takahashi, J.1    Sagayama, S.2
  • 28
    • 85008066911
    • Speaker adaptation of pitch and spectrum for HMM-based speech synthesis
    • Apr.
    • M. Tamura, T. Masuko, K. Tokuda, and T. Kobayashi, "Speaker adaptation of pitch and spectrum for HMM-based speech synthesis," (in Japanese) IEICE Trans., vol.J85-D-II, no.4, pp. 545-553, Apr. 2002.
    • (2002) IEICE Trans. (in Japanese) , vol.J85-D-II , Issue.4 , pp. 545-553
    • Tamura, M.1    Masuko, T.2    Tokuda, K.3    Kobayashi, T.4
  • 29
  • 31
    • 0022234383
    • Explicit modelling of state occupancy in hidden Markov models for automatic speech recognition
    • Mar.
    • M. Russell and R. Moore, "Explicit modelling of state occupancy in hidden Markov models for automatic speech recognition," in Proc. ICASSP'85, Mar. 1985, pp. 5-8.
    • (1985) Proc. ICASSP'85 , pp. 5-8
    • Russell, M.1    Moore, R.2
  • 32
    • 0022685753
    • Continuously variable duration hidden Markov models for automatic speech recognition
    • S. Levinson, "Continuously variable duration hidden Markov models for automatic speech recognition," Comput. Speech Lang., vol.1, no.1, pp. 29-45, 1986.
    • (1986) Computer Speech and Language , vol.1 , Issue.1 , pp. 29-45
    • Levinson, S.E.1
  • 36
    • 33748468338
    • New approach to the polyglot speech generation by means of an HMM-based speaker adaptable synthesizer
    • J. Latorre, K. Iwano, and S. Furui, "New approach to the polyglot speech generation by means of an HMM-based speaker adaptable synthesizer," Speech Commun., vol.48, no.10, pp. 1227-1242, 2006.
    • (2006) Speech Commun. , vol.48 , Issue.10 , pp. 1227-1242
    • Latorre, J.1    Iwano, K.2    Furui, S.3
  • 37
    • 0029375590
    • Speaker adaptation using constrained reestimation of Gaussian mixtures
    • Sep.
    • V. Digalakis, D. Rtischev, and L. Neumeyer, "Speaker adaptation using constrained reestimation of Gaussian mixtures," IEEE Trans. Speech Audio Process., vol.3, no.5, pp. 357-366, Sep. 1995.
    • (1995) IEEE Trans. Speech Audio Process , vol.3 , Issue.5 , pp. 357-366
    • Digalakis, V.1    Rtischev, D.2    Neumeyer, L.3
  • 38
    • 0032050110
    • Maximum likelihood linear transformations for HMM-based speech recognition
    • M. Gales, "Maximum likelihood linear transformations for HMM-based speech recognition," Comput. Speech Lang., vol.12, no.2, pp. 75-98, 1998.
    • (1998) Comput. Speech Lang. , vol.12 , Issue.2 , pp. 75-98
    • Gales, M.1
  • 39
    • 0035279111
    • A structural Bayes approach to speaker adaptation
    • Mar.
    • K. Shinoda and C. Lee, "A structural Bayes approach to speaker adaptation," IEEE Trans. Speech Audio Process., vol.9, pp. 276-287, Mar. 2001.
    • (2001) IEEE Trans. Speech Audio Process , vol.9 , pp. 276-287
    • Shinoda, K.1    Lee, C.2
  • 40
    • 0036461005
    • Structural maximum a posteriori linear regression for fast HMM adaptation
    • O. Siohan, T. Myrvoll, and C. Lee, "Structural maximum a posteriori linear regression for fast HMM adaptation," Comput. Speech Lang., vol.16, no.3, pp. 5-24, 2002.
    • (2002) Comput. Speech Lang. , vol.16 , Issue.3 , pp. 5-24
    • Siohan, O.1    Myrvoll, T.2    Lee, C.3
  • 41
    • 0030189744
    • Speaker adaptation using combined transformation and Bayesian methods
    • Jul.
    • V. Digalakis and L. Neumeyer, "Speaker adaptation using combined transformation and Bayesian methods," IEEE Trans. Speech Audio Process., vol.4, no.3, pp. 294-300, Jul. 1996.
    • (1996) IEEE Trans. Speech Audio Process , vol.4 , Issue.3 , pp. 294-300
    • Digalakis, V.1    Neumeyer, L.2
  • 42
    • 33745214429
    • Model adaptation and adaptive training using ESAT algorithm for HMM-based speech synthesis
    • Sep.
    • J. Isogai, J. Yamagishi, and T. Kobayashi, "Model adaptation and adaptive training using ESAT algorithm for HMM-based speech synthesis," in Proc. Eurospeech'05, Sep. 2005, pp. 2597-2600.
    • (2005) Proc. Eurospeech'05 , pp. 2597-2600
    • Isogai, J.1    Yamagishi, J.2    Kobayashi, T.3
  • 43
    • 33947669452
    • HSMM-based model adaptation algorithms for average-voice-based speech synthesis
    • May
    • J. Yamagishi, K. Ogata, Y. Nakano, J. Isogai, and T. Kobayashi, "HSMM-based model adaptation algorithms for average-voice-based speech synthesis," in Proc. ICASSP'06, May 2006, pp. 77-80.
    • (2006) Proc. ICASSP'06 , pp. 77-80
    • Yamagishi, J.1    Ogata, K.2    Nakano, Y.3    Isogai, J.4    Kobayashi, T.5
  • 44
    • 34547496746
    • Constrained structural maximum a posteriori linear regression for average-voice-based speech synthesis
    • Sep.
    • Y. Nakano, M. Tachibana, J. Yamagishi, and T. Kobayashi, "Constrained structural maximum a posteriori linear regression for average-voice-based speech synthesis," in Proc. ICSLP'06, Sep. 2006, pp. 2286-2289.
    • (2006) Proc. ICSLP'06 , pp. 2286-2289
    • Nakano, Y.1    Tachibana, M.2    Yamagishi, J.3    Kobayashi, T.4
  • 45
    • 34547525896
    • Acoustic model training based on linear transformation and MAP modification for HSMM-based speech synthesis
    • Sep.
    • K. Ogata, M. Tachibana, J. Yamagishi, and T. Kobayashi, "Acoustic model training based on linear transformation and MAP modification for HSMM-based speech synthesis," in Proc. ICSLP'06, Sep. 2006, pp. 1328-1331.
    • (2006) Proc. ICSLP'06 , pp. 1328-1331
    • Ogata, K.1    Tachibana, M.2    Yamagishi, J.3    Kobayashi, T.4
  • 46
    • 34547529978
    • Model adaptation approach to speech synthesis with diverse voices and styles
    • Apr.
    • J. Yamagishi, T. Kobayashi, M. Tachibana, K. Ogata, and Y. Nakano, "Model adaptation approach to speech synthesis with diverse voices and styles," in Proc. ICASSP'07, Apr. 2007, pp. 1233-1236.
    • (2007) Proc. ICASSP'07 , pp. 1233-1236
    • Yamagishi, J.1    Kobayashi, T.2    Tachibana, M.3    Ogata, K.4    Nakano, Y.5
  • 47
    • 0020596154
    • Cepstral analysis synthesis on the Mel frequency scale
    • Apr.
    • S. Imai, "Cepstral analysis synthesis on the Mel frequency scale," in Proc. ICASSP'83, Apr. 1983, pp. 93-96.
    • (1983) Proc. ICASSP'83 , pp. 93-96
    • Imai, S.1
  • 50
    • 11144317887
    • Robust F0 estimation of speech signal using harmonicity measure based on instantaneous frequency
    • Dec.
    • D. Arifianto, T. Tanaka, T. Masuko, and T. Kobayashi, "Robust F0 estimation of speech signal using harmonicity measure based on instantaneous frequency," IEICE Trans. Inf. Syst., vol.E87-D, no.12, pp. 2812-2820, Dec. 2004.
    • (2004) IEICE Trans. Inf. Syst. , vol.E87-D , Issue.12 , pp. 2812-2820
    • Arifianto, D.1    Tanaka, T.2    Masuko, T.3    Kobayashi, T.4
  • 51
    • 38549096029
    • A speech parameter generation algorithm considering global variance for HMM-based speech synthesis
    • May
    • T. Toda and K. Tokuda, "A speech parameter generation algorithm considering global variance for HMM-based speech synthesis," IEICE Trans. Inf. Syst., vol.E90-D, no.5, pp. 816-824, May 2007.
    • (2007) IEICE Trans. Inf. Syst. , vol.E90-D , Issue.5 , pp. 816-824
    • Toda, T.1    Tokuda, K.2
  • 52
    • 85133674021
    • Improved average-voice-based speech synthesis using gender-mixed modeling and a parameter generation algorithm considering GV
    • Aug.
    • J. Yamagishi, T. Kobayashi, S. Renals, S. King, H. Zen, T. Toda, and K. Tokuda, "Improved average-voice-based speech synthesis using gender-mixed modeling and a parameter generation algorithm considering GV," in Proc. 6th ISCA Workshop Speech Synth., Aug. 2007, pp. 125-130.
    • (2007) Proc. 6th ISCA Workshop Speech Synth. , pp. 125-130
    • Yamagishi, J.1    Kobayashi, T.2    Renals, S.3    King, S.4    Zen, H.5    Toda, T.6    Tokuda, K.7
  • 54
    • 0030263447
    • Mean and variance adaptation within the MLLR framework
    • M. Gales and P. Woodland, "Mean and variance adaptation within the MLLR framework," Comput. Speech Lang., vol.10, no.4, pp. 249-264, 1996.
    • (1996) Comput. Speech Lang. , vol.10 , Issue.4 , pp. 249-264
    • Gales, M.1    Woodland, P.2
  • 55
    • 0002629270
    • Maximum likelihood from incomplete data via the EM algorithm
    • Series B
    • A. Dempster, N. Laird, and D. Rubin, "Maximum likelihood from incomplete data via the EM algorithm," J. R. Statist. Soc., Series B, vol.39, no.1, pp. 1-38, 1977.
    • (1977) J. R. Statist. Soc. , vol.39 , Issue.1 , pp. 1-38
    • Dempster, A.1    Laird, N.2    Rubin, D.3
  • 56
    • 68249104241
    • The Nitech-NAIST HMM-based speech synthesis system for the Blizzard Challenge 2006
    • Jun.
    • H. Zen, T. Toda, and K. Tokuda, "The Nitech-NAIST HMM-based speech synthesis system for the Blizzard Challenge 2006," IEICE Trans. Inf. Syst., vol.E91-D, no.6, pp. 1764-1773, Jun. 2008.
    • (2008) IEICE Trans. Inf. Syst. , vol.E91-D , Issue.6 , pp. 1764-1773
    • Zen, H.1    Toda, T.2    Tokuda, K.3
  • 57
    • 0032638856
    • Semi-tied covariance matrices for hidden Markov models
    • Mar.
    • M. Gales, "Semi-tied covariance matrices for hidden Markov models," IEEE Trans. Speech Audio Process., vol.7, no.2, pp. 272-281, Mar. 1999.
    • (1999) IEEE Trans. Speech Audio Process , vol.7 , Issue.2 , pp. 272-281
    • Gales, M.1
  • 58
    • 84892187452
    • Maximum likelihood modeling with Gaussian distributions for classification
    • May
    • R. Gopinath, "Maximum likelihood modeling with Gaussian distributions for classification," in Proc. ICASSP'98, May 1998, pp. 661-664.
    • (1998) Proc. ICASSP'98 , pp. 661-664
    • Gopinath, R.1
  • 59
    • 0029769867
    • Signal bias removal by maximum likelihood estimation for robust telephone speech recognition
    • Jan.
    • M. Rahim and B. Juang, "Signal bias removal by maximum likelihood estimation for robust telephone speech recognition," IEEE Trans. Speech Audio Process., vol.4, no.1, pp. 19-30, Jan. 1996.
    • (1996) IEEE Trans. Speech Audio Process , vol.4 , Issue.1 , pp. 19-30
    • Rahim, M.1    Juang, B.2
  • 60
    • 0034853390
    • Multiple-cluster adaptive training schemes
    • May
    • M. Gales, "Multiple-cluster adaptive training schemes," in Proc. ICASSP'01, May 2001, pp. 361-364.
    • (2001) Proc. ICASSP'01 , pp. 361-364
    • Gales, M.1
  • 62
    • 0030643678
    • Improved Bayesian learning of hidden Markov models for speaker adaptation
    • Apr.
    • J. Chien, H. Wang, and C. Lee, "Improved Bayesian learning of hidden Markov models for speaker adaptation," in Proc. ICASSP'97, Apr. 1997, pp. 1027-1030.
    • (1997) Proc. ICASSP'97 , pp. 1027-1030
    • Chien, J.1    Wang, H.2    Lee, C.3
  • 63
    • 85016140477
    • An adaptive algorithm for Mel-cepstral analysis of speech
    • Mar.
    • T. Fukada, K. Tokuda, T. Kobayashi, and S. Imai, "An adaptive algorithm for Mel-cepstral analysis of speech," in Proc. ICASSP'92, Mar. 1992, pp. 137-140.
    • (1992) Proc. ICASSP'92 , pp. 137-140
    • Fukada, T.1    Tokuda, K.2    Kobayashi, T.3    Imai, S.4
  • 64
    • 0033906251
    • MDL-based context-dependent subword modeling for speech recognition
    • Mar.
    • K. Shinoda and T. Watanabe, "MDL-based context-dependent subword modeling for speech recognition," J. Acoust. Soc. Japan (E), vol.21, pp. 79-86, Mar. 2000.
    • (2000) J. Acoust. Soc. Japan (E) , vol.21 , pp. 79-86
    • Shinoda, K.1    Watanabe, T.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.