2009, Pages 1759-1762

Tying covariance matrices to reduce the footprint of HMM-based speech synthesis systems

Author keywords

Context clustering; Decision tree; Embedded device; HMM; MDL criterion; Speech synthesis

Indexed keywords

CLUSTERING TECHNIQUES; COVARIANCE MATRICES; EMBEDDED DEVICE; EMPIRICAL KNOWLEDGE; HMM; HMM-BASED SPEECH SYNTHESIS; MDL CRITERION; MEAN VECTOR; SPEECH WAVEFORMS; SUBJECTIVE LISTENING TEST; SYNTHESIZED SPEECH;

EID: 70450172128     PISSN: None     EISSN: 19909772     Source Type: Conference Proceeding    
DOI: None     Document Type: Conference Paper
Times cited: 11

References (21)
  • 1
    • 0342918775
    • CHATR: A generic speech synthesis system
    • A. W. Black and P. Taylor, "CHATR: a generic speech synthesis system," in Proc. COLING94, 1994.
    • (1994) Proc. COLING94
    • Black, A.W.1    Taylor, P.2
  • 2
    • 0029765811
    • Unit selection in a concatenative speech synthesis system using a large speech database
    • A. Hunt and A. W. Black, "Unit selection in a concatenative speech synthesis system using a large speech database," in Proc. ICASSP, 1996, pp. 373-376.
    • (1996) Proc. ICASSP , pp. 373-376
    • Hunt, A.1    Black, A.W.2
  • 3
    • 0028996983
    • Automatic speech synthesizer parameter estimation using HMMs
    • R. E. Donovan and P. C. Woodland, "Automatic speech synthesizer parameter estimation using HMMs," in Proc. ICASSP, 1995, pp. 640-643.
    • (1995) Proc. ICASSP , pp. 640-643
    • Donovan, R.E.1    Woodland, P.C.2
  • 4
    • 85009139544
    • Simultaneous modeling of spectrum, pitch and duration in HMM-based speech synthesis
    • T. Yoshimura, K. Tokuda, T. Masuko, T. Kobayashi, and T. Kitamura, "Simultaneous modeling of spectrum, pitch and duration in HMM-based speech synthesis," Proc. Eurospeech, pp. 2347-2350, 1999.
    • (1999) Proc. Eurospeech , pp. 2347-2350
    • Yoshimura, T.1    Tokuda, K.2    Masuko, T.3    Kobayashi, T.4    Kitamura, T.5
  • 5
    • 0025419316
    • Context-Dependent Phonetic Hidden Markov models for Speaker-Independent Continuous Speech Recognition
    • K. F. Lee, "Context-Dependent Phonetic Hidden Markov models for Speaker-Independent Continuous Speech Recognition," IEEE Trans. Acoustic Speech and Signal Processing, vol. 38, no. 4, pp. 599-609, 1990.
    • (1990) IEEE Trans. Acoustic Speech and Signal Processing , vol.38 , Issue.4 , pp. 599-609
    • Lee, K.F.1
  • 6
    • 85013744934
    • A Successive State Splitting Algorithm for Efficient Allophone Modeling
    • J. Takami and S. Sagayama, "A Successive State Splitting Algorithm for Efficient Allophone Modeling," Proc. ICASSP'92, pp. 573-576, 1992.
    • (1992) Proc. ICASSP'92 , pp. 573-576
    • Takami, J.1    Sagayama, S.2
  • 8
    • 0027153655
    • Predicting Unseen Triphones with Senones
    • M. Y. Hwang, X. Huang, and F. Alleva, "Predicting Unseen Triphones with Senones," Proc. ICASSP'93, pp. 311-314, 1993.
    • (1993) Proc. ICASSP'93 , pp. 311-314
    • Hwang, M.Y.1    Huang, X.2    Alleva, F.3
  • 9
    • 0030715097
    • HMM topology design using maximum likelihood successive state splitting
    • M. Ostendorf and H. Singer, "HMM topology design using maximum likelihood successive state splitting," Computer Speech Language, vol. 1, no. 1, pp. 17-41, 1997.
  • 11
    • 0033906251
    • MDL-based Context-Dependent Subword Modeling for Speech Recognition
    • K. Shinoda and T. Watanabe, "MDL-based Context-Dependent Subword Modeling for Speech Recognition," J. Acoust. Soc. Jpn. (E), vol. 21, no. 2, pp. 79-86, 2000.
  • 12
    • 0032658258
    • Decision Tree State Tying based on Penalized Bayesian Information Criterion
    • W. Chou and W. Reichl, "Decision Tree State Tying based on Penalized Bayesian Information Criterion," Proc. ICASSP'99, pp. 345-348, 1999.
    • (1999) Proc. ICASSP'99 , pp. 345-348
    • Chou, W.1    Reichl, W.2
  • 13
    • 70450176742
    • Training of Shared States in Hidden Markov Model Based on Bayesian Approach
    • S. Watanabe, Y. Minami, A. Nakamura, and N. Ueda, "Training of Shared States in Hidden Markov Model Based on Bayesian Approach," IEICE technical report. Speech, vol. 102, no. 35, pp. 43-48, 2002.
    • (2002) Speech , vol.102 , Issue.35 , pp. 43-48
    • Watanabe, S.1    Minami, Y.2    Nakamura, A.3    Ueda, N.4
  • 14
    • 70450151807
    • Tree-based Clustering for Gaussian Mixture HMMs
    • T. Kato, S. Kuroiwa, T. Shimizu, and N. Higuchi, "Tree-based Clustering for Gaussian Mixture HMMs," IEICE Trans., vol. J83-D-II, no. 11, pp. 2128-2136, 2000.
  • 15
    • 27844487036
    • Context Clustering for Triphone-based Speech Recognition
    • Master Thesis, Cambridge University
    • H. J. Nock, "Context Clustering for Triphone-based Speech Recognition," Master Thesis, Cambridge University, 1996.
    • (1996)
    • Nock, H.J.1
  • 16
    • 0025475528
    • ATR Japanese speech database as a tool of speech recognition and synthesis
    • A. Kuramatsu, K. Takeda, Y. Sagisaka, S. Katagiri, H. Kawabara, and K. Shikano, "ATR Japanese speech database as a tool of speech recognition and synthesis," Speech Communication, vol. 9, pp. 357-363, 1990.
  • 17
    • 0038000318
    • Spectral Estimation of Speech by Mel-Generalized Cepstral Analysis
    • K. Tokuda, T. Kobayashi, T. Chiba, and S. Imai, "Spectral Estimation of Speech by Mel-Generalized Cepstral Analysis," IEICE Trans. vol. 75-A, no. 7, pp. 1124-1134, 1992.
    • (1992) IEICE Trans , vol.75-A , Issue.7 , pp. 1124-1134
    • Tokuda, K.1    Kobayashi, T.2    Chiba, T.3    Imai, S.4
  • 18
    • 0032673049
    • Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds
    • H. Kawahara, M. K. Ikuyo, and A. Cheneigne, "Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds," Speech Communication, vol. 27, pp. 187-207, 1999.
    • (1999) Speech Communication , vol.27 , pp. 187-207
    • Kawahara, H.1    Ikuyo, M.K.2    Cheneigne, A.3
  • 19
    • 44449177634
    • A Hidden Semi-Markov Model-Based Speech Synthesis System
    • H. Zen, T. Masuko, K. Tokuda, T. Kobayashi, and T. Kitamura, "A Hidden Semi-Markov Model-Based Speech Synthesis System," IEICE Trans. Inf. & Sys., vol. 90D, no. 5, pp. 825-834, 2007.
  • 20
    • 33745200051
    • Speech parameter generation algorithm considering global variance for HMM-based speech synthesis
    • T. Toda and K. Tokuda, "Speech parameter generation algorithm considering global variance for HMM-based speech synthesis," Interspeech 2005, pp. 2801-2804, 2005.
    • (2005) Interspeech , pp. 2801-2804
    • Toda, T.1    Tokuda, K.2
  • 21
    • 0032638856
    • Semi-tied covariance matrices for hidden Markov models
    • M. J. F. Gales, "Semi-tied covariance matrices for hidden Markov models," IEEE Transactions on Speech and Audio Processing, vol. 7, no. 3, pp. 272-281, 1999.
    • (1999) IEEE Transactions on Speech and Audio Processing , vol.7 , Issue.3 , pp. 272-281
    • Gales, M.J.F.1


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.