Volume 21, Issue 2, 2013, Pages 280-290

A unified trajectory tiling approach to high quality speech rendering

Author keywords

Cross lingual; speech synthesis; trajectory tiling; voice transformation

Indexed keywords

CROSS-LINGUAL; HIGH QUALITY; SPEECH DATABASE; SUBJECTIVE EVALUATIONS; SYNTHESIZED SPEECH; UNIFIED ALGORITHM; WAVE FORMS

EID: 84871382567     PISSN: 15587916     EISSN: None     Source Type: Journal    
DOI: 10.1109/TASL.2012.2221460     Document Type: Article
Times cited: 41

References (45)
  • 1
    • 84871384055
    • EMIME [Online]. Available: http://www.emime.org
  • 2
    • 34547612590
    • HMM-based hierarchical unit selection combining Kullback-Leibler divergence with likelihood criterion
    • Z.-H. Ling and R.-H. Wang, "HMM-based hierarchical unit selection combining Kullback-Leibler divergence with likelihood criterion," in Proc. ICASSP, 2007, pp. 1245-1248.
    • (2007) Proc. ICASSP , pp. 1245-1248
    • Ling, Z.-H.1    Wang, R.-H.2
  • 3
    • 78049399368
    • Rich-context unit selection (RUS) approach to high quality TTS
    • Z.-J. Yan, Y. Qian, and F. K. Soong, "Rich-context unit selection (RUS) approach to high quality TTS," in Proc. ICASSP, 2010, pp. 4798-4801.
    • (2010) Proc. ICASSP , pp. 4798-4801
    • Yan, Z.-J.1    Qian, Y.2    Soong, F.K.3
  • 6
    • 80051658497
    • Utilization of an HMM-based feature generation module in 5 ms segment concatenative speech synthesis
    • T. Hirai, J. Yamagishi, and S. Tenpaku, "Utilization of an HMM-based feature generation module in 5 ms segment concatenative speech synthesis," in Proc. ISCA SSW6, 2007.
    • (2007) Proc. ISCA SSW6
    • Hirai, T.1    Yamagishi, J.2    Tenpaku, S.3
  • 7
    • 67651002140
    • Statistical parametric speech synthesis
    • H. Zen, K. Tokuda, and A. W. Black, "Statistical parametric speech synthesis," Speech Commun., vol. 51, no. 11, pp. 1039-1064, 2009.
    • (2009) Speech Commun , vol.51 , Issue.11 , pp. 1039-1064
    • Zen, H.1    Tokuda, K.2    Black, A.W.3
  • 8
    • 70450208175
    • Local minimum generation error criterion for hybrid HMM speech synthesis
    • X. Gonzalvo, A. Gutkin, J. C. Socoró, I. Iriondo, and P. Taylor, "Local minimum generation error criterion for hybrid HMM speech synthesis," in Proc. Interspeech, 2009, pp. 416-419.
    • (2009) Proc. Interspeech , pp. 416-419
    • Gonzalvo, X.1    Gutkin, A.2    Socoró, J.C.3    Iriondo, I.4    Taylor, P.5
  • 9
    • 70450161678
    • Rich context modeling for high quality HMM-Based TTS
    • Z.-J. Yan, Y. Qian, and F. K. Soong, "Rich context modeling for high quality HMM-Based TTS," in Proc. Interspeech, 2009.
    • (2009) Proc. Interspeech
    • Yan, Z.-J.1    Qian, Y.2    Soong, F.K.3
  • 10
    • 34547526960
    • Statistical parametric speech synthesis
    • A. W. Black, H. Zen, and K. Tokuda, "Statistical parametric speech synthesis," in Proc. ICASSP, 2007, pp. 1229-1232.
    • (2007) Proc. ICASSP , pp. 1229-1232
    • Black, A.W.1    Zen, H.2    Tokuda, K.3
  • 11
    • 33846410497
    • Speech parameter generation algorithm considering global variance for HMM-based speech synthesis
    • T. Toda and K. Tokuda, "Speech parameter generation algorithm considering global variance for HMM-based speech synthesis," in Proc. Interspeech, 2005.
    • (2005) Proc. Interspeech
    • Toda, T.1    Tokuda, K.2
  • 12
    • 67650854725
    • Analysis of speaker adaptation algorithms for HMM-based speech synthesis and a constrained SMAPLR adaptation algorithm
    • J. Yamagishi, T. Kobayashi, Y. Nakano, K. Ogata, and J. Isogai, "Analysis of speaker adaptation algorithms for HMM-based speech synthesis and a constrained SMAPLR adaptation algorithm," IEEE Trans. Audio, Speech, Lang. Process., vol. 17, no. 1, pp. 66-83, Jan. 2009.
    • (2009) IEEE Trans. Audio, Speech, Lang. Process , vol.17 , Issue.1 , pp. 66-83
    • Yamagishi, J.1    Kobayashi, T.2    Nakano, Y.3    Ogata, K.4    Isogai, J.5
  • 13
    • 85008020260
    • A cross-language state sharing and mapping approach to bilingual (Mandarin-English) TTS
    • Y. Qian, H. Liang, and F. K. Soong, "A cross-language state sharing and mapping approach to bilingual (Mandarin-English) TTS," IEEE Trans. Audio, Speech, Lang. Process., vol. 17, no. 6, pp. 1231-1239, Aug. 2009.
    • (2009) IEEE Trans. Audio, Speech, Lang. Process , vol.17 , Issue.6 , pp. 1231-1239
    • Qian, Y.1    Liang, H.2    Soong, F.K.3
  • 14
    • 70450192740
    • State mapping based method for cross-lingual speaker adaptation in HMM-based speech synthesis
    • Y.-J. Wu, Y. Nankaku, and K. Tokuda, "State mapping based method for cross-lingual speaker adaptation in HMM-based speech synthesis," in Proc. Interspeech, 2009, pp. 528-531.
    • (2009) Proc. Interspeech , pp. 528-531
    • Wu, Y.-J.1    Nankaku, Y.2    Tokuda, K.3
  • 15
    • 84859780529
    • Analysis of unsupervised cross-lingual speaker adaptation for HMM-based speech synthesis using KLD-based transform mapping
    • K. Oura, J. Yamagishi, M. Wester, S. King, and K. Tokuda, "Analysis of unsupervised cross-lingual speaker adaptation for HMM-based speech synthesis using KLD-based transform mapping," Speech Commun., vol. 54, no. 6, pp. 704-714, Jul. 2012.
    • (2012) Speech Commun , vol.54 , Issue.6 , pp. 704-714
    • Oura, K.1    Yamagishi, J.2    Wester, M.3    King, S.4    Tokuda, K.5
  • 16
    • 70349218937
    • State mapping for cross-language speaker adaptation in TTS
    • Y.-N. Chen, Y. Jiao, Y. Qian, and F. K. Soong, "State mapping for cross-language speaker adaptation in TTS," in Proc. ICASSP, 2009, pp. 4273-4276.
    • (2009) Proc. ICASSP , pp. 4273-4276
    • Chen, Y.-N.1    Jiao, Y.2    Qian, Y.3    Soong, F.K.4
  • 17
    • 78049411002
    • Unsupervised cross-lingual speaker adaptation for HMM-based speech synthesis using two-pass decision tree construction
    • M. Gibson, T. Hirsimaki, R. Karhila, M. Kurimo, and W. Byrne, "Unsupervised cross-lingual speaker adaptation for HMM-based speech synthesis using two-pass decision tree construction," in Proc. ICASSP, 2010, pp. 4642-4645.
    • (2010) Proc. ICASSP , pp. 4642-4645
    • Gibson, M.1    Hirsimaki, T.2    Karhila, R.3    Kurimo, M.4    Byrne, W.5
  • 18
    • 84865786646
    • Phonological knowledge guided HMM state mapping for cross-Lingual speaker adaptation
    • H. Liang and J. Dines, "Phonological knowledge guided HMM state mapping for cross-Lingual speaker adaptation," in Proc. Interspeech, 2011, pp. 1825-1828.
    • (2011) Proc. Interspeech , pp. 1825-1828
    • Liang, H.1    Dines, J.2
  • 19
    • 80051608660
    • A frame mapping based HMM approach to cross-lingual voice transformation
    • Y. Qian, J. Xu, and F. K. Soong, "A frame mapping based HMM approach to cross-lingual voice transformation," in Proc. ICASSP, 2011, pp. 5120-5123.
    • (2011) Proc. ICASSP , pp. 5120-5123
    • Qian, Y.1    Xu, J.2    Soong, F.K.3
  • 20
    • 0001810975
    • Line spectrum representation of linear predictive coefficients of speech signals
    • F. Itakura, "Line spectrum representation of linear predictive coefficients of speech signals," J. Acoust. Soc. Amer., vol. 57, p. S35, 1975.
    • (1975) J. Acoust. Soc. Amer , vol.57
    • Itakura, F.1
  • 21
    • 0002557614
    • Line spectrum pair (LSP) and speech data compression
    • F. K. Soong and B. H. Juang, "Line spectrum pair (LSP) and speech data compression," in Proc. ICASSP, 1984, pp. 37-40.
    • (1984) Proc. ICASSP , pp. 37-40
    • Soong, F.K.1    Juang, B.H.2
  • 22
    • 2942710378
    • Linear prediction voice synthesizers: Line spectrum pairs (LSP) is the newest of the several techniques
    • H. Wakita, "Linear prediction voice synthesizers: Line spectrum pairs (LSP) is the newest of the several techniques," Speech Technol., vol. 1, pp. 17-22, 1981.
    • (1981) Speech Technol , vol.1 , pp. 17-22
    • Wakita, H.1
  • 23
    • 38249015166
    • On the use of line spectral frequency parameters for speech recognition
    • K. K. Paliwal, "On the use of line spectral frequency parameters for speech recognition," Digital Signal Process., vol. 2, pp. 80-87, 1992.
    • (1992) Digital Signal Process , vol.2 , pp. 80-87
    • Paliwal, K.K.1
  • 24
    • 0032673049
    • Restructuring speech representations using pitch-adaptive time-frequency smoothing and instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds
    • H. Kawahara, I. Masuda-Katsuse, and A. de Cheveigné, "Restructuring speech representations using pitch-adaptive time-frequency smoothing and instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds," Speech Commun., vol. 27, pp. 187-207, 1999.
    • (1999) Speech Commun , vol.27 , pp. 187-207
    • Kawahara, H.1    Masuda-Katsuse, I.2    de Cheveigné, A.3
  • 27
    • 33846429403
    • Minimum generation error training for HMM-based speech synthesis
    • Y.-J. Wu and R. H. Wang, "Minimum generation error training for HMM-based speech synthesis," in Proc. ICASSP, 2006, pp. 89-92.
    • (2006) Proc. ICASSP , pp. 89-92
    • Wu, Y.-J.1    Wang, R.H.2
  • 28
    • 70450169782
    • A minimum v/u error approach to F0 generation in HMM-based TTS
    • Y. Qian, F. K. Soong, M.-M. Wang, and Z.-Z. Wu, "A minimum v/u error approach to F0 generation in HMM-based TTS," in Proc. Interspeech, 2009.
    • (2009) Proc. Interspeech
    • Qian, Y.1    Soong, F.K.2    Wang, M.-M.3    Wu, Z.-Z.4
  • 29
    • 78049381954
    • VTLN adaptation for statistical speech synthesis
    • L. Saheer, P. N. Garner, J. Dines, and H. Liang, "VTLN adaptation for statistical speech synthesis," in Proc. ICASSP, 2010, pp. 4838-4841.
    • (2010) Proc. ICASSP , pp. 4838-4841
    • Saheer, L.1    Garner, P.N.2    Dines, J.3    Liang, H.4
  • 30
    • 79959837023
    • Formant-based frequency warping for improving speaker adaptation in HMM TTS
    • X. Zhuang, Y. Qian, F. K. Soong, Y.-J. Wu, and B. Zhang, "Formant-based frequency warping for improving speaker adaptation in HMM TTS," in Proc. Interspeech, 2010, pp. 817-820.
    • (2010) Proc. Interspeech , pp. 817-820
    • Zhuang, X.1    Qian, Y.2    Soong, F.K.3    Wu, Y.-J.4    Zhang, B.5
  • 31
    • 0026400231
    • Robust and efficient quantization of speech LSP parameters using structured vector quantizers
    • R. Laroia, N. Phamdo, and N. Farvardin, "Robust and efficient quantization of speech LSP parameters using structured vector quantizers," in Proc. ICASSP, 1991, pp. 641-644.
    • (1991) Proc. ICASSP , pp. 641-644
    • Laroia, R.1    Phamdo, N.2    Farvardin, N.3
  • 32
    • 0035478160
    • A new distortion measure for spectral quantization based on the LSP intermodal interlacing property
    • M. S. Lee, H. K. Kim, and H. S. Lee, "A new distortion measure for spectral quantization based on the LSP intermodal interlacing property," Speech Commun., vol. 35, pp. 191-201, 2001.
    • (2001) Speech Commun , vol.35 , pp. 191-201
    • Lee, M.S.1    Kim, H.K.2    Lee, H.S.3
  • 33
    • 77249139677
    • An HMM-based Mandarin Chinese text-to-speech system
    • Y. Qian, F. K. Soong, Y. N. Chen, and M. Chu, "An HMM-based Mandarin Chinese text-to-speech system," in Proc. ISCSLP, 2006, Springer LNAI Vol. 4274, pp. 223-232.
    • Proc. ISCSLP 2006 , vol.4274 , pp. 223-232
    • Qian, Y.1    Soong, F.K.2    Chen, Y.N.3    Chu, M.4
  • 34
    • 0003418124
    • Acoustic theory of speech production
    • G. Fant, Acoustic Theory of Speech Production. The Hague, Netherlands: Mouton, 1960.
    • (1960) The Hague Netherlands
    • Fant, G.1
  • 39
    • 84871373443
    • Demos of Synthesized Sentences, HTT-Based TTS, [Online]. Available: http://research.microsoft.com/en-us/projects/htt/default.aspx
    • Demos of Synthesized Sentences
  • 42
    • 84871361828
    • [Online]. Available: http://www.synsig.org/index.php/Blizzard-Challenge-2010
  • 44
    • 84863484159
    • Kullback-Leibler divergence between two hidden Markov models
    • P. Liu and F. K. Soong, "Kullback-Leibler divergence between two hidden Markov models," Microsoft Research Asia, 2005, Tech. Rep.
    • (2005) Microsoft Research Asia
    • Liu, P.1    Soong, F.K.2
  • 45
    • 84871376257
    • Cross-Lingual Voice Transformation [Online]. Available: http://research.microsoft.com/en-us/projects/mixedlangtts/default.aspx
    • Cross-Lingual Voice Transformation


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.