



Volume 20, Issue 6, 2012, Pages 1713-1724

Statistical parametric speech synthesis based on speaker and language factorization

Author keywords

Hidden Markov models (HMMs); Speaker and language factorization; Statistical parametric speech synthesis

Indexed keywords

HIDDEN MARKOV MODELS (HMMS); IN-BUILDINGS; LANGUAGE FACTORIZATION; MULTIPLE LANGUAGES; RECOGNITION SYSTEMS; SPEAKER CHARACTERISTICS;

EID: 84859765673     PISSN: 15587916     EISSN: None     Source Type: Journal    
DOI: 10.1109/TASL.2012.2187195     Document Type: Article
Times cited: 97

References (42)
  • 3
    • 84865713971
    • Crowdsourcing preference tests, and how to detect cheating
    • S. Buchholz and J. Latorre, "Crowdsourcing preference tests, and how to detect cheating," in Proc. Interspeech, 2011, pp. 3053-3056.
    • (2011) Proc. Interspeech , pp. 3053-3056
    • Buchholz, S.1    Latorre, J.2
  • 4
    • 0032050110
    • Maximum likelihood linear transformations for HMM-based speech recognition
    • M. Gales, "Maximum likelihood linear transformations for HMM-based speech recognition," Comput. Speech Lang., vol. 12, no. 2, pp. 75-98, 1998. (Pubitemid 128383747)
    • (1998) Computer Speech and Language , vol.12 , Issue.2 , pp. 75-98
    • Gales, M.J.F.1
  • 5
    • 0034227757
    • Cluster adaptive training of hidden Markov models
    • M. Gales, "Cluster adaptive training of hidden Markov models," IEEE Trans. Speech Audio Process., vol. 8, no. 4, pp. 417-428, Jul. 2000.
    • (2000) IEEE Trans. Speech Audio Process , vol.8 , Issue.4 , pp. 417-428
    • Gales, M.1
  • 6
    • 84962787636
    • Acoustic factorisation
    • M. Gales, "Acoustic factorisation," in Proc. ASRU, 2001, pp. 77-80.
    • (2001) Proc. ASRU , pp. 77-80
    • Gales, M.1
  • 8
    • 33748468338
    • New approach to the polyglot speech generation by means of an HMM-based speaker adaptable synthesizer
    • DOI 10.1016/j.specom.2006.05.003, PII S0167639306000483
    • J. Latorre, K. Iwano, and S. Furui, "New approach to the polyglot speech generation by means of an HMM-based speaker adaptable synthesizer," Speech Commun., vol. 48, no. 10, pp. 1227-1242, 2006. (Pubitemid 44353817)
    • (2006) Speech Communication , vol.48 , Issue.10 , pp. 1227-1242
    • Latorre, J.1    Iwano, K.2    Furui, S.3
  • 9
    • 79959843446
    • An analysis of language mismatch in HMM state mapping-based cross-lingual speaker adaptation
    • H. Liang and J. Dines, "An analysis of language mismatch in HMM state mapping-based cross-lingual speaker adaptation," in Proc. Interspeech, 2010, pp. 622-625.
    • (2010) Proc. Interspeech , pp. 622-625
    • Liang, H.1    Dines, J.2
  • 10
    • 51449118125
    • Acoustic modeling with contextual additive structure for HMM-based speech recognition
    • Y. Nankaku, K. Nakamura, H. Zen, and K. Tokuda, "Acoustic modeling with contextual additive structure for HMM-based speech recognition," in Proc. ICASSP, 2008, pp. 4469-4472.
    • (2008) Proc. ICASSP , pp. 4469-4472
    • Nankaku, Y.1    Nakamura, K.2    Zen, H.3    Tokuda, K.4
  • 12
    • 78651062051
    • Cross-lingual speaker adaptation for HMM-based speech synthesis considering differences between language-dependent average voices
    • X. Peng, K. Oura, Y. Nankaku, and K. Tokuda, "Cross-lingual speaker adaptation for HMM-based speech synthesis considering differences between language-dependent average voices," in Proc. ICSP, 2010, pp. 605-608.
    • (2010) Proc. ICSP , pp. 605-608
    • Peng, X.1    Oura, K.2    Nankaku, Y.3    Tokuda, K.4
  • 13
    • 85008020260
    • A cross-language state sharing and mapping approach to bilingual (Mandarin-English) TTS
    • Y. Qian, H. Liang, and F. Soong, "A cross-language state sharing and mapping approach to bilingual (Mandarin-English) TTS," IEEE Trans. Audio Speech Lang. Process., vol. 17, no. 6, pp. 1231-1239, Aug. 2009.
    • (2009) IEEE Trans. Audio Speech Lang. Process , vol.17 , Issue.6 , pp. 1231-1239
    • Qian, Y.1    Liang, H.2    Soong, F.3
  • 15
    • 1642370513
    • Solving unsymmetric sparse systems of linear equations with PARDISO
    • O. Schenk and K. Gärtner, "Solving unsymmetric sparse systems of linear equations with PARDISO," J. Future Gen. Comput. Syst., vol. 20, no. 3, pp. 475-487, 2004.
    • (2004) J. Future Gen. Comput. Syst , vol.20 , Issue.3 , pp. 475-487
    • Schenk, O.1    Gärtner, K.2
  • 16
    • 85009274666
    • Globalphone: A multilingual speech and text database developed at Karlsruhe University
    • T. Schultz, "Globalphone: A multilingual speech and text database developed at Karlsruhe University," in Proc. ICSLP, 2002, pp. 345-348.
    • (2002) Proc. ICSLP , pp. 345-348
    • Schultz, T.1
  • 17
    • 84865783757
    • Separating speaker and environmental variability using factored transforms
    • M. Seltzer and A. Acero, "Separating speaker and environmental variability using factored transforms," in Proc. Interspeech, 2011, pp. 1097-1100.
    • (2011) Proc. Interspeech , pp. 1097-1100
    • Seltzer, M.1    Acero, A.2
  • 19
    • 85135145174
    • Acoustic modeling based on the MDL criterion for speech recognition
    • K. Shinoda and T. Watanabe, "Acoustic modeling based on the MDL criterion for speech recognition," in Proc. Eurospeech, 1997, pp. 99-102.
    • (1997) Proc. Eurospeech , pp. 99-102
    • Shinoda, K.1    Watanabe, T.2
  • 20
    • 33947650089
    • HMM state clustering based on efficient cross-validation
    • T. Shinozaki, "HMM state clustering based on efficient cross-validation," in Proc. ICASSP, 2006, pp. 1157-1160.
    • (2006) Proc. ICASSP , pp. 1157-1160
    • Shinozaki, T.1
  • 21
    • 33646806075
    • Adaptation of precision matrix models on large vocabulary continuous speech recognition
    • K. Sim and M. Gales, "Adaptation of precision matrix models on large vocabulary continuous speech recognition," in Proc. ICASSP, 2005, pp. 97-100.
    • (2005) Proc. ICASSP , pp. 97-100
    • Sim, K.1    Gales, M.2
  • 23
    • 38549096029
    • A speech parameter generation algorithm considering global variance for HMM-based speech synthesis
    • T. Toda and K. Tokuda, "A speech parameter generation algorithm considering global variance for HMM-based speech synthesis," IEICE Trans. Inf. Syst., vol. E90-D, no. 5, pp. 816-824, 2007.
    • (2007) IEICE Trans. Inf. Syst., Vol. E90-D , Issue.5 , pp. 816-824
    • Toda, T.1    Tokuda, K.2
  • 25
    • 0033708106
    • Speech parameter generation algorithms for HMM-based speech synthesis
    • K. Tokuda, T. Yoshimura, T. Masuko, T. Kobayashi, and T. Kitamura, "Speech parameter generation algorithms for HMM-based speech synthesis," in Proc. ICASSP, 2000, pp. 1315-1318.
    • (2000) Proc. ICASSP , pp. 1315-1318
    • Tokuda, K.1    Yoshimura, T.2    Masuko, T.3    Kobayashi, T.4    Kitamura, T.5
  • 26
    • 84966348891
    • An HMM-based speech synthesis system applied to English
    • K. Tokuda, H. Zen, and A. Black, "An HMM-based speech synthesis system applied to English," in Proc. IEEE Speech Synth. Workshop, 2002, CD-ROM Proceeding.
    • (2002) Proc. IEEE Speech Synth
    • Tokuda, K.1    Zen, H.2    Black, A.3
  • 28
    • 80051617808
    • Speaker and noise factorisation on AURORA4 task
    • Y. Wang and M. Gales, "Speaker and noise factorisation on AURORA4 task," in Proc. ICASSP, 2011, pp. 4584-4587.
    • (2011) Proc. ICASSP , pp. 4584-4587
    • Wang, Y.1    Gales, M.2
  • 29
    • 84859768642
    • The EMIME Bilingual Database, Tech. Rep. EDI-INF-RR-1388
    • M. Wester, "The EMIME Bilingual Database," Univ. of Edinburgh, 2010, Tech. Rep. EDI-INF-RR-1388.
    • (2010) Univ. of Edinburgh
    • Wester, M.1
  • 30
    • 70450192740
    • State mapping based method for cross-lingual speaker adaptation in HMM-based speech synthesis
    • Y. Wu, Y. Nankaku, and K. Tokuda, "State mapping based method for cross-lingual speaker adaptation in HMM-based speech synthesis," in Proc. Interspeech, 2009, pp. 528-531.
    • (2009) Proc. Interspeech , pp. 528-531
    • Wu, Y.1    Nankaku, Y.2    Tokuda, K.3
  • 31
    • 33846463597
    • Ph.D. dissertation, Tokyo Inst. of Technol., Yokohama, Japan
    • J. Yamagishi, "Average-voice-based speech synthesis," Ph.D. dissertation, Tokyo Inst. of Technol., Yokohama, Japan, 2006.
    • (2006) Average-voice-based Speech Synthesis
    • Yamagishi, J.1
  • 32
    • 78049403515
    • Simple methods for improving speaker-similarity of HMM-based speech synthesis
    • J. Yamagishi and S. King, "Simple methods for improving speaker-similarity of HMM-based speech synthesis," in Proc. ICASSP, 2010, pp. 4610-4613.
    • (2010) Proc. ICASSP , pp. 4610-4613
    • Yamagishi, J.1    King, S.2
  • 33
    • 4544291748
    • Speaking style adaptation using context clustering decision tree for HMM-based speech synthesis
    • J. Yamagishi, M. Tachibana, T. Masuko, and T. Kobayashi, "Speaking style adaptation using context clustering decision tree for HMM-based speech synthesis," in Proc. ICASSP, 2004, pp. 5-8.
    • (2004) Proc. ICASSP , pp. 5-8
    • Yamagishi, J.1    Tachibana, M.2    Masuko, T.3    Kobayashi, T.4
  • 35
    • 67650819492
    • The HTS2007' system: Yet another evaluation of the speaker-adaptive HMM-based speech synthesis system in the 2008 Blizzard Challenge
    • J. Yamagishi, H. Zen, Y. Wu, T. Toda, and K. Tokuda, "The HTS2007' system: Yet another evaluation of the speaker-adaptive HMM-based speech synthesis system in the 2008 Blizzard Challenge," in Proc. Blizzard Challenge Workshop, 2008.
    • (2008) Proc. Blizzard Challenge Workshop
    • Yamagishi, J.1    Zen, H.2    Wu, Y.3    Toda, T.4    Tokuda, K.5
  • 36
    • 85009139544
    • Simultaneous modeling of spectrum, pitch and duration in HMM-based speech synthesis
    • T. Yoshimura, K. Tokuda, T. Masuko, T. Kobayashi, and T. Kitamura, "Simultaneous modeling of spectrum, pitch and duration in HMM-based speech synthesis," in Proc. Eurospeech, 1999, pp. 2347-2350.
    • (1999) Proc. Eurospeech , pp. 2347-2350
    • Yoshimura, T.1    Tokuda, K.2    Masuko, T.3    Kobayashi, T.4    Kitamura, T.5
  • 37
    • 4544253619
    • Adaptive training using structured transforms
    • K. Yu and M. Gales, "Adaptive training using structured transforms," in Proc. ICASSP, 2004, pp. 317-320.
    • (2004) Proc. ICASSP , pp. 317-320
    • Yu, K.1    Gales, M.2
  • 38
    • 79955538498
    • Context adaptive training with factorized decision trees for HMM-based statistical parametric speech synthesis
    • K. Yu, H. Zen, F. Mairesse, and S. Young, "Context adaptive training with factorized decision trees for HMM-based statistical parametric speech synthesis," Speech Commun., vol. 53, no. 6, pp. 914-923, 2011.
    • (2011) Speech Commun , vol.53 , Issue.6 , pp. 914-923
    • Yu, K.1    Zen, H.2    Mairesse, F.3    Young, S.4
  • 39
    • 79959813917
    • Speaker and language adaptive training for HMM-based polyglot speech synthesis
    • H. Zen, "Speaker and language adaptive training for HMM-based polyglot speech synthesis," in Proc. Interspeech, 2010, pp. 410-413.
    • (2010) Proc. Interspeech , pp. 410-413
    • Zen, H.1
  • 42
    • 67651002140
    • Statistical parametric speech synthesis
    • H. Zen, K. Tokuda, and A. Black, "Statistical parametric speech synthesis," Speech Commun., vol. 51, no. 11, pp. 1039-1064, 2009.
    • (2009) Speech Commun , vol.51 , Issue.11 , pp. 1039-1064
    • Zen, H.1    Tokuda, K.2    Black, A.3


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.