Volume 8, Issue 2, 2014, Pages 285-295

Noise in HMM-based speech synthesis adaptation: Analysis, evaluation methods and experiments

Author keywords

Adaptation; evaluation methods; noise robustness; speech synthesis

Indexed keywords

ADAPTATION; ENVIRONMENTAL NOISE; EVALUATION METHODS; HMM-BASED SPEECH SYNTHESIS; INVESTIGATE EFFECTS; NOISE ROBUSTNESS; PERSONALIZED VOICE; SYNTHESIZED SPEECH;

EID: 84897869648     PISSN: 19324553     EISSN: None     Source Type: Journal    
DOI: 10.1109/JSTSP.2013.2278492     Document Type: Article
Times cited : (16)

References (39)
  • 1
    • 67651002140
    • Statistical parametric speech synthesis
    • H. Zen, K. Tokuda, and A. W. Black, "Statistical parametric speech synthesis" Speech Commun., vol. 51, no. 11, pp. 1039-1064, 2009
    • (2009) Speech Commun , vol.51 , Issue.11 , pp. 1039-1064
    • Zen, H.1    Tokuda, K.2    Black, A.W.3
  • 3
    • 33847129573
    • Average-voice-based speech synthesis using HSMM-based speaker adaptation and adaptive training
    • DOI 10.1093/ietisy/e90-d.2.533
    • J. Yamagishi and T. Kobayashi, "Average-voice-based speech synthesis using HSMM-based speaker adaptation and adaptive training" IEICE Trans. Inf. Syst., vol. E90-D, no. 2, pp. 533-543, 2007 (Pubitemid 46279829)
    • (2007) IEICE Transactions on Information and Systems , vol.E90-D , Issue.2 , pp. 533-543
    • Yamagishi, J.1    Kobayashi, T.2
  • 4
    • 67650803663
    • Combining statistical parametric speech synthesis and unit-selection for automatic voice cloning
    • M. Aylett and J. Yamagishi, "Combining statistical parametric speech synthesis and unit-selection for automatic voice cloning" in Proc. LangTech, 2008
    • (2008) Proc. LangTech
    • Aylett, M.1    Yamagishi, J.2
  • 6
    • 84890528712
    • HMM-based speech synthesis adaptation using noisy data: Analysis and evaluation methods
    • R. Karhila, U. Remes, and M. Kurimo, "HMM-based speech synthesis adaptation using noisy data: Analysis and evaluation methods" in Proc. ICASSP, 2013
    • (2013) Proc. ICASSP
    • Karhila, R.1    Remes, U.2    Kurimo, M.3
  • 7
    • 67650854725
    • Analysis of speaker adaptation algorithms for HMM-based speech synthesis and a constrained SMAPLR adaptation algorithm
    • J. Yamagishi, T. Kobayashi, Y. Nakano, K. Ogata, and J. Isogai, "Analysis of speaker adaptation algorithms for HMM-based speech synthesis and a constrained SMAPLR adaptation algorithm" IEEE Trans. Audio, Speech, Lang. Process, vol. 17, no. 1, pp. 66-83, 2009
    • (2009) IEEE Trans. Audio, Speech, Lang. Process , vol.17 , Issue.1 , pp. 66-83
    • Yamagishi, J.1    Kobayashi, T.2    Nakano, Y.3    Ogata, K.4    Isogai, J.5
  • 8
    • 85009097035
    • Fast speaker adaptation using eigenspace-based maximum likelihood linear regression
    • K.-T. Chen, W.-W. Liau, H.-M. Wang, and L.-S. Lee, "Fast speaker adaptation using eigenspace-based maximum likelihood linear regression" in Proc. ICSLP, 2000
    • (2000) Proc. ICSLP
    • Chen, K.-T.1    Liau, W.-W.2    Wang, H.-M.3    Lee, L.-S.4
  • 10
    • 80051636048
    • Speaker similarity evaluation of foreign-accented speech synthesis using HMM-based speaker adaptation
    • M. Wester and R. Karhila, "Speaker similarity evaluation of foreign-accented speech synthesis using HMM-based speaker adaptation" in Proc. ICASSP, 2011
    • (2011) Proc. ICASSP
    • Wester, M.1    Karhila, R.2
  • 11
    • 79959818117
    • Non-negative matrix factorization based compensation of music for automatic speech recognition
    • B. Raj, T. Virtanen, S. Chaudhuri, and R. Singh, "Non-negative matrix factorization based compensation of music for automatic speech recognition" in Proc. Interspeech, 2010
    • (2010) Proc. Interspeech
    • Raj, B.1    Virtanen, T.2    Chaudhuri, S.3    Singh, R.4
  • 12
    • 0032050110
    • Maximum likelihood linear transformations for HMM-based speech recognition
    • M. J. F. Gales, "Maximum likelihood linear transformations for HMM-based speech recognition" Comput. Speech Lang., vol. 12, pp. 75-98, 1998 (Pubitemid 128383747)
    • (1998) Computer Speech and Language , vol.12 , Issue.2 , pp. 75-98
    • Gales, M.J.F.1
  • 13
    • 84865763570
    • Rapid adaptation of foreign-accented HMM-based speech synthesis
    • R. Karhila and M. Wester, "Rapid adaptation of foreign-accented HMM-based speech synthesis" in Proc. Interspeech, 2011
    • (2011) Proc. Interspeech
    • Karhila, R.1    Wester, M.2
  • 17
    • 0034227757
    • Cluster adaptive training of hidden Markov models
    • M. J. F. Gales, "Cluster adaptive training of hidden Markov models" IEEE Trans. Speech Audio Process., vol. 8, no. 4, pp. 417-428, Jul. 2000
    • (2000) IEEE Trans. Speech Audio Process , vol.8 , Issue.4 , pp. 417-428
    • Gales, M.J.F.1
  • 21
    • 34047246852
    • Embedded kernel eigenvoice speaker adaptation and its implication to reference speaker weighting
    • B. K.-W. Mak, R. W.-H. Hsiao, S. K.-L. Ho, and J. Kwok, "Embedded kernel eigenvoice speaker adaptation and its implication to reference speaker weighting" IEEE Trans. Audio, Speech, Lang. Process, vol. 14, no. 4, pp. 1267-1280, Jul. 2006
    • (2006) IEEE Trans. Audio, Speech, Lang. Process , vol.14 , Issue.4 , pp. 1267-1280
    • Mak, B.K.-W.1    Hsiao, R.W.-H.2    Ho, S.K.-L.3    Kwok, J.4
  • 22
    • 84897910241
    • Kernel eigenvoices (revisited) for large-vocabulary speech recognition
    • Z. Roupakia and M. Gales, "Kernel eigenvoices (revisited) for large-vocabulary speech recognition" IEEE Signal Process. Lett., vol. 18, no. 12, pp. 709-712, Dec. 2011
    • (2011) IEEE Signal Process. Lett , vol.18 , Issue.12 , pp. 709-712
    • Roupakia, Z.1    Gales, M.2
  • 23
    • 56149122221
    • Kernel eigenspace-based MLLR adaptation
    • B. Mak and R. Hsiao, "Kernel eigenspace-based MLLR adaptation" IEEE Trans. Audio, Speech, Lang. Process, vol. 15, no. 3, pp. 784-795, Mar. 2007
    • (2007) IEEE Trans. Audio, Speech, Lang. Process , vol.15 , Issue.3 , pp. 784-795
    • Mak, B.1    Hsiao, R.2
  • 27
    • 44149106061
    • Evaluation of objective quality measures for speech enhancement
    • Y. Hu and P. Loizou, "Evaluation of objective quality measures for speech enhancement" IEEE Trans. Audio, Speech, Lang. Process, vol. 16, no. 1, pp. 229-238, Jan. 2008
    • (2008) IEEE Trans. Audio, Speech, Lang. Process , vol.16 , Issue.1 , pp. 229-238
    • Hu, Y.1    Loizou, P.2
  • 30
    • 80051651104
    • The EMIME Bilingual Database
    • M. Wester, "The EMIME Bilingual Database" Univ. of Edinburgh, Edinburgh, U.K., 2010, Tech. Rep. EDI-INF-RR-1388
    • (2010) The EMIME Bilingual Database
    • Wester, M.1
  • 31
    • 0027623210
    • Assessment for automatic speech recognition: II. NOISEX-92: A database and an experiment to study the effect of additive noise on speech recognition systems
    • A. Varga and H. Steeneken, "Assessment for automatic speech recognition: II. NOISEX-92: A database and an experiment to study the effect of additive noise on speech recognition systems" Speech Commun., vol. 12, no. 3, pp. 247-251, 1993
    • (1993) Speech Commun , vol.12 , Issue.3 , pp. 247-251
    • Varga, A.1    Steeneken, H.2
  • 33
    • 0032673049
    • Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds
    • H. Kawahara, I. Masuda-Katsuse, and A. de Cheveigné, "Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds" Speech Commun., vol. 27, pp. 187-207, 1999
    • (1999) Speech Commun , vol.27 , pp. 187-207
    • Kawahara, H.1    Masuda-Katsuse, I.2    De Cheveigné, A.3
  • 34
    • 11144317887
    • Robust F0 estimation of speech signal using harmonicity measure based on instantaneous frequency
    • D. Arifianto, T. Tanaka, T. Masuko, and T. Kobayashi, "Robust F0 estimation of speech signal using harmonicity measure based on instantaneous frequency" IEICE Trans. Inf. Syst., vol. E87-D, no. 12, pp. 2812-2820, 2004 (Pubitemid 40021353)
    • (2004) IEICE Transactions on Information and Systems , vol.E87-D , Issue.12 , pp. 2812-2820
    • Arifianto, D.1    Tanaka, T.2    Masuko, T.3    Kobayashi, T.4
  • 35
    • 84928118106
    • Fixed point analysis of frequency to instantaneous frequency mapping for accurate estimation of F0 and periodicity
    • H. Kawahara, H. Katayose, A. de Cheveigné, and R. D. Patterson, "Fixed point analysis of frequency to instantaneous frequency mapping for accurate estimation of F0 and periodicity" in Proc. Eurospeech, 1999, pp. 2781-2784
    • (1999) Proc. Eurospeech , pp. 2781-2784
    • Kawahara, H.1    Katayose, H.2    De Cheveigné, A.3    Patterson, R.D.4
  • 36
    • 0001455934
    • A robust algorithm for pitch tracking (RAPT)
    • D. Talkin, "A robust algorithm for pitch tracking (RAPT)" Speech Coding Synth., pp. 495-518, 1995
    • (1995) Speech Coding Synth , pp. 495-518
    • Talkin, D.1
  • 39
    • 84897855414
    • Objective evaluation measures for speaker-adaptive HMM-TTS systems
    • U. Remes, R. Karhila, and M. Kurimo, "Objective evaluation measures for speaker-adaptive HMM-TTS systems" in Proc. SSW.
    • Proc. SSW
    • Remes, U.1    Karhila, R.2    Kurimo, M.3


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.