Volume 18, Issue 5, 2010, Pages 984-1004

Thousands of voices for HMM-based speech synthesis - Analysis and application of TTS systems built on various ASR corpora

Author keywords

Automatic speech recognition (ASR); Average voice; H Triple S (HTS); Hidden Markov model (HMM) based speech synthesis; Speaker adaptation; Speech synthesis; SPEECON database; Voice conversion; WSJ database

Indexed keywords

AUTOMATIC SPEECH RECOGNITION; AUTOMATIC SPEECH RECOGNITION (ASR); AVERAGE VOICE; SPEAKER ADAPTATION; VOICE CONVERSION;

EID: 77953708096     PISSN: 15587916     EISSN: None     Source Type: Journal    
DOI: 10.1109/TASL.2010.2045237     Document Type: Article
Times cited : (74)

References (70)
  • 1
    • 70450161300
    • Thousands of voices for HMM-based speech synthesis
    • Brighton, U.K., Sep.
    • J. Yamagishi et al., "Thousands of voices for HMM-based speech synthesis," in Proc. Interspeech-09, Brighton, U.K., Sep. 2009, pp. 420-423.
    • (2009) Proc. Interspeech-09 , pp. 420-423
    • Yamagishi, J.1
  • 2
    • 85009139544
    • Simultaneous modeling of spectrum, pitch and duration in HMM-based speech synthesis
    • Budapest, Hungary, Sep.
    • T. Yoshimura, K. Tokuda, T. Masuko, T. Kobayashi, and T. Kitamura, "Simultaneous modeling of spectrum, pitch and duration in HMM-based speech synthesis," in Proc. EUROSPEECH-99, Budapest, Hungary, Sep. 1999, pp. 2347-2350.
    • (1999) Proc. EUROSPEECH-99 , pp. 2347-2350
    • Yoshimura, T.1    Tokuda, K.2    Masuko, T.3    Kobayashi, T.4    Kitamura, T.5
  • 6
    • 84867223798
    • Robustness of HMM-based speech synthesis
    • Brisbane, Australia, Sep.
    • J. Yamagishi, Z.-H. Ling, and S. King, "Robustness of HMM-based speech synthesis," in Proc. Interspeech-08, Brisbane, Australia, Sep. 2008, pp. 581-584.
    • (2008) Proc. Interspeech-08 , pp. 581-584
    • Yamagishi, J.1    Ling, Z.-H.2    King, S.3
  • 7
    • 67651002140
    • Statistical parametric speech synthesis
    • Nov.
    • H. Zen, K. Tokuda, and A. W. Black, "Statistical parametric speech synthesis," Speech Commun., vol.51, no.11, pp. 1039-1064, Nov. 2009.
    • (2009) Speech Commun , vol.51 , Issue.11 , pp. 1039-1064
    • Zen, H.1    Tokuda, K.2    Black, A.W.3
  • 8
    • 0012330750
    • The design for the Wall Street Journal-based CSR corpus
    • Harriman, NY
    • D. B. Paul and J. M. Baker, "The design for the Wall Street Journal-based CSR corpus," in Proc. Workshop Speech Natural Lang., Harriman, NY, 1992, pp. 357-362.
    • (1992) Proc. Workshop Speech Natural Lang. , pp. 357-362
    • Paul, D.B.1    Baker, J.M.2
  • 9
    • 0028996854
    • WSJCAM0: A British English speech corpus for large vocabulary continuous speech recognition
    • Detroit, MI, May
    • T. Robinson, J. Fransen, D. Pye, J. Foote, and S. Renals, "WSJCAM0: A British English speech corpus for large vocabulary continuous speech recognition," in Proc. ICASSP-95, Detroit, MI, May 1995, pp. 81-84.
    • (1995) Proc. ICASSP-95 , pp. 81-84
    • Robinson, T.1    Fransen, J.2    Pye, D.3    Foote, J.4    Renals, S.5
  • 11
    • 85009274666
    • GlobalPhone: A multilingual speech and text database developed at Karlsruhe university
    • Denver, CO, Sep.
    • T. Schultz, "GlobalPhone: A multilingual speech and text database developed at Karlsruhe university," in Proc. ICSLP'02, Denver, CO, Sep. 2002, pp. 345-348.
    • (2002) Proc. ICSLP'02 , pp. 345-348
    • Schultz, T.1
  • 12
    • 84910032186
    • SPEECON-speech databases for consumer devices: Database specification and validation
    • Canary Islands, Spain, May
    • D. Iskra, B. Grosskopf, K. Marasek, H. V. D. Heuvel, F. Diehl, and A. Kiessling, "SPEECON-speech databases for consumer devices: Database specification and validation," in Proc. LREC'02, Canary Islands, Spain, May 2002, pp. 329-333.
    • (2002) Proc. LREC'02 , pp. 329-333
    • Iskra, D.1    Grosskopf, B.2    Marasek, K.3    Heuvel, H.V.D.4    Diehl, F.5    Kiessling, A.6
  • 13
    • 70349227947
    • The application of hidden Markov models in speech recognition
    • M. J. F. Gales and S. J. Young, "The application of hidden Markov models in speech recognition," Foundations Trends Signal Process., vol.1, no.3, pp. 195-304, 2008.
    • (2008) Foundations Trends Signal Process , vol.1 , Issue.3 , pp. 195-304
    • Gales, M.J.F.1    Young, S.J.2
  • 14
    • 85128361526
    • The design of the newspaper-based Japanese large vocabulary continuous speech recognition corpus
    • Sydney, Australia, Dec.
    • K. Itou, M. Yamamoto, K. Takeda, T. Takezawa, T. Matsuoka, T. Kobayashi, K. Shikano, and S. Itahashi, "The design of the newspaper-based Japanese large vocabulary continuous speech recognition corpus," in Proc. ICSLP-98, Sydney, Australia, Dec. 1998, pp. 3261-3264.
    • (1998) Proc. ICSLP-98 , pp. 3261-3264
    • Itou, K.1    Yamamoto, M.2    Takeda, K.3    Takezawa, T.4    Matsuoka, T.5    Kobayashi, T.6    Shikano, K.7    Itahashi, S.8
  • 15
    • 0002985991
    • Mora and syllable
    • N. Tsujimura, Ed. New York: Blackwell
    • H. Kubozono, "Mora and syllable," in The Handbook of Japanese Linguistics, N. Tsujimura, Ed. New York: Blackwell, 1995, pp. 31-61.
    • (1995) The Handbook of Japanese Linguistics , pp. 31-61
    • Kubozono, H.1
  • 16
    • 85030493378
    • Synthesis of regional English using a keyword lexicon
    • Budapest, Hungary, Sep.
    • S. Fitt and S. Isard, "Synthesis of regional English using a keyword lexicon," in Proc. Eurospeech-99, Budapest, Hungary, Sep. 1999, vol.2, pp. 823-826.
    • (1999) Proc. Eurospeech-99 , vol.2 , pp. 823-826
    • Fitt, S.1    Isard, S.2
  • 17
    • 34047123652
    • Multisyn: Open-domain unit selection for the Festival speech synthesis system
    • R. A. J. Clark, K. Richmond, and S. King, "Multisyn: Open-domain unit selection for the Festival speech synthesis system," Speech Commun., vol.49, no.4, pp. 317-330, 2007.
    • (2007) Speech Commun , vol.49 , Issue.4 , pp. 317-330
    • Clark, R.A.J.1    Richmond, K.2    King, S.3
  • 18
    • 77953725740
    • [Online]. Available:
    • [Online]. Available: http://www.lc-star.com
  • 19
    • 77249139677
    • An HMM-based Mandarin Chinese text-to-speech system
    • Singapore, Dec.
    • Y. Qian, F. Soong, Y. Chen, and M. Chu, "An HMM-based Mandarin Chinese text-to-speech system," in Proc. ISCSLP'06, Singapore, Dec. 2006, pp. 223-232.
    • (2006) Proc. ISCSLP'06 , pp. 223-232
    • Qian, Y.1    Soong, F.2    Chen, Y.3    Chu, M.4
  • 20
    • 77953713775
    • Deliverable Report D2.1 EMIME Project, 2008
    • Deliverable Report D2.1 EMIME Project, 2008.
  • 21
  • 22
    • 85123861026
    • XIMERA: A new TTS from ATR based on corpus-based technologies
    • Pittsburgh, PA, Jun.
    • H. Kawai, T. Toda, J. Ni, M. Tsuzaki, and K. Tokuda, "XIMERA: A new TTS from ATR based on corpus-based technologies," in Proc. ISCA 5th Speech Synth. Workshop, Pittsburgh, PA, Jun. 2004, pp. 179-184.
    • (2004) Proc. ISCA 5th Speech Synth. Workshop , pp. 179-184
    • Kawai, H.1    Toda, T.2    Ni, J.3    Tsuzaki, M.4    Tokuda, K.5
  • 25
    • 77949915957
    • Generacion de una voz sintetica en Castellano basada en HSMM para la Evaluacion Albayzin 2008: Conversion texto a voz
    • Bilbao, Spain, Nov. [Online]. Available:
    • R. Barra-Chicote, J. Yamagishi, J. Montero, S. King, S. Lutfi, and J. Macias-Guarasa, "Generacion de una voz sintetica en Castellano basada en HSMM para la Evaluacion Albayzin 2008: Conversion texto a voz," in V Jornadas en Tecnologia del Habla (in Spanish), Bilbao, Spain, Nov. 2008, pp. 115-118 [Online]. Available: http://www.cstr.inf.ed.ac.uk/downloads/publications/2008/tts-jth08.pdf
    • (2008) V Jornadas en Tecnologia Del Habla (In Spanish) , pp. 115-118
    • Barra-Chicote, R.1    Yamagishi, J.2    Montero, J.3    King, S.4    Lutfi, S.5    Macias-Guarasa, J.6
  • 26
    • 33645758767
    • HMM-based approach to multilingual speech synthesis
    • S. Narayanan and A. Alwan, Eds. Upper Saddle River, NJ: Prentice-Hall
    • K. Tokuda, H. Zen, and A. W. Black, "HMM-based approach to multilingual speech synthesis," in Text to Speech Synthesis: New Paradigms and Advances, S. Narayanan and A. Alwan, Eds. Upper Saddle River, NJ: Prentice-Hall, 2004.
    • (2004) Text to Speech Synthesis: New Paradigms and Advances
    • Tokuda, K.1    Zen, H.2    Black, A.W.3
  • 27
    • 0002144369
    • Tree-based state tying for high accuracy acoustic modeling
    • Plainsboro, NJ, Mar.
    • S. J. Young, J. J. Odell, and P. C. Woodland, "Tree-based state tying for high accuracy acoustic modeling," in Proc. ARPA Human Lang. Technol. Workshop, Plainsboro, NJ, Mar. 1994, pp. 307-312.
    • (1994) Proc. ARPA Human Lang. Technol. Workshop , pp. 307-312
    • Young, S.J.1    Odell, J.J.2    Woodland, P.C.3
  • 28
    • 70449126171
    • The HTS-2008 system: Yet another evaluation of the speaker-adaptive HMM-based speech synthesis system in the 2008 Blizzard Challenge
    • Brisbane, Australia, Sep.
    • J. Yamagishi, H. Zen, Y.-J. Wu, T. Toda, and K. Tokuda, "The HTS-2008 system: Yet another evaluation of the speaker-adaptive HMM-based speech synthesis system in the 2008 Blizzard Challenge," in Proc. Blizzard Challenge 2008, Brisbane, Australia, Sep. 2008.
    • (2008) Proc. Blizzard Challenge 2008
    • Yamagishi, J.1    Zen, H.2    Wu, Y.-J.3    Toda, T.4    Tokuda, K.5
  • 31
    • 0032673049
    • Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds
    • H. Kawahara, I. Masuda-Katsuse, and A. Cheveigné, "Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds," Speech Commun., vol.27, pp. 187-207, 1999.
    • (1999) Speech Commun , vol.27 , pp. 187-207
    • Kawahara, H.1    Masuda-Katsuse, I.2    Cheveigné, A.3
  • 32
    • 33846405723
    • Details of Nitech HMM-based speech synthesis system for the Blizzard Challenge 2005
    • Jan.
    • H. Zen, T. Toda, M. Nakamura, and K. Tokuda, "Details of Nitech HMM-based speech synthesis system for the Blizzard Challenge 2005," IEICE Trans. Inf. Syst., vol.E90-D, no.1, pp. 325-333, Jan. 2007.
    • (2007) IEICE Trans. Inf. Syst. , vol.E90-D , Issue.1 , pp. 325-333
    • Zen, H.1    Toda, T.2    Nakamura, M.3    Tokuda, K.4
  • 33
    • 44449177634
    • A hidden semi-Markov model-based speech synthesis system
    • May
    • H. Zen, K. Tokuda, T. Masuko, T. Kobayashi, and T. Kitamura, "A hidden semi-Markov model-based speech synthesis system," IEICE Trans. Inf. Syst., vol.E90-D, no.5, pp. 825-834, May 2007.
    • (2007) IEICE Trans. Inf. Syst. , vol.E90-D , Issue.5 , pp. 825-834
    • Zen, H.1    Tokuda, K.2    Masuko, T.3    Kobayashi, T.4    Kitamura, T.5
  • 34
    • 0002629270
    • Maximum likelihood from incomplete data via the EM algorithm
    • A. Dempster, N. Laird, and D. Rubin, "Maximum likelihood from incomplete data via the EM algorithm," J. R. Statist. Soc., Series B, vol.39, no.1, pp. 1-38, 1977.
    • (1977) J. R. Statist. Soc., Series B , vol.39 , Issue.1 , pp. 1-38
    • Dempster, A.1    Laird, N.2    Rubin, D.3
  • 35
    • 0033906251
    • MDL-based context-dependent subword modeling for speech recognition
    • Mar.
    • K. Shinoda and T. Watanabe, "MDL-based context-dependent subword modeling for speech recognition," J. Acoust. Soc. Japan (E), vol.21, pp. 79-86, Mar. 2000.
    • (2000) J. Acoust. Soc. Japan (E) , vol.21 , pp. 79-86
    • Shinoda, K.1    Watanabe, T.2
  • 36
    • 77953719894
    • Evaluation of flat start labeling for phoneme based Mandarin HTS system
    • Aug.
    • Y. Guan and J. Tian, "Evaluation of flat start labeling for phoneme based Mandarin HTS system," in Proc. ORIENTAL-COCOSDA-09, Aug. 2009, pp. 187-190.
    • (2009) Proc. ORIENTAL-COCOSDA-09 , pp. 187-190
    • Guan, Y.1    Tian, J.2
  • 37
    • 0030362995
    • A compact model for speaker-adaptive training
    • Philadelphia, PA, Oct.
    • T. Anastasakos, J. McDonough, R. Schwartz, and J. Makhoul, "A compact model for speaker-adaptive training," in Proc. ICSLP-96, Philadelphia, PA, Oct. 1996, pp. 1137-1140.
    • (1996) Proc. ICSLP-96 , pp. 1137-1140
    • Anastasakos, T.1    McDonough, J.2    Schwartz, R.3    Makhoul, J.4
  • 38
    • 0032050110
    • Maximum likelihood linear transformations for HMM-based speech recognition
    • M. J. F. Gales, "Maximum likelihood linear transformations for HMM-based speech recognition," Comput. Speech Lang., vol. 12, no. 2, pp. 75-98, 1998.
    • (1998) Comput. Speech Lang. , vol.12 , Issue.2 , pp. 75-98
    • Gales, M.J.F.1
  • 39
    • 67650854725
    • Analysis of speaker adaptation algorithms for HMM-based speech synthesis and a constrained SMAPLR adaptation algorithm
    • Jan.
    • J. Yamagishi, T. Kobayashi, Y. Nakano, K. Ogata, and J. Isogai, "Analysis of speaker adaptation algorithms for HMM-based speech synthesis and a constrained SMAPLR adaptation algorithm," IEEE Trans. Audio, Speech, Lang. Process., vol. 17, no. 1, pp. 66-83, Jan. 2009.
    • (2009) IEEE Trans. Audio, Speech, Lang. Process. , vol.17 , Issue.1 , pp. 66-83
    • Yamagishi, J.1    Kobayashi, T.2    Nakano, Y.3    Ogata, K.4    Isogai, J.5
  • 40
    • 38549096029
    • A speech parameter generation algorithm considering global variance for HMM-based speech synthesis
    • May
    • T. Toda and K. Tokuda, "A speech parameter generation algorithm considering global variance for HMM-based speech synthesis," IEICE Trans. Inf. Syst., vol.E90-D, no.5, pp. 816-824, May 2007.
    • (2007) IEICE Trans. Inf. Syst. , vol.E90-D , Issue.5 , pp. 816-824
    • Toda, T.1    Tokuda, K.2
  • 41
    • 0025543906
    • Pitch-synchronous waveform processing techniques for text-to-speech synthesis using diphones
    • E. Moulines and F. Charpentier, "Pitch-synchronous waveform processing techniques for text-to-speech synthesis using diphones," Speech Commun., vol.9, no.5-6, pp. 453-468, 1990.
    • (1990) Speech Commun , vol.9 , Issue.5-6 , pp. 453-468
    • Moulines, E.1    Charpentier, F.2
  • 42
    • 85016140477
    • An adaptive algorithm for mel-cepstral analysis of speech
    • San Francisco, CA, Mar.
    • T. Fukada, K. Tokuda, T. Kobayashi, and S. Imai, "An adaptive algorithm for mel-cepstral analysis of speech," in Proc. ICASSP-92, San Francisco, CA, Mar. 1992, pp. 137-140.
    • (1992) Proc. ICASSP-92 , pp. 137-140
    • Fukada, T.1    Tokuda, K.2    Kobayashi, T.3    Imai, S.4
  • 44
    • 33847129573
    • Average-voice-based speech synthesis using HSMM-based speaker adaptation and adaptive training
    • Feb.
    • J. Yamagishi and T. Kobayashi, "Average-voice-based speech synthesis using HSMM-based speaker adaptation and adaptive training," IEICE Trans. Inf. Syst., vol.E90-D, no.2, pp. 533-543, Feb. 2007.
    • (2007) IEICE Trans. Inf. Syst. , vol.E90-D , Issue.2 , pp. 533-543
    • Yamagishi, J.1    Kobayashi, T.2
  • 45
    • 70450183638
    • Measuring the gap between HMM-based ASR and TTS
    • Brighton, U.K., Sep.
    • J. Dines, J. Yamagishi, and S. King, "Measuring the gap between HMM-based ASR and TTS," in Proc. Interspeech-09, Brighton, U.K., Sep. 2009, pp. 1391-1394.
    • (2009) Proc. Interspeech-09 , pp. 1391-1394
    • Dines, J.1    Yamagishi, J.2    King, S.3
  • 48
    • 60849092922
    • Cross-lingual speaker adaptation for HMM-based speech synthesis
    • Kunming, China
    • Y.-J. Wu, S. King, and K. Tokuda, "Cross-lingual speaker adaptation for HMM-based speech synthesis," in Proc. ISCSLP-08, Kunming, China, 2008, pp. 9-12.
    • (2008) Proc. ISCSLP-08 , pp. 9-12
    • Wu, Y.-J.1    King, S.2    Tokuda, K.3
  • 49
    • 70450192740
    • State mapping based method for cross-lingual speaker adaptation in HMM-based speech synthesis
    • Brighton, U.K., Sep.
    • Y.-J. Wu and K. Tokuda, "State mapping based method for cross-lingual speaker adaptation in HMM-based speech synthesis," in Proc. Interspeech-09, Brighton, U.K., Sep. 2009, pp. 528-531.
    • (2009) Proc. Interspeech-09 , pp. 528-531
    • Wu, Y.-J.1    Tokuda, K.2
  • 50
  • 51
    • 0019146354
    • Correlation analysis of subjective and objective measures for speech quality
    • Denver, CO
    • T. P. Barnwell, III, "Correlation analysis of subjective and objective measures for speech quality," in Proc. ICASSP-80, Denver, CO, 1980, pp. 706-709.
    • (1980) Proc. ICASSP-80 , pp. 706-709
    • Barnwell III, T.P.1
  • 52
    • 0029725605
    • Speech synthesis using HMMs with dynamic features
    • Atlanta, GA, May
    • T. Masuko, K. Tokuda, T. Kobayashi, and S. Imai, "Speech synthesis using HMMs with dynamic features," in Proc. ICASSP-96, Atlanta, GA, May 1996, pp. 389-392.
    • (1996) Proc. ICASSP-96 , pp. 389-392
    • Masuko, T.1    Tokuda, K.2    Kobayashi, T.3    Imai, S.4
  • 53
    • 70349208664
    • Optimizing segment label boundaries for statistical speech synthesis
    • Taipei, Taiwan, Apr.
    • A. W. Black and J. Kominek, "Optimizing segment label boundaries for statistical speech synthesis," in Proc. ICASSP-09, Taipei, Taiwan, Apr. 2009, pp. 3785-3788.
    • (2009) Proc. ICASSP-09 , pp. 3785-3788
    • Black, A.W.1    Kominek, J.2
  • 55
    • 33646800617
    • Analysis of speaking styles by two-dimensional visualization of aggregate of acoustic models
    • Jeju Island, Korea, Oct.
    • M. Shozakai and G. Nagino, "Analysis of speaking styles by two-dimensional visualization of aggregate of acoustic models," in Proc. ICSLP-04, Jeju Island, Korea, Oct. 2004, pp. 717-720.
    • (2004) Proc. ICSLP-04 , pp. 717-720
    • Shozakai, M.1    Nagino, G.2
  • 58
    • 33646781551
    • Acoustic training from heterogeneous data sources: Experiments in Mandarin conversational telephone speech transcription
    • S. Tsakalidis and W. Byrne, "Acoustic training from heterogeneous data sources: Experiments in Mandarin conversational telephone speech transcription," in Proc. ICASSP-05, 18-23, 2005, vol.1, pp. 461-464.
    • (2005) Proc. ICASSP-05, 18-23 , vol.1 , pp. 461-464
    • Tsakalidis, S.1    Byrne, W.2
  • 59
    • 77953712724
    • Cross-corpus normalization of diverse acoustic training data for robust HMM training
    • Cambridge, U.K.
    • S. Tsakalidis and W. Byrne, "Cross-corpus normalization of diverse acoustic training data for robust HMM training," Cambridge Univ. Eng. Dept., Cambridge, U.K., 2005.
    • (2005) Cambridge Univ. Eng. Dept.
    • Tsakalidis, S.1    Byrne, W.2
  • 62
    • 84970205467
    • Attractive faces are only average
    • J. H. Langlois and L. A. Roggman, "Attractive faces are only average," Psychol. Sci., vol.1, no.2, pp. 115-121, 1990.
    • (1990) Psychol. Sci. , vol.1 , Issue.2 , pp. 115-121
    • Langlois, J.H.1    Roggman, L.A.2
  • 63
    • 77953710433
    • Analysis of unsupervised and noise-robust speaker-adaptive HMM-based speech synthesis systems toward a unified ASR and TTS framework
    • Edinburgh, U.K., Sep.
    • J. Yamagishi, M. Lincoln, S. King, J. Dines, M. Gibson, J. Tian, and Y. Guan, "Analysis of unsupervised and noise-robust speaker-adaptive HMM-based speech synthesis systems toward a unified ASR and TTS framework," in Proc. Blizzard Challenge Workshop, Edinburgh, U.K., Sep. 2009.
    • (2009) Proc. Blizzard Challenge Workshop
    • Yamagishi, J.1    Lincoln, M.2    King, S.3    Dines, J.4    Gibson, M.5    Tian, J.6    Guan, Y.7
  • 64
    • 85131821539
    • Mel-generalized cepstral analysis-A unified approach to speech spectral estimation
    • Yokohama, Japan, Sep.
    • K. Tokuda, T. Kobayashi, T. Masuko, and S. Imai, "Mel-generalized cepstral analysis-A unified approach to speech spectral estimation," in Proc. ICSLP-94, Yokohama, Japan, Sep. 1994, pp. 1043-1046.
    • (1994) Proc. ICSLP-94 , pp. 1043-1046
    • Tokuda, K.1    Kobayashi, T.2    Masuko, T.3    Imai, S.4
  • 66
    • 0036567794
    • The development of the HTK broadcast news transcription system: An overview
    • P. C. Woodland, "The development of the HTK broadcast news transcription system: An overview," Speech Commun., vol.37, no.1-2, pp. 47-67, 2002.
    • (2002) Speech Commun , vol.37 , Issue.1-2 , pp. 47-67
    • Woodland, P.C.1
  • 67
    • 77953693885
    • Building personalised synthesised voices for individuals with dysarthria using the HTS toolkit
    • J. W. Mullennix and S. E. Stern, Eds. Hershey, PA: IGI Global, Jan.
    • S. Creer, P. Green, S. Cunningham, and J. Yamagishi, "Building personalised synthesised voices for individuals with dysarthria using the HTS toolkit," in Computer Synthesized Speech Technologies: Tools for Aiding Impairment, J. W. Mullennix and S. E. Stern, Eds. Hershey, PA: IGI Global, Jan. 2010.
    • (2010) Computer Synthesized Speech Technologies: Tools for Aiding Impairment
    • Creer, S.1    Green, P.2    Cunningham, S.3    Yamagishi, J.4
  • 68
    • 85135274466
    • On the security of HMM-based speaker verification systems against imposture using synthetic speech
    • Budapest, Hungary, Sep.
    • T. Masuko, T. Hitotsumatsu, K. Tokuda, and T. Kobayashi, "On the security of HMM-based speaker verification systems against imposture using synthetic speech," in Proc. Eurospeech-99, Budapest, Hungary, Sep. 1999, pp. 1223-1226.
    • (1999) Proc. Eurospeech-99 , pp. 1223-1226
    • Masuko, T.1    Hitotsumatsu, T.2    Tokuda, K.3    Kobayashi, T.4
  • 69
    • 85009077529
    • Imposture using synthetic speech against speaker verification based on spectrum and pitch
    • Beijing, China, Oct.
    • T. Masuko, K. Tokuda, and T. Kobayashi, "Imposture using synthetic speech against speaker verification based on spectrum and pitch," in Proc. ICSLP-00, Beijing, China, Oct. 2000, pp. 302-305.
    • (2000) Proc. ICSLP-00 , pp. 302-305
    • Masuko, T.1    Tokuda, K.2    Kobayashi, T.3
  • 70
    • 78049409687
    • Revisiting the security of speaker verification systems against imposture using synthetic speech
    • Dallas, TX, Mar.
    • P. L. De Leon, V. R. Apsingekar, M. Pucher, and J. Yamagishi, "Revisiting the security of speaker verification systems against imposture using synthetic speech," in Proc. ICASSP-10, Dallas, TX, Mar. 2010.
    • (2010) Proc. ICASSP-10
    • De Leon, P.L.1    Apsingekar, V.R.2    Pucher, M.3    Yamagishi, J.4


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.