SCOPUS Information Retrieval Platform

IEEE Transactions on Audio, Speech and Language Processing

Volume 19, Issue 4, 2011, Pages 895-904

Unsupervised intralingual and cross-lingual speaker adaptation for HMM-Based speech synthesis using two-pass decision tree construction

(2) Gibson, Matthew (a); Byrne, William (a)

a UNIVERSITY OF CAMBRIDGE (United Kingdom)

Author keywords

Cross lingual; hidden Markov model (HMM) based speech synthesis; unsupervised speaker adaptation

Indexed keywords

ACOUSTIC MODEL; AUTOMATIC SPEECH RECOGNITION; CROSS-LINGUAL; DECISION TREE CONSTRUCTION; HMM-BASED SPEECH SYNTHESIS; LINGUISTIC ANALYSIS; SPEAKER ADAPTATION; SPEECH SYNTHESIS SYSTEM; SYNTHESIS MODELS; TRAINING DATASET; UNSUPERVISED ADAPTATION; UNSUPERVISED SPEAKER ADAPTATION;

DECISION TREES; HIDDEN MARKOV MODELS; SPEECH SYNTHESIS; TRANSCRIPTION;

SPEECH RECOGNITION;

EID: 79953289255 PISSN: 15587916 EISSN: None Source Type: Journal
DOI: 10.1109/TASL.2010.2066968 Document Type: Article

Times cited : (7)

References (27)

1
- 67650790758
- The blizzard challenge 2008
- V. Karaiskos, S. King, R. Clark, and C. Mayo, "The blizzard challenge 2008," in Proc. Blizzard, 2008.
- (2008) Proc. Blizzard
- Karaiskos, V.¹ King, S.² Clark, R.³ Mayo, C.⁴

2
- 0034230270
- Speaker interpolation for HMM-based speech synthesis system
- T. Yoshimura, T. Masuko, K. Tokuda, T. Kobayashi, and T. Kitamura, "Speaker interpolation for HMM-based speech synthesis system," J. Acoust. Soc. Japan, vol. 21, no. 4, pp. 199-206, 2000.
- (2000) J. Acoust. Soc. Japan , vol.21 , Issue.4 , pp. 199-206
- Yoshimura, T.¹ Masuko, T.² Tokuda, K.³ Kobayashi, T.⁴ Kitamura, T.⁵

3
- 24144497811
- Acoustic modeling of speaking styles and emotional expressions in HMM-based speech synthesis
- J. Yamagishi, K. Onishi, T. Masuko, and T. Kobayashi, "Acoustic modeling of speaking styles and emotional expressions in HMM-based speech synthesis," IEICE Trans. Inf. Syst., vol. E88-D, no. 3, pp. 503-509, 2005.
- (2005) IEICE Trans. Inf. Syst. , vol.E88-D , Issue.3 , pp. 503-509
- Yamagishi, J.¹ Onishi, K.² Masuko, T.³ Kobayashi, T.⁴

4
- 67650854725
- Analysis of speaker adaptation algorithms for HMM-based speech synthesis and a constrained SMAPLR adaptation algorithm
- J. Yamagishi, T. Kobayashi, Y. Nakano, K. Ogata, and J. Isogai, "Analysis of speaker adaptation algorithms for HMM-based speech synthesis and a constrained SMAPLR adaptation algorithm," IEEE Trans. Audio, Speech, Lang. Process., vol. 17, no. 1, pp. 66-83, Jan. 2009.
- (2009) IEEE Trans. Audio, Speech, Lang. Process. , vol.17 , Issue.1 , pp. 66-83
- Yamagishi, J.¹ Kobayashi, T.² Nakano, Y.³ Ogata, K.⁴ Isogai, J.⁵

5
- 84867203039
- Unsupervised adaptation for HMM-based speech synthesis
- S. King, K. Tokuda, H. Zen, and J. Yamagishi, "Unsupervised adaptation for HMM-based speech synthesis," in Proc. Interspeech, 2008, pp. 1869-1872.
- (2008) Proc. Interspeech , pp. 1869-1872
- King, S.¹ Tokuda, K.² Zen, H.³ Yamagishi, J.⁴

6
- 84856280064
- An evaluation of cross-language adaptation for rapid HMM development in a new language
- B. Wheatley, K. Kondo, W. Anderson, and Y. Muthusamy, "An evaluation of cross-language adaptation for rapid HMM development in a new language," in Proc. ICASSP, 1994, vol. 1, pp. 237-240.
- (1994) Proc. ICASSP , vol.1 , pp. 237-240
- Wheatley, B.¹ Kondo, K.² Anderson, W.³ Muthusamy, Y.⁴

7
- 0004659972
- MAP-based crosslanguage adaptation augmented by linguistic knowledge: From English to Chinese
- P. Fung, C. Y. Ma, and W. K. Liu, "MAP-based crosslanguage adaptation augmented by linguistic knowledge: From English to Chinese," in Proc. Eurospeech, 1999, pp. 871-874.
- (1999) Proc. Eurospeech , pp. 871-874
- Fung, P.¹ Ma, C.Y.² Liu, W.K.³

8
- 60849092922
- Cross-lingual speaker adaptation for HMM-based speech synthesis
- Y. Wu, S. King, and K. Tokuda, "Cross-lingual speaker adaptation for HMM-based speech synthesis," in Proc. ISCSLP, 2008, pp. 1-4.
- (2008) Proc. ISCSLP , pp. 1-4
- Wu, Y.¹ King, S.² Tokuda, K.³

9
- 70450192740
- State mapping based method for cross-lingual speaker adaptation in HMM-based speech synthesis
- Y. Wu, Y. Nankaku, and K. Tokuda, "State mapping based method for cross-lingual speaker adaptation in HMM-based speech synthesis," in Proc. Interspeech, 2009, pp. 528-531.
- (2009) Proc. Interspeech , pp. 528-531
- Wu, Y.¹ Nankaku, Y.² Tokuda, K.³

10
- 70449126171
- The HTS-2008 system: Yet another evaluation of the speaker-adaptive HMM-based speech synthesis system in the 2008 blizzard challenge
- J. Yamagishi, H. Zen, Y.-J. Wu, T. Toda, and K. Tokuda, "The HTS-2008 system: Yet another evaluation of the speaker-adaptive HMM-based speech synthesis system in the 2008 blizzard challenge," in Proc. Blizzard, 2008, p.
- (2008) Proc. Blizzard
- Yamagishi, J.¹ Zen, H.² Wu, Y.-J.³ Toda, T.⁴ Tokuda, K.⁵

11
- 79953281905
- M.Phil. dissertation, Cambridge Univ., Cambridge, U.K.
- M. Gibson, "Efficient maximum likelihood linear regression," M.Phil. dissertation, Cambridge Univ., Cambridge, U.K., 2004.
- (2004) Efficient Maximum Likelihood Linear Regression
- Gibson, M.¹

12
- 70450169407
- Speech recognition with speech synthesis models by marginalising over decision tree leaves
- J. Dines, L. Saheer, and H. Liang, "Speech recognition with speech synthesis models by marginalising over decision tree leaves," in Proc. Interspeech, 2009, pp. 1395-1398.
- (2009) Proc. Interspeech , pp. 1395-1398
- Dines, J.¹ Saheer, L.² Liang, H.³

13
- 70450185735
- Two-pass decision tree construction for unsupervised adaptation of HMM-based synthesis models
- M. Gibson, "Two-pass decision tree construction for unsupervised adaptation of HMM-based synthesis models," in Proc. Interspeech, 2009, pp. 1791-1794.
- (2009) Proc. Interspeech , pp. 1791-1794
- Gibson, M.¹

14
- 33846405723
- Details of the nitech HMM-based speech synthesis system for the blizzard challenge 2005
- DOI 10.1093/ietisy/e90-d.1.325
- H. Zen, T. Toda, M. Nakamura, and K. Tokuda, "Details of the nitech HMM-based speech synthesis system for the blizzard challenge 2005," IEICE Trans. Inf. Syst., vol. E90-D, no. 1, pp. 325-333, 2007. (Pubitemid 46145336)
- (2007) IEICE Transactions on Information and Systems , vol.E90-D , Issue.1 , pp. 325-333
- Zen, H.¹ Toda, T.² Nakamura, M.³ Tokuda, K.⁴

15
- 78049411002
- Unsupervised cross-lingual speaker adaptation for HMM-based speech synthesis using two-pass decision tree construction
- M. Gibson, T. Hirsimaki, R. Karhila, M. Kurimo, and W. Byrne, "Unsupervised cross-lingual speaker adaptation for HMM-based speech synthesis using two-pass decision tree construction," in Proc. ICASSP, 2010, pp. 4642-4645.
- (2010) Proc. ICASSP , pp. 4642-4645
- Gibson, M.¹ Hirsimaki, T.² Karhila, R.³ Kurimo, M.⁴ Byrne, W.⁵

16
- 0030362995
- A compact model for speaker adaptive training
- T. Anastasakos, J. McDonough, R. Schwartz, and J. Makhoul, "A compact model for speaker adaptive training," in Proc. ICSLP, Philadelphia, 1996, pp. 1137-1140.
- (1996) Proc. ICSLP, Philadelphia , pp. 1137-1140
- Anastasakos, T.¹ McDonough, J.² Schwartz, R.³ Makhoul, J.⁴

17
- 33846463597
- Ph.D. dissertation, Tokyo Inst. of Technol., Tokyo, Japan
- J. Yamagishi, "Average-voice-based speech synthesis," Ph.D. dissertation, Tokyo Inst. of Technol., Tokyo, Japan, 2006.
- (2006) Average-Voice-Based Speech Synthesis
- Yamagishi, J.¹

18
- 0032673049
- Restructuring speech representations using a pitch adaptive time-frequency smoothing and an instantaneous frequency-based F0 extraction: Possible role of a repetitive structure in sounds
- H. Kawahara, I. Masuda-Katsuse, and A. de Cheveigné, "Restructuring speech representations using a pitch adaptive time-frequency smoothing and an instantaneous frequency-based F0 extraction: Possible role of a repetitive structure in sounds," Speech Commun., vol. 27, pp. 187-207, 1999.
- (1999) Speech Commun. , vol.27 , pp. 187-207
- Kawahara, H.¹ Masuda-Katsuse, I.² de Cheveigné, A.³

19
- 0036522887
- Multi-space probability distribution HMM
- K. Tokuda, T. Masuko, N. Miyazaki, and T. Kobayashi, "Multi-space probability distribution HMM," IEICE Trans. Inf. Syst., vol. E85-D, no. 3, pp. 455-464, 2002. (Pubitemid 35353984)
- (2002) IEICE Transactions on Information and Systems , vol.E85-D , Issue.3 , pp. 455-464
- Tokuda, K.¹ Masuko, T.² Miyazaki, N.³ Kobayashi, T.⁴

20
- 44449177634
- Hidden semi-Markov model based speech synthesis system
- H. Zen, K. Tokuda, T. Masuko, T. Kobayashi, and T. Kitamura, "Hidden semi-Markov model based speech synthesis system," IEICE Trans. Inf. Syst., vol. E90-D, no. 5, pp. 825-834, 2007.
- (2007) IEICE Trans. Inf. Syst. , vol.E90-D , Issue.5 , pp. 825-834
- Zen, H.¹ Tokuda, K.² Masuko, T.³ Kobayashi, T.⁴ Kitamura, T.⁵

21
- 0033708106
- Speech parameter generation algorithms for HMM-based speech synthesis
- K. Tokuda, T. Yoshimura, T. Masuko, T. Kobayashi, and T. Kitamura, "Speech parameter generation algorithms for HMM-based speech synthesis," in Proc. ICASSP, 2000, pp. 1315-1318.
- (2000) Proc. ICASSP , pp. 1315-1318
- Tokuda, K.¹ Yoshimura, T.² Masuko, T.³ Kobayashi, T.⁴ Kitamura, T.⁵

22
- 38549096029
- A speech parameter generation algorithm considering global variance for HMM-based speech synthesis
- T. Toda and K. Tokuda, "A speech parameter generation algorithm considering global variance for HMM-based speech synthesis," IEICE Trans., vol. E90-D, no. 5, pp. 816-824, 2007.
- (2007) IEICE Trans. , vol.E90-D , Issue.5 , pp. 816-824
- Toda, T.¹ Tokuda, K.²

23
- 70450161300
- Thousands of voices for HMM-based speech synthesis
- J. Yamagishi, B. Usabaev, S. King, O. Watts, J. Dines, J. Tian, R. Hu, Y. Guan, K. Oura, K. Tokuda, R. Karhila, and M. Kurimo, "Thousands of voices for HMM-based speech synthesis," in Proc. Interspeech, 2009, pp. 420-423.
- (2009) Proc. Interspeech , pp. 420-423
- Yamagishi, J.¹ Usabaev, B.² King, S.³ Watts, O.⁴ Dines, J.⁵ Tian, J.⁶ Hu, R.⁷ Guan, Y.⁸ Oura, K.⁹ Tokuda, K.¹⁰ Karhila, R.¹¹ Kurimo, M.¹²

24
- 0029375590
- Speaker adaptation using constrained estimation of Gaussian mixtures
- V. Digalakis, D. Rtischev, and L. Neumeyer, "Speaker adaptation using constrained estimation of Gaussian mixtures," IEEE Trans. Speech Audio Process., vol. 3, no. 5, pp. 357-366, Sep. 1995.
- (1995) IEEE Trans. Speech Audio Process. , vol.3 , Issue.5 , pp. 357-366
- Digalakis, V.¹ Rtischev, D.² Neumeyer, L.³

25
- 0001859044
- A technique for the measurement of attitudes
- R. Likert, "A technique for the measurement of attitudes," Arch. Psychol., vol. 140, pp. 1-55, 1932.
- (1932) Arch. Psychol. , vol.140 , pp. 1-55
- Likert, R.¹

26
- 67650832556
- Statistical analysis of the Blizzard Challenge 2007 listening test results
- R. Clark, M. Podsiadlo, M. Fraser, C. Mayo, and S. King, "Statistical analysis of the Blizzard Challenge 2007 listening test results," in Proc. Blizzard, 2007.
- (2007) Proc. Blizzard
- Clark, R.¹ Podsiadlo, M.² Fraser, M.³ Mayo, C.⁴ King, S.⁵

27
- 44949230930
- Europarl: A parallel corpus for statistical machine translation
- P. Koehn, "Europarl: A parallel corpus for statistical machine translation," in Proc. MT Summit, 2005.
- (2005) Proc. MT Summit
- Koehn, P.¹

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.