



Volume 53, Issue 3, 2011, Pages 442-450

The Romanian speech synthesis (RSS) corpus: Building a high quality HMM-based speech synthesis system using a high sampling rate

Author keywords

Auditory scale; HMMs; HTS; Romanian; Sampling frequency; Speech synthesis

Indexed keywords

AUDITORY SCALE; HMMS; HTS; ROMANIAN; SAMPLING FREQUENCY;

EID: 79551478696     PISSN: 01676393     EISSN: None     Source Type: Journal    
DOI: 10.1016/j.specom.2010.12.002     Document Type: Article
Times cited : (66)

References (31)
  • 1
    • EID: 78049527800
    • The CereVoice characterful speech synthesiser SDK
    • Newcastle U.K
    • M. Aylett, and C. Pidcock The CereVoice characterful speech synthesiser SDK Proc. AISB 2007 2007 Newcastle U.K. 174 178
    • (2007) Proc. AISB 2007 , pp. 174-178
    • Aylett, M.1    Pidcock, C.2
  • 2
    • EID: 0030166343
    • The SUS test: A method for the assessment of text-to-speech synthesis intelligibility using semantically unpredictable sentences
    • C. Benoît, M. Grice, and V. Hazan The SUS test: a method for the assessment of text-to-speech synthesis intelligibility using semantically unpredictable sentences Speech Comm. 18 4 1996 381 392
    • (1996) Speech Comm. , vol.18 , Issue.4 , pp. 381-392
    • Benoît, C.1    Grice, M.2    Hazan, V.3
  • 3
    • EID: 84966398940
    • Optimising selection of units from speech database for concatenative synthesis
    • Black, A.; Campbell, N.; 1995. Optimising selection of units from speech database for concatenative synthesis. In: Proceedings of EUROSPEECH-95, pp. 581-584.
    • (1995) Proceedings of EUROSPEECH-95 , pp. 581-584
    • Black, A.1    Campbell, N.2
  • 5
    • EID: 34347397051
    • A parser-based text preprocessor for Romanian language TTS synthesis
    • Budapest, Hungary
    • Burileanu, D.; Dan, C.; Sima, M.; Burileanu, C.; 1999. A parser-based text preprocessor for Romanian language TTS synthesis. In: Proc. EUROSPEECH-99. Budapest, Hungary, pp. 2063-2066.
    • (1999) Proc. EUROSPEECH-99 , pp. 2063-2066
    • Burileanu, D.1    Dan, C.2    Sima, M.3    Burileanu, C.4
  • 6
    • EID: 0002629270
    • Maximum likelihood from incomplete data via the EM algorithm
    • A. Dempster, N. Laird, and D. Rubin Maximum likelihood from incomplete data via the EM algorithm J. Roy. Stat. Soc. Ser. B 39 1 1977 1 38
    • (1977) J. Roy. Stat. Soc. Ser. B , vol.39 , Issue.1 , pp. 1-38
    • Dempster, A.1    Laird, N.2    Rubin, D.3
  • 7
    • EID: 33645743720
    • Speech acoustics and phonetics: Selected writings
    • Springer Netherlands
    • G. Fant Speech acoustics and phonetics: selected writings Chapter Speech Perception 2005 Springer Netherlands 199 220
    • (2005) Chapter Speech Perception , pp. 199-220
    • Fant, G.1
  • 10
    • EID: 0029765811
    • Unit selection in a concatenative speech synthesis system using a large speech database
    • Hunt, A. and Black, A.; 1996. Unit selection in a concatenative speech synthesis system using a large speech database. In: Proceedings of ICASSP-96, pp. 373-376.
    • (1996) Proceedings of ICASSP-96 , pp. 373-376
    • Hunt, A.1    Black, A.2
  • 12
    • EID: 0032673049
    • Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds
    • H. Kawahara, I. Masuda-Katsuse, and A. Cheveigné Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: possible role of a repetitive structure in sounds Speech Comm. 27 1999 187 207
    • (1999) Speech Comm. , vol.27 , pp. 187-207
    • Kawahara, H.1    Masuda-Katsuse, I.2    Cheveigné, A.3
  • 13
    • EID: 84874199000
    • Aperiodicity extraction and control using mixed mode excitation and group delay manipulation for a high quality speech analysis, modification and synthesis system STRAIGHT
    • Kawahara, H.; Estill, J.; Fujimura, O.; 2001. Aperiodicity extraction and control using mixed mode excitation and group delay manipulation for a high quality speech analysis, modification and synthesis system STRAIGHT. In: 2nd MAVEBA.
    • (2001) 2nd MAVEBA
    • Kawahara, H.1    Estill, J.2    Fujimura, O.3
  • 14
    • EID: 0347707081
    • Sampling-frequency considerations in digital audio
    • T. Muraoka, Y. Yamada, and M. Yamazaki Sampling-frequency considerations in digital audio J. Audio Eng. Soc. 26 4 1978 252 256
    • (1978) J. Audio Eng. Soc. , vol.26 , Issue.4 , pp. 252-256
    • Muraoka, T.1    Yamada, Y.2    Yamazaki, M.3
  • 15
    • EID: 44949143155
    • Maximum likelihood voice conversion based on GMM with STRAIGHT mixed excitation
    • Ohtani, Y.; Toda, T.; Saruwatari, H.; Shikano, K.; 2006. Maximum likelihood voice conversion based on GMM with STRAIGHT mixed excitation. In: Proceedings of Interspeech 2006, pp. 2266-2269.
    • (2006) Proceedings of Interspeech 2006 , pp. 2266-2269
    • Ohtani, Y.1    Toda, T.2    Saruwatari, H.3    Shikano, K.4
  • 16
    • EID: 0016938506
    • Auditory filter shapes derived with noise stimuli
    • R. Patterson Auditory filter shapes derived with noise stimuli J. Acous. Soc. Amer. 76 1982 640 654
    • (1982) J. Acous. Soc. Amer. , vol.76 , pp. 640-654
    • Patterson, R.1
  • 17
    • EID: 0033906251
    • MDL-based context-dependent subword modeling for speech recognition
    • K. Shinoda, and T. Watanabe MDL-based context-dependent subword modeling for speech recognition J. Acous. Soc. Jpn. (E) 21 2000 79 86
    • (2000) J. Acous. Soc. Jpn. (E) , vol.21 , pp. 79-86
    • Shinoda, K.1    Watanabe, T.2
  • 19
    • EID: 0001310760
    • Spectral estimation of speech based on mel-cepstral representation
    • K. Tokuda, T. Kobayashi, T. Fukada, H. Saito, and S. Imai Spectral estimation of speech based on mel-cepstral representation IEICE Trans. Fundam. J74-A 8 1991 1240 1248 (in Japanese)
    • (1991) IEICE Trans. Fundam. , vol.74 , Issue.8 , pp. 1240-1248
    • Tokuda, K.1    Kobayashi, T.2    Fukada, T.3    Saito, H.4    Imai, S.5
  • 20
    • EID: 85131821539
    • Mel-generalized cepstral analysis - A unified approach to speech spectral estimation
    • Yokohama, Japan
    • Tokuda, K.; Kobayashi, T.; Masuko, T.; Imai, S.; 1994a. Mel-generalized cepstral analysis - a unified approach to speech spectral estimation. In: Proceedings of ICSLP-94, Yokohama, Japan, pp. 1043-1046.
    • (1994) Proceedings of ICSLP-94 , pp. 1043-1046
    • Tokuda, K.1    Kobayashi, T.2    Masuko, T.3    Imai, S.4
  • 23
    • EID: 78049403515
    • Simple methods for improving speaker-similarity of HMM-based speech synthesis
    • Dallas, TX
    • Yamagishi, J.; King, S.; 2010. Simple methods for improving speaker-similarity of HMM-based speech synthesis. In: Proceedings of ICASSP 2010. Dallas, TX, pp. 4610-4613.
    • (2010) Proceedings of ICASSP 2010 , pp. 4610-4613
    • Yamagishi, J.1    King, S.2
  • 24
    • EID: 70449126171
    • The HTS-2008 system: Yet another evaluation of the speaker-adaptive HMM-based speech synthesis system in the 2008 Blizzard Challenge
    • Brisbane, Australia
    • Yamagishi, J.; Zen, H.; Wu, Y.-J.; Toda, T.; Tokuda, K.; 2008a. The HTS-2008 system: Yet another evaluation of the speaker-adaptive HMM-based speech synthesis system in the 2008 Blizzard Challenge. In: Proceedings of Blizzard Challenge 2008, Brisbane, Australia.
    • (2008) Proceedings of Blizzard Challenge 2008
    • Yamagishi, J.1    Zen, H.2    Wu, Y.-J.3    Toda, T.4    Tokuda, K.5
  • 25
    • EID: 84867223798
    • Robustness of HMM-based speech synthesis
    • Brisbane, Australia
    • Yamagishi, J.; Ling, Z.; King, S.; 2008b. Robustness of HMM-based speech synthesis. In: Proceedings of Interspeech 2008. Brisbane, Australia, pp. 581-584.
    • (2008) Proceedings of Interspeech 2008 , pp. 581-584
    • Yamagishi, J.1    Ling, Z.2    King, S.3
  • 27
    • EID: 33846405723
    • Details of Nitech HMM-based speech synthesis system for the Blizzard Challenge 2005
    • H. Zen, T. Toda, M. Nakamura, and K. Tokuda Details of Nitech HMM-based speech synthesis system for the Blizzard Challenge 2005 IEICE Trans. Inf. & Syst. E90-D 1 2007 325 333
    • (2007) IEICE Trans. Inf. & Syst. , vol.90 , Issue.1 , pp. 325-333
    • Zen, H.1    Toda, T.2    Nakamura, M.3    Tokuda, K.4
  • 30
    • EID: 67651002140
    • Statistical parametric speech synthesis
    • H. Zen, K. Tokuda, and A.W. Black Statistical parametric speech synthesis Speech Comm. 51 11 2009 1039 1064
    • (2009) Speech Comm. , vol.51 , Issue.11 , pp. 1039-1064
    • Zen, H.1    Tokuda, K.2    Black, A.W.3
  • 31
    • EID: 0002648826
    • A model of loudness summation
    • E. Zwicker, and B. Scharf A model of loudness summation Psych. Rev. 72 1965 2 26
    • (1965) Psych. Rev. , vol.72 , pp. 2-26
    • Zwicker, E.1    Scharf, B.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.