SCOPUS Information Retrieval Platform

Volume 53, Issue 3, 2011, Pages 442-450

The Romanian speech synthesis (RSS) corpus: Building a high quality HMM-based speech synthesis system using a high sampling rate

Authors (4): Stan, Adriana (b); Yamagishi, Junichi (a); King, Simon (a); Aylett, Matthew (c)

a UNIVERSITY OF EDINBURGH (United Kingdom)

b TECHNICAL UNIVERSITY OF CLUJ NAPOCA (Romania)

c CEREPROC LTD (United Kingdom)

Author keywords

Auditory scale; HMMs; HTS; Romanian; Sampling frequency; Speech synthesis

Indexed keywords

AUDITORY SCALE; HMMS; HTS; ROMANIAN; SAMPLING FREQUENCY;

FEATURE EXTRACTION; SPEECH SYNTHESIS; TEXT PROCESSING; WORD PROCESSING;

SPEECH RECOGNITION;

EID: 79551478696 PISSN: 01676393 EISSN: None Source Type: Journal
DOI: 10.1016/j.specom.2010.12.002 Document Type: Article

Times cited: 66

References (31)

1
- 78049527800
- The CereVoice characterful speech synthesiser SDK
- Newcastle, U.K.
- M. Aylett, and C. Pidcock The CereVoice characterful speech synthesiser SDK Proc. AISB 2007 2007 Newcastle U.K. 174 178
- (2007) Proc. AISB 2007 , pp. 174-178
- Aylett, M.¹ Pidcock, C.²

2
- 0030166343
- The SUS test: A method for the assessment of text-to-speech synthesis intelligibility using semantically unpredictable sentences
- C. Benoît, M. Grice, and V. Hazan The SUS test: a method for the assessment of text-to-speech synthesis intelligibility using semantically unpredictable sentences Speech Comm. 18 4 1996 381 392
- (1996) Speech Comm. , vol.18 , Issue.4 , pp. 381-392
- Benoît, C.¹ Grice, M.² Hazan, V.³

3
- 84966398940
- Optimising selection of units from speech database for concatenative synthesis
- Black, A.; Campbell, N.; 1995. Optimising selection of units from speech database for concatenative synthesis. In: Proceedings of EUROSPEECH-95, pp. 581-584.
- (1995) Proceedings of EUROSPEECH-95 , pp. 581-584
- Black, A.¹ Campbell, N.²

4
- 0003802343
- Wadsworth and Brooks Monterey, CA
- L. Breiman, J. Friedman, R. Olshen, and C. Stone Classification and Regression Trees 1984 Wadsworth and Brooks Monterey, CA
- (1984) Classification and Regression Trees
- Breiman, L.¹ Friedman, J.² Olshen, R.³ Stone, C.⁴

5
- 34347397051
- A parser-based text preprocessor for Romanian language TTS synthesis
- Budapest, Hungary
- Burileanu, D.; Dan, C.; Sima, M.; Burileanu, C.; 1999. A parser-based text preprocessor for Romanian language TTS synthesis. In: Proc. EUROSPEECH-99. Budapest, Hungary, pp. 2063-2066.
- (1999) Proc. EUROSPEECH-99 , pp. 2063-2066
- Burileanu, D.¹ Dan, C.² Sima, M.³ Burileanu, C.⁴

6
- 0002629270
- Maximum likelihood from incomplete data via the EM algorithm
- A. Dempster, N. Laird, and D. Rubin Maximum likelihood from incomplete data via the EM algorithm J. Roy. Stat. Soc. Ser. B 39 1 1977 1 38
- (1977) J. Roy. Stat. Soc. Ser. B , vol.39 , Issue.1 , pp. 1-38
- Dempster, A.¹ Laird, N.² Rubin, D.³

7
- 33645743720
- Speech acoustics and phonetics: Selected writings
- Springer Netherlands
- G. Fant Speech acoustics and phonetics: selected writings Chapter Speech Perception 2005 Springer Netherlands 199 220
- (2005) Chapter Speech Perception , pp. 199-220
- Fant, G.¹

8
- 79551480430
- Ph.D. thesis, University of Cluj-Napoca
- Ferencz, A.; 1997. Contributii la dezvoltarea sintezei text-vorbire pentru limba romana. Ph.D. thesis, University of Cluj-Napoca.
- (1997) Contributii la Dezvoltarea Sintezei Text-vorbire Pentru Limba Romana
- Ferencz, A.¹

9
- 79551504685
- A text processing tool for the Romanian language
- Cluj-Napoca
- Frunza, O.; Inkpen, D.; Nadeau, D.; 2005. A text processing tool for the Romanian language. In: Proceedings of EuroLAN 2005: Workshop on Cross-Language Knowledge Induction, Cluj-Napoca.
- (2005) Proceedings of EuroLAN 2005: Workshop on Cross-Language Knowledge Induction
- Frunza, O.¹ Inkpen, D.² Nadeau, D.³

10
- 0029765811
- Unit selection in a concatenative speech synthesis system using a large speech database
- Hunt, A. and Black, A.; 1996. Unit selection in a concatenative speech synthesis system using a large speech database. In: Proceedings of ICASSP-96, pp. 373-376.
- (1996) Proceedings of ICASSP-96 , pp. 373-376
- Hunt, A.¹ Black, A.²

11
- 67650790758
- The Blizzard Challenge 2008
- Brisbane, Australia
- Karaiskos, V.; King, S.; Clark, R.A.J.; Mayo, C.; 2008. The Blizzard Challenge 2008. In: Proceedings of Blizzard Challenge Workshop, Brisbane, Australia.
- (2008) Proceedings of Blizzard Challenge Workshop
- Karaiskos, V.¹ King, S.² Clark, R.A.J.³ Mayo, C.⁴

12
- 0032673049
- Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds
- H. Kawahara, I. Masuda-Katsuse, and A. Cheveigné Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: possible role of a repetitive structure in sounds Speech Comm. 27 1999 187 207
- (1999) Speech Comm. , vol.27 , pp. 187-207
- Kawahara, H.¹ Masuda-Katsuse, I.² Cheveigné, A.³

13
- 84874199000
- Aperiodicity extraction and control using mixed mode excitation and group delay manipulation for a high quality speech analysis, modification and synthesis system STRAIGHT
- Kawahara, H.; Estill, J.; Fujimura, O.; 2001. Aperiodicity extraction and control using mixed mode excitation and group delay manipulation for a high quality speech analysis, modification and synthesis system STRAIGHT. In: 2nd MAVEBA.
- (2001) 2nd MAVEBA
- Kawahara, H.¹ Estill, J.² Fujimura, O.³

14
- 0347707081
- Sampling-frequency considerations in digital audio
- T. Muraoka, Y. Yamada, and M. Yamazaki Sampling-frequency considerations in digital audio J. Audio Eng. Soc. 26 4 1978 252 256
- (1978) J. Audio Eng. Soc. , vol.26 , Issue.4 , pp. 252-256
- Muraoka, T.¹ Yamada, Y.² Yamazaki, M.³

15
- 44949143155
- Maximum likelihood voice conversion based on GMM with STRAIGHT mixed excitation
- Ohtani, Y.; Toda, T.; Saruwatari, H.; Shikano, K.; 2006. Maximum likelihood voice conversion based on GMM with STRAIGHT mixed excitation. In: Proceedings of Interspeech 2006, pp. 2266-2269.
- (2006) Proceedings of Interspeech 2006 , pp. 2266-2269
- Ohtani, Y.¹ Toda, T.² Saruwatari, H.³ Shikano, K.⁴

16
- 0016938506
- Auditory filter shapes derived with noise stimuli
- R. Patterson Auditory filter shapes derived with noise stimuli J. Acous. Soc. Amer. 76 1982 640 654
- (1982) J. Acous. Soc. Amer. , vol.76 , pp. 640-654
- Patterson, R.¹

17
- 0033906251
- MDL-based context-dependent subword modeling for speech recognition
- K. Shinoda, and T. Watanabe MDL-based context-dependent subword modeling for speech recognition J. Acous. Soc. Jpn. (E) 21 2000 79 86
- (2000) J. Acous. Soc. Jpn. (E) , vol.21 , pp. 79-86
- Shinoda, K.¹ Watanabe, T.²

18
- 0001481529
- Bark and ERB bilinear transforms
- J.O. Smith III, and J.S. Abel Bark and ERB bilinear transforms IEEE Trans. Speech Audio Process. 7 6 1999 697 708
- (1999) IEEE Trans. Speech Audio Process. , vol.7 , Issue.6 , pp. 697-708
- Smith III, J.O.¹ Abel, J.S.²

19
- 0001310760
- Spectral estimation of speech based on mel-cepstral representation
- K. Tokuda, T. Kobayashi, T. Fukada, H. Saito, and S. Imai Spectral estimation of speech based on mel-cepstral representation IEICE Trans. Fundam. J74-A 8 1991 1240 1248 (in Japanese)
- (1991) IEICE Trans. Fundam. , vol.74 , Issue.8 , pp. 1240-1248
- Tokuda, K.¹ Kobayashi, T.² Fukada, T.³ Saito, H.⁴ Imai, S.⁵

20
- 85131821539
- Mel-generalized cepstral analysis - A unified approach to speech spectral estimation
- Yokohama, Japan
- Tokuda, K.; Kobayashi, T.; Masuko, T.; Imai, S.; 1994a. Mel-generalized cepstral analysis - a unified approach to speech spectral estimation. In: Proceedings of ICSLP-94, Yokohama, Japan, pp. 1043-1046.
- (1994) Proceedings of ICSLP-94 , pp. 1043-1046
- Tokuda, K.¹ Kobayashi, T.² Masuko, T.³ Imai, S.⁴

21
- 78049412356
- Recursive calculation of mel-cepstrum from LP coefficients
- Tokuda, K.; Kobayashi, T.; Imai, S.; 1994b. Recursive calculation of mel-cepstrum from LP coefficients. In: Technical Report of Nagoya Institute of Technology.
- (1994) Technical Report of Nagoya Institute of Technology
- Tokuda, K.¹ Kobayashi, T.² Imai, S.³

22
- 0036522887
- Multi-space probability distribution HMM
- K. Tokuda, T. Masuko, N. Miyazaki, and T. Kobayashi Multi-space probability distribution HMM IEICE Trans. Inf. & Syst. E85-D 3 2002 455 464
- (2002) IEICE Trans. Inf. & Syst. , vol.85 , Issue.3 , pp. 455-464
- Tokuda, K.¹ Masuko, T.² Miyazaki, N.³ Kobayashi, T.⁴

23
- 78049403515
- Simple methods for improving speaker-similarity of HMM-based speech synthesis
- Dallas, TX
- Yamagishi, J.; King, S.; 2010. Simple methods for improving speaker-similarity of HMM-based speech synthesis. In: Proceedings of ICASSP 2010. Dallas, TX, pp. 4610-4613.
- (2010) Proceedings of ICASSP 2010 , pp. 4610-4613
- Yamagishi, J.¹ King, S.²

24
- 70449126171
- The HTS-2008 system: Yet another evaluation of the speaker-adaptive HMM-based speech synthesis system in the 2008 Blizzard Challenge
- Brisbane, Australia
- Yamagishi, J.; Zen, H.; Wu, Y.-J.; Toda, T.; Tokuda, K.; 2008a. The HTS-2008 system: Yet another evaluation of the speaker-adaptive HMM-based speech synthesis system in the 2008 Blizzard Challenge. In: Proceedings of Blizzard Challenge 2008, Brisbane, Australia.
- (2008) Proceedings of Blizzard Challenge 2008
- Yamagishi, J.¹ Zen, H.² Wu, Y.-J.³ Toda, T.⁴ Tokuda, K.⁵

25
- 84867223798
- Robustness of HMM-based speech synthesis
- Brisbane, Australia
- Yamagishi, J.; Ling, Z.; King, S.; 2008b. Robustness of HMM-based speech synthesis. In: Proceedings of Interspeech 2008. Brisbane, Australia, pp. 581-584.
- (2008) Proceedings of Interspeech 2008 , pp. 581-584
- Yamagishi, J.¹ Ling, Z.² King, S.³

26
- 0002144369
- Tree-based state tying for high accuracy acoustic modeling
- Young, S.; Odell, J.; Woodland, P.; 1994. Tree-based state tying for high accuracy acoustic modeling. In: Proceedings of ARPA Human Language Technology Workshop, pp. 307-312.
- (1994) Proceedings of ARPA Human Language Technology Workshop , pp. 307-312
- Young, S.¹ Odell, J.² Woodland, P.³

27
- 33846405723
- Details of Nitech HMM-based speech synthesis system for the Blizzard Challenge 2005
- H. Zen, T. Toda, M. Nakamura, and K. Tokuda Details of Nitech HMM-based speech synthesis system for the Blizzard Challenge 2005 IEICE Trans. Inf. & Syst. E90-D 1 2007 325 333
- (2007) IEICE Trans. Inf. & Syst. , vol.90 , Issue.1 , pp. 325-333
- Zen, H.¹ Toda, T.² Nakamura, M.³ Tokuda, K.⁴

28
- 44449177634
- A hidden semi-Markov model-based speech synthesis system
- H. Zen, K. Tokuda, T. Masuko, T. Kobayashi, and T. Kitamura A hidden semi-Markov model-based speech synthesis system IEICE Trans. Inf. & Syst. E90-D 5 2007 825 834
- (2007) IEICE Trans. Inf. & Syst. , vol.90 , Issue.5 , pp. 825-834
- Zen, H.¹ Tokuda, K.² Masuko, T.³ Kobayashi, T.⁴ Kitamura, T.⁵

29
- 85133720638
- The HMM-based speech synthesis system (HTS) version 2.0
- Zen, H.; Nose, T.; Yamagishi, J.; Sako, S.; Tokuda, K.; 2007c. The HMM-based speech synthesis system (HTS) version 2.0. In: Proceedings of Sixth ISCA Workshop on Speech Synthesis, pp. 294-299.
- (2007) Proceedings of Sixth ISCA Workshop on Speech Synthesis , pp. 294-299
- Zen, H.¹ Nose, T.² Yamagishi, J.³ Sako, S.⁴ Tokuda, K.⁵

30
- 67651002140
- Statistical parametric speech synthesis
- H. Zen, K. Tokuda, and A.W. Black Statistical parametric speech synthesis Speech Comm. 51 11 2009 1039 1064
- (2009) Speech Comm. , vol.51 , Issue.11 , pp. 1039-1064
- Zen, H.¹ Tokuda, K.² Black, A.W.³

31
- 0002648826
- A model of loudness summation
- E. Zwicker, and B. Scharf A model of loudness summation Psych. Rev. 72 1965 2 26
- (1965) Psych. Rev. , vol.72 , pp. 2-26
- Zwicker, E.¹ Scharf, B.²

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.