Volume 48, Issue 9, 2006, Pages 1057-1078

Tone-Group F0 selection for modeling focus prominence in small-footprint speech synthesis

Author keywords

Intonation and emphasis in speech synthesis; Text to speech synthesis; Tone Group unit selection

Indexed keywords

DATABASE SYSTEMS; HUMAN ENGINEERING; MATHEMATICAL MODELS; OPTIMIZATION; REGRESSION ANALYSIS; SPEECH ANALYSIS;

EID: 33746431355     PISSN: 01676393     EISSN: None     Source Type: Journal    
DOI: 10.1016/j.specom.2006.02.002     Document Type: Article
Times cited : (5)

References (62)
  • 1
    • 0029342671
    • Automatic pitch contour stylization using a model of tonal perception
    • d'Alessandro C., and Mertens P. Automatic pitch contour stylization using a model of tonal perception. Comput. Speech Language 9 (1995) 257-288
    • (1995) Comput. Speech Language , vol.9 , pp. 257-288
    • d'Alessandro, C.1    Mertens, P.2
  • 2
    • 33646268734
    • Intonational analysis and prosodic annotation of Greek spoken corpora
    • Sun-Ah Jun (Ed), Oxford University Press
    • Arvaniti A., and Baltazani M. Intonational analysis and prosodic annotation of Greek spoken corpora. In: Sun-Ah Jun (Ed). Prosodic Typology: The Phonology of Intonation and Phrasing (2005), Oxford University Press 84-117
    • (2005) Prosodic Typology: The Phonology of Intonation and Phrasing , pp. 84-117
    • Arvaniti, A.1    Baltazani, M.2
  • 3
    • 0031627266
    • Stability of tonal alignment: the case of Greek prenuclear accents
    • Arvaniti A., Ladd D.R., and Mennen I. Stability of tonal alignment: the case of Greek prenuclear accents. J. Phonetics 26 (1998) 3-25
    • (1998) J. Phonetics , vol.26 , pp. 3-25
    • Arvaniti, A.1    Ladd, D.R.2    Mennen, I.3
  • 4
    • 33746471050
    • Aulanko, R., 1985. Microprosodic features in speech: experiments on Finnish. In: Aaltonen, O., Hulkko, T. (Eds.), Fonetiikan Päivät Turku 1985, Publications of the Department of Finnish and General Linguistics of the University of Turku, pp. 33-54.
  • 5
    • 21844440585
    • SFC: A trainable prosodic model
    • Bailly G., and Holm B. SFC: A trainable prosodic model. Speech Comm. 46 (2005) 364-384
    • (2005) Speech Comm. , vol.46 , pp. 364-384
    • Bailly, G.1    Holm, B.2
  • 6
    • 33746447512
    • Beutnagel, M., Conkie, A., Schroeter, J., Stylianou, Y., Syrdal, A., 1999. The AT&T Next-Gen TTS system. In: Proc. Joint Meeting of ASA, EAA and DAGA, Berlin, Germany, pp. 18-24.
  • 7
    • 85006631929
    • Black, A.W., 2003. Unit Selection and Emotional Speech. In: Proc. EUROSPEECH-2003, Geneva, Switzerland, pp. 1649-1652.
  • 8
    • 0030355540
    • F0 contours from the ToBI labels using linear regression. In: Proc. ICSLP-96, Philadelphia, USA, Vol. 3, pp. 1385-1388.
  • 9
    • 84966301419
    • Black, A.W., Lenzo, K.A., 2000a. Limited domain synthesis. In: Proc. ICSLP-2000, Beijing, China, Vol. 2, pp. 411-414.
  • 10
    • 33746443185
    • Black, A.W., Lenzo, K.A., 2000b. Building voices in the Festival speech synthesis System. Available from: .
  • 11
    • 33746415062
    • Black, A.W., Lenzo, K.A., 2001. Flite: a small fast run-time synthesis engine. In: Proc. SSW4 - 4th ISCA Workshop on Speech Synthesis, pp. 204-207.
  • 12
    • 0142153902
    • Optimal utterance selection for unit selection speech synthesis databases
    • Black A.W., and Lenzo K. Optimal utterance selection for unit selection speech synthesis databases. Internat. J. Speech Technol. 6 4 (2003) 357-363
    • (2003) Internat. J. Speech Technol. , vol.6 , Issue.4 , pp. 357-363
    • Black, A.W.1    Lenzo, K.2
  • 13
    • 33746444147
    • Black, A.W., Taylor, P., Caley, R., 1998. The FESTIVAL speech synthesis system. Available from: .
  • 14
    • 85143191594
    • Bulyko, I., Ostendorf, M., 2001. Joint prosody prediction and unit selection for concatenative speech synthesis. In: Proc. ICASSP-2001, Vol. 2, pp. 781-784.
  • 16
    • 33746459629
    • Campbell, N., 1994. Prosody and the selection of units for concatenation synthesis. In: Proc. SSW2 - 2nd ESCA/IEEE Workshop on Speech Synthesis, NY, USA, pp. 61-64.
  • 17
    • 24144437793
    • Developments in corpus-based speech synthesis: approaching natural conversational speech
    • Campbell N. Developments in corpus-based speech synthesis: approaching natural conversational speech. IEICE Trans. Inf. Syst. E88-D 3 (2005) 376-383
    • (2005) IEICE Trans. Inf. Syst. , vol.E88-D , Issue.3 , pp. 376-383
    • Campbell, N.1
  • 18
    • 33746458126
    • Clark, R., 2003. Generating synthetic pitch contours using prosodic structure. Ph.D. Dissertation, University of Edinburgh.
  • 19
    • 33746390000
    • Conkie, A., Isard, S., 1994. Optimal coupling of diphones. In: Proc. SSW2 - 2nd ESCA/IEEE Workshop on Speech Synthesis, NY, USA, pp. 119-122.
  • 20
    • 33746423764
    • Donovan, R., Woodland, P., 1995. Improvements in a HMM-based speech synthesizer. In: Proc. EUROSPEECH-95, Madrid, Spain, Vol. 1, pp. 573-576.
  • 21
    • 33746408436
    • F0 contours for speech synthesis using the tilt intonation theory. In: Botinis, A., Kouroupetroglou, G., Carayiannis, G. (Eds.), Intonation: Theory, Models and Applications. Proc. ESCA Workshop, Athens, pp. 107-110.
  • 23
    • 0030355972
    • Dutoit, T., Pagel, V., Pierret, N., Bataille, F., Van Der Vreken, O., 1996. The MBROLA Project: towards a set of high-quality speech synthesizers free of use for non-commercial purposes. In: Proc. ICSLP-96, Philadelphia, Vol. 3, pp. 1393-1396.
  • 24
    • 33746445740
    • Eide, E., Aaron, A., Bakis, R., Hamza, W., Picheny, M., Pitrelli, J., 2003. A corpus-based approach to expressive speech synthesis. In: Proc. SSW5 - 5th ISCA ITRW on Speech Synthesis, Pittsburgh, PA, USA, pp. 79-84.
  • 25
    • 0032618167
    • Acoustic characteristics of Greek vowels
    • Fourakis M., Botinis A., and Katsaiti M. Acoustic characteristics of Greek vowels. Phonetica 56 1-2 (1999) 28-43
    • (1999) Phonetica , vol.56 , Issue.1-2 , pp. 28-43
    • Fourakis, M.1    Botinis, A.2    Katsaiti, M.3
  • 26
    • 0037380186
    • The role of voice quality in communicating emotion, mood and attitude
    • Gobl C., and Chasaide A.N. The role of voice quality in communicating emotion, mood and attitude. Speech Comm. 40 1-2 (2003) 189-212
    • (2003) Speech Comm. , vol.40 , Issue.1-2 , pp. 189-212
    • Gobl, C.1    Chasaide, A.N.2
  • 28
    • 33746410399
    • Hitzeman, J., Black, A.W., Mellish, C., Oberlander, J., Poesio, M., Taylor, P., 1999. An annotation scheme for concept-to-speech synthesis. In: Proc. 7th European Workshop on Natural Language Generation, Toulouse, France, pp. 59-66.
  • 29
    • 33746448144
    • Huang, X., Acero, A., Adcock, J., Hon, H., Goldsmith, J., Liu, J., Plumpe, M., 1996. Whistler: a trainable text-to-speech system. In: Proc. ICSLP-96, Philadelphia, PA, pp. 659-662.
  • 30
    • 0029765811
    • Hunt, A., Black, A.W., 1996. Unit selection in a concatenative speech synthesis system using a large speech database. In: Proc. ICASSP-96, Vol. 1, pp. 373-376.
  • 31
    • 85050787615
    • How much prosody can you learn from twenty utterances?
    • Keller E., and Keller B.Z. How much prosody can you learn from twenty utterances? Linguistik Online 17 5 (2003) 57-79
    • (2003) Linguistik Online , vol.17 , Issue.5 , pp. 57-79
    • Keller, E.1    Keller, B.Z.2
  • 32
    • 85009179208
    • Kishore, S.P., Black, A.W., 2003. Unit size in unit selection speech synthesis. In: Proc. EUROSPEECH-2003, Geneva, Switzerland, pp. 1317-1320.
  • 33
    • 84928453855
    • Intonational phrasing: the case for recursive prosodic structure
    • Ladd D.R. Intonational phrasing: the case for recursive prosodic structure. Phonology 3 (1986) 311-340
    • (1986) Phonology , vol.3 , pp. 311-340
    • Ladd, D.R.1
  • 35
    • 33746429364
    • Malfrere, F., Dutoit, T., Mertens, P., 1998. Automatic prosody generation using supra-segmental unit selection. In: SSW3 - 3rd ESCA/COCOSDA Workshop on Speech Synthesis, Blue Mountains, Australia, pp. 323-328.
  • 36
    • 33746392201
    • Meron, J., 2001. Prosodic unit selection using an imitation speech database. In: Proc. SSW4 - 4th ISCA ITRW on Speech Synthesis, Perthshire, Scotland, 113.
  • 37
    • 33746385244
    • Monaghan, A.I.C., 1992. Extracting microprosodic information from diphones - a simple way to model segmental effects on prosody for synthetic speech. In: ICSLP-1992, Banff, Canada, pp. 1159-1162.
  • 38
    • 0025543906
    • Pitch synchronous waveform processing techniques for text-to-speech synthesis using diphones
    • Moulines E., and Charpentier F. Pitch synchronous waveform processing techniques for text-to-speech synthesis using diphones. Speech Comm. 9 5/6 (1990) 453-467
    • (1990) Speech Comm. , vol.9 , Issue.5-6 , pp. 453-467
    • Moulines, E.1    Charpentier, F.2
  • 39
    • 33746462569
    • Mozziconacci, S.J., 2000. The expression of emotion considered in the framework of an intonation model. In: Proc. ISCA/ITRW on Speech and Emotion, Belfast, Northern Ireland, pp. 45-52.
  • 40
    • 33746437840
    • Mozziconacci, S., Hermes, D.J., 1999. Role of Intonation Patterns in Conveying Emotion in Speech. In: Proc. Internat. Conf. of Phonetic Sciences, pp. 2001-2004.
  • 42
    • 33746411331
    • Pierrehumbert, J.B., 1980. The Phonology and Phonetics of English Intonation. Ph.D. Dissertation, MIT.
  • 43
    • 44849128406
    • Pitrelli, J.F., Eide, E.M., 2003. Expressive speech synthesis using American English ToBI: questions and contrastive emphasis. In: Proc. IEEE ASRU-2003, pp. 694-699.
  • 44
    • 33746449039
    • Quazza, S., Donetti, L., Moisa, L., Salza, P.L., 2001. ACTOR: A multilingual unit-selection speech synthesis system. In: Proc. SSW4 - 4th ISCA ITRW on Speech Synthesis, Perthshire, Scotland, paper 209.
  • 45
    • 84946736935
    • F0 modeling and its application to emphasis. In: Proc. IEEE ASRU-2003, pp. 700-705.
  • 46
    • 84971539709
    • Schroeder, M., 2001. Emotional speech synthesis: a review. In: Proc. EUROSPEECH-2001, Aalborg, Denmark, Vol. 1, pp. 561-564.
  • 47
    • 33745203492
    • Schweitzer, A., Braunschweiler, N., Klankert, T., Mobius, B., Sauberlich, B., 2003. Restricted unlimited domain synthesis. In: Proc. EUROSPEECH-2003, Geneva, Switzerland, pp. 1321-1324.
  • 48
    • 0010992336
    • On prosodic structure and its relation to syntactic structure
    • Fretheim T. (Ed), TAPIR, Trondheim
    • Selkirk E. On prosodic structure and its relation to syntactic structure. In: Fretheim T. (Ed). Nordic Prosody 2 (1978), TAPIR, Trondheim
    • (1978) Nordic Prosody 2
    • Selkirk, E.1
  • 49
    • 33746418764
    • Selkirk, E., 1986. On derived domains in sentence phonology. In: Phonology Yearbook, Vol. 3, pp. 371-405.
  • 50
    • 33746410398
    • Selkirk, E., 1995. The prosodic structure of function words. University of Massachusetts Occasional Papers 18: Papers in Optimality Theory, pp. 439-469.
  • 51
    • 33746405937
    • Silverman, K., Beckman, M., Pitrelli, J., Ostendorf, M., Wightman, C., Price, P., Pierrehumbert, J., Hirschberg, J., 1992. ToBI: a standard for labeling English prosody. In: Proc. ICSLP-92, pp. 867-870.
  • 53
    • 0034008810
    • Analysis and synthesis of intonation using the Tilt model
    • Taylor P. Analysis and synthesis of intonation using the Tilt model. J. Acoust. Soc. Am. 107 3 (2000) 1697-1714
    • (2000) J. Acoust. Soc. Am. , vol.107 , Issue.3 , pp. 1697-1714
    • Taylor, P.1
  • 54
    • 0035155093
    • Heterogeneous Relation Graphs as a mechanism for representing linguistic information
    • Taylor P., Black A.W., and Caley R. Heterogeneous Relation Graphs as a mechanism for representing linguistic information. Speech Comm. 33 (2001) 153-174
    • (2001) Speech Comm. , vol.33 , pp. 153-174
    • Taylor, P.1    Black, A.W.2    Caley, R.3
  • 55
    • 33746408880
    • Vainio, M., 2001. Artificial neural network based prosody models for Finnish Text-to-Speech synthesis. Ph.D. Thesis, University of Helsinki, Department of Phonetics.
  • 57
    • 33746410879
    • Wightman, C., Syrdal, A., Stemmer, G., Conkie, A., Beutnagel, M., 2000. Perceptually based automatic prosody labeling and prosodically enriched unit selection improve concatenative speech synthesis. In: Proc. ICSLP-2000, Vol. 2, pp. 71-74.
  • 58
    • 0036193374
    • Maximum speed of pitch change and how it may relate to speech
    • Xu Y., and Sun X. Maximum speed of pitch change and how it may relate to speech. J. Acoust. Soc. Am. 111 3 (2002) 1388-1413
    • (2002) J. Acoust. Soc. Am. , vol.111 , Issue.3 , pp. 1388-1413
    • Xu, Y.1    Sun, X.2
  • 59
    • 33746398750
    • Xydas, G., Kouroupetroglou, G., 2001. The DEMOSTHeNES Speech Composer. In: Proc. SSW4 - 4th ISCA ITRW on Speech Synthesis, Perthshire, Scotland, paper 206, pp. 167-172.
  • 60
    • 85009113574
    • F0 samples. In: Proc. ICSLP-2004, Vol. 1, pp. 801-804.
  • 61
    • 24144446449
    • Modeling improved prosody generation from high-level linguistically annotated corpora
    • Xydas G., Spiliotopoulos D., and Kouroupetroglou G. Modeling improved prosody generation from high-level linguistically annotated corpora. IEICE Trans. Inf. Syst. E88-D 3 (2005) 510-518
    • (2005) IEICE Trans. Inf. Syst. , vol.E88-D , Issue.3 , pp. 510-518
    • Xydas, G.1    Spiliotopoulos, D.2    Kouroupetroglou, G.3
  • 62
    • 33746466438
    • Zervas, P., Fakotakis, N., Kokkinakis, G., 2005. Development of a prosodic database for Greek speech synthesis. In: Proc. SPECOM 2005 - 10th International Conference on Speech and Computer, Patras, Greece, Vol. 2, pp. 603-606.


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.