



2008, Pages 413-428

Basic Principles of Speech Synthesis

Author keywords

Dynamic Time Warping; Spectral Envelope; Speech Synthesis; Synthetic Speech; Vocal Tract

EID: 85075916050     PISSN: 25228692     EISSN: 25228706     Source Type: Book Series    
DOI: 10.1007/978-3-540-49127-9_19     Document Type: Chapter
Times cited: 12

References (44)
  • 1
    • 0003757962
    • Springer, Berlin, Heidelberg
    • J.L. Flanagan: Speech Analysis, Synthesis and Perception (Springer, Berlin, Heidelberg 1972) pp. 204–210, http://www.haskins.yale.edu/featured/heads/SIMULACRA/kempelen.html
    • (1972) Speech Analysis, Synthesis and Perception , pp. 204-210
    • Flanagan, J.L.1
  • 2
    • 0042675518
    • A synthetic speaker
    • H. Dudley, R.R. Riesz, S.A. Watkins: A synthetic speaker, J. Franklin Inst. 227, 739–764 (1939), http://www.bell-labs.com/org/1133/Heritage/Vocoder/
    • (1939) J. Franklin Inst. , vol.227 , pp. 739-764
    • Dudley, H.1    Riesz, R.R.2    Watkins, S.A.3
  • 3
    • 85075941746
    • W3C Standard Generalized Markup Language: http://www.w3.org/MarkUp/SGML/
  • 4
    • 85075926801
    • W3C Speech Synthesis Markup Language Version 1.0: http://www.w3.org/TR/2003/CR-speech-synthesis-20031218/ (http://www.xml.com/pub/a/2004/10/20/ssml.html)
  • 6
    • 0016952322
    • Linguistic use of segmental duration in English: Acoustic and perceptual evidence
    • D.H. Klatt: Linguistic use of segmental duration in English: Acoustic and perceptual evidence, J. Acoust. Soc. Am. 59, 1208–1221 (1976)
    • (1976) J. Acoust. Soc. Am. , vol.59 , pp. 1208-1221
    • Klatt, D.H.1
  • 7
    • 33745196452
    • Exemplar-based production of prosody: Evidence from segment and syllable durations
    • ed. by B. Bel, I. Marlien (ISCA, Grenoble)
    • A. Schweitzer, B. Moebius: Exemplar-based production of prosody: Evidence from segment and syllable durations, Proc. Speech Prosody 2004 (Nara), ed. by B. Bel, I. Marlien (ISCA, Grenoble 2004)
    • (2004) Proc. Speech Prosody 2004 (Nara)
    • Schweitzer, A.1    Moebius, B.2
  • 9
    • 0742283611
    • The role of quantitative modeling in the study of intonation
    • H. Fujisaki: The role of quantitative modeling in the study of intonation, Proc. Int. Symp. Japanese Prosody (1992) pp. 163–174
    • (1992) Proc. Int. Symp. Japanese Prosody , pp. 163-174
    • Fujisaki, H.1
  • 11
    • 0025316435
    • A three-dimensional model of tongue movement based on ultrasound and x-ray microbeam data
    • M. Stone: A three-dimensional model of tongue movement based on ultrasound and x-ray microbeam data, J. Acoust. Soc. Am. 87, 2207–2217 (1990)
    • (1990) J. Acoust. Soc. Am. , vol.87 , pp. 2207-2217
    • Stone, M.1
  • 12
    • 0025739174
    • Analysis of vocal tract shape and dimensions using magnetic resonance imaging: Vowels
    • T. Baer, J.C. Gore, L.C. Gracco, P.W. Nye: Analysis of vocal tract shape and dimensions using magnetic resonance imaging: Vowels, J. Acoust. Soc. Am. 90, 799–828 (1991)
    • (1991) J. Acoust. Soc. Am. , vol.90 , pp. 799-828
    • Baer, T.1    Gore, J.C.2    Gracco, L.C.3    Nye, P.W.4
  • 13
    • 0016940126
    • A model of articulatory dynamics and control
    • C.H. Coker: A model of articulatory dynamics and control, Proc. IEEE 64, 452–459 (1976)
    • (1976) Proc. IEEE , vol.64 , pp. 452-459
    • Coker, C.H.1
  • 14
    • 77956779481
    • A dynamical approach to gestural patterning in speech production
    • E.L. Saltzman, K.G. Munhall: A dynamical approach to gestural patterning in speech production, Ecol. Psychol. 1(4), 333–382 (1989)
    • (1989) Ecol. Psychol. , vol.1 , Issue.4 , pp. 333-382
    • Saltzman, E.L.1    Munhall, K.G.2
  • 15
    • 0003515694
    • Speech coding based on physiological models of speech production
    • ed. by S. Furui, M.M. Sondhi (Marcel Dekker, New York)
    • J. Schroeter, M.M. Sondhi: Speech coding based on physiological models of speech production. In: Advances in Speech Signal Processing, ed. by S. Furui, M.M. Sondhi (Marcel Dekker, New York 1991) pp. 231–268
    • (1991) Advances in Speech Signal Processing , pp. 231-268
    • Schroeter, J.1    Sondhi, M.M.2
  • 17
    • 0018986665
    • Software for a cascade/parallel formant synthesizer
    • D.H. Klatt: Software for a cascade/parallel formant synthesizer, J. Acoust. Soc. Am. 67, 971–995 (1980)
    • (1980) J. Acoust. Soc. Am. , vol.67 , pp. 971-995
    • Klatt, D.H.1
  • 18
    • 0036711819
    • A quasiarticulatory approach to controlling acoustic source parameters in a Klatt-type formant synthesizer using HLsyn
    • H.M. Hanson, K.N. Stevens: A quasiarticulatory approach to controlling acoustic source parameters in a Klatt-type formant synthesizer using HLsyn, J. Acoust. Soc. Am. 112, 1158–1182 (2002)
    • (2002) J. Acoust. Soc. Am. , vol.112 , pp. 1158-1182
    • Hanson, H.M.1    Stevens, K.N.2
  • 19
    • 21844464776
    • Combinatorial issues in text-to-speech synthesis
    • J.P.H. van Santen: Combinatorial issues in text-to-speech synthesis, EuroSpeech ’97 5th European Conference on Speech Communication and Technology 5, 2511–2514 (1997)
    • (1997) EuroSpeech ’97 5th European Conference on Speech Communication and Technology , vol.5 , pp. 2511-2514
    • van Santen, J.P.H.1
  • 20
    • 85068112784
    • Rule synthesis of speech from diadic units
    • J.P. Olive: Rule synthesis of speech from diadic units, Proc. ICASSP 77, 568–570 (1977)
    • (1977) Proc. ICASSP , vol.77 , pp. 568-570
    • Olive, J.P.1
  • 21
    • 0000813409
    • Syllables as concatenative phonetic elements
    • ed. by A. Bell, J.B. Hooper (North-Holland, New York)
    • O. Fujimura, J. Lovins: Syllables as concatenative phonetic elements. In: Syllables and Segments, ed. by A. Bell, J.B. Hooper (North-Holland, New York 1978) pp. 107–120
    • (1978) Syllables and Segments , pp. 107-120
    • Fujimura, O.1    Lovins, J.2
  • 23
    • 0023756465
    • Speech synthesis by rule using an optimal selection of non-uniform synthesis units
    • Y. Sagisaka: Speech synthesis by rule using an optimal selection of non-uniform synthesis units, Proc. ICASSP 88, 679–682 (1988)
    • (1988) Proc. ICASSP , vol.88 , pp. 679-682
    • Sagisaka, Y.1
  • 24
    • 0029765811
    • Unit selection in a concatenative speech synthesis system using a large speech database
    • A. Hunt, A.W. Black: Unit selection in a concatenative speech synthesis system using a large speech database, Proc. ICASSP 96, 373–376 (1996)
    • (1996) Proc. ICASSP 96 , pp. 373-376
    • Hunt, A.1    Black, A.W.2
  • 25
    • 84966398940
    • Optimising selection of units from speech databases for concatenative synthesis
    • A.W. Black, N. Campbell: Optimising selection of units from speech databases for concatenative synthesis, ESCA Eurospeech 95, 581–584 (1995)
    • (1995) ESCA Eurospeech , vol.95 , pp. 581-584
    • Black, A.W.1    Campbell, N.2
  • 29
    • 85133526552
    • Automatically clustering similar units for unit selection in speech synthesis
    • A.W. Black, P. Taylor: Automatically clustering similar units for unit selection in speech synthesis, Proc. Eurospeech 97, 601–604 (1997)
    • (1997) Proc. Eurospeech 97 , pp. 601-604
    • Black, A.W.1    Taylor, P.2
  • 32
    • 0025543906
    • Pitch-synchronous waveform processing techniques for text-to-speech synthesis using diphones
    • E. Moulines, F. Charpentier: Pitch-synchronous waveform processing techniques for text-to-speech synthesis using diphones, Speech Commun. 9(5-6), 453–467 (1990)
    • (1990) Speech Commun , vol.9 , Issue.5-6 , pp. 453-467
    • Moulines, E.1    Charpentier, F.2
  • 33
    • 3643119290
    • Intelligibility as a function of speech coding method for template-based speech synthesis
    • M. Macchi, M.J. Altom, D. Kahn, S. Singhal, M. Spiegel: Intelligibility as a function of speech coding method for template-based speech synthesis, Proc. Eurospeech 93, 893–896 (1993)
    • (1993) Proc. Eurospeech , vol.93 , pp. 893-896
    • Macchi, M.1    Altom, M.J.2    Kahn, D.3    Singhal, S.4    Spiegel, M.5
  • 35
    • 0026830163
    • Shape invariant time-scale and pitch modification of speech
    • T.F. Quatieri, R.J. McAulay: Shape invariant time-scale and pitch modification of speech, IEEE Trans. Signal Process. 40(3), 497–510 (1992)
    • (1992) IEEE Trans. Signal Process. , vol.40 , Issue.3 , pp. 497-510
    • Quatieri, T.F.1    McAulay, R.J.2
  • 36
    • 0035127703
    • Applying the harmonic plus noise model in concatenative speech synthesis
    • Y. Stylianou: Applying the harmonic plus noise model in concatenative speech synthesis, IEEE Trans. Speech Audio Process. 9(1), 21–29 (2001)
    • (2001) IEEE Trans. Speech Audio Process. , vol.9 , Issue.1 , pp. 21-29
    • Stylianou, Y.1
  • 37
    • 0032673049
    • Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds
    • H. Kawahara, I. Masuda-Katsuse, A. de Cheveigné: Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds, Speech Commun. 27(3-4), 187–207 (1999)
    • (1999) Speech Commun , vol.27 , Issue.3-4 , pp. 187-207
    • Kawahara, H.1    Masuda-Katsuse, I.2    de Cheveigné, A.3
  • 38
    • 0023739214
    • Voice conversion through vector quantization
    • M. Abe, S. Nakamura, K. Shikano, H. Kuwahara: Voice conversion through vector quantization, Proc. IEEE ICASSP 88, 655–658 (1990), S14.1
    • (1990) Proc. IEEE ICASSP , vol.88 , pp. 655-658 , S14.1
    • Abe, M.1    Nakamura, S.2    Shikano, K.3    Kuwahara, H.4
  • 40
    • 0031623661
    • Spectral voice conversion for text-to-speech synthesis
    • A. Kain, M. Macon: Spectral voice conversion for text-to-speech synthesis, Proc. IEEE ICASSP 98, 285–288 (1998)
    • (1998) Proc. IEEE ICASSP , vol.98 , pp. 285-288
    • Kain, A.1    Macon, M.2
  • 41
    • 0025475690
    • Comprehensive assessment of the telephone intelligibility of synthesized and natural speech
    • M.F. Spiegel, M.J. Altom, M.J. Macchi: Comprehensive assessment of the telephone intelligibility of synthesized and natural speech, Speech Commun. 9, 279–291 (1990)
    • (1990) Speech Commun , vol.9 , pp. 279-291
    • Spiegel, M.F.1    Altom, M.J.2    Macchi, M.J.3
  • 44
    • 85009279016
    • The reliability of the ITU-T P.85 standard for the evaluation of text-to-speech systems
    • Y.V. Alvarez, M. Huckvale: The reliability of the ITU-T P.85 standard for the evaluation of text-to-speech systems, Proc. ICSLP 2002, 329–332 (2002)
    • (2002) Proc. ICSLP 2002 , pp. 329-332
    • Alvarez, Y.V.1    Huckvale, M.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.