Volume , Issue , 2010, Pages 71-91

Advances in computer speech synthesis and implications for assistive technology

Author keywords

[No Author keywords available]

EID: 84899214271     PISSN: None     EISSN: None     Source Type: Book    
DOI: 10.4018/978-1-61520-725-1.ch005     Document Type: Chapter
Times cited : (7)

References (48)
  • 1
    • 0015112070
    • Speech analysis and synthesis by linear prediction of speech wave
    • doi:10.1121/1.1912679
    • Atal, B. S., & Hanauer, S. L. (1971). Speech analysis and synthesis by linear prediction of speech wave. The Journal of the Acoustical Society of America, 50(2b), 637-655. doi:10.1121/1.1912679
    • (1971) The Journal of the Acoustical Society of America , vol.50 , Issue.2 , pp. 637-655
    • Atal, B.S.1    Hanauer, S.L.2
  • 2
    • 0030166343
    • The SUS test: A method for the assessment of text-to-speech synthesis intelligibility using semantically unpredictable sentences
    • doi:10.1016/0167-6393(96)00026-X
    • Benoît, C., Grice, M., & Hazan, V. (1996). The SUS test: A method for the assessment of text-to-speech synthesis intelligibility using Semantically Unpredictable Sentences. Speech Communication, 18(4), 381-392. doi:10.1016/0167-6393(96)00026-X
    • (1996) Speech Communication , vol.18 , Issue.4 , pp. 381-392
    • Benoît, C.1    Grice, M.2    Hazan, V.3
  • 3
    • 33745216749
    • The Blizzard Challenge - 2005: Evaluating corpus-based speech synthesis on common datasets
    • Black, A., & Tokuda, K. (2005). The Blizzard Challenge - 2005: Evaluating corpus-based speech synthesis on common datasets. INTERSPEECH-2005, 77-80.
    • (2005) INTERSPEECH-2005 , pp. 77-80
    • Black, A.1    Tokuda, K.2
  • 4
    • 85039153976
    • A biphone constrained concatenation method for diphone synthesis
    • Bunnell, H. T., Hoskins, S. R., & Yarrington, D. M. (1998). A biphone constrained concatenation method for diphone synthesis. SSW3-1998, 171-176.
    • (1998) SSW3-1998 , pp. 171-176
    • Bunnell, H.T.1    Hoskins, S.R.2    Yarrington, D.M.3
  • 5
    • 84899187749
    • Schwa variants in American English
    • Bunnell, H. T., & Lilley, J. (2008). Schwa variants in American English. Proceedings: Interspeech, 2008, 1159-1162.
    • (2008) Proceedings: Interspeech , vol.2008 , pp. 1159-1162
    • Bunnell, H.T.1    Lilley, J.2
  • 7
    • 84941167756
    • Optimal coupling of diphones
    • Conkie, A., & Isard, S. (1994). Optimal coupling of diphones. SSW2-1994, 119-122.
    • (1994) SSW2-1994 , pp. 119-122
    • Conkie, A.1    Isard, S.2
  • 8
    • 0023222647
    • Intelligibility of average talkers in typical listening environments
    • doi:10.1121/1.394512
    • Cox, R. M., Alexander, G. C., & Gilmore, C. (1987). Intelligibility of average talkers in typical listening environments. The Journal of the Acoustical Society of America, 81(5), 1598-1608. doi:10.1121/1.394512
    • (1987) The Journal of the Acoustical Society of America , vol.81 , Issue.5 , pp. 1598-1608
    • Cox, R.M.1    Alexander, G.C.2    Gilmore, C.3
  • 10
    • 0001109477
    • Coarticulation and theories of extrinsic timing
    • Fowler, C. A. (1980). Coarticulation and theories of extrinsic timing. Journal of Phonetics, 8, 113-133.
    • (1980) Journal of Phonetics , vol.8 , pp. 113-133
    • Fowler, C.A.1
  • 11
    • 56849127909
    • The breadth of coarticulatory units in children and adults
    • doi:10.1044/1092-4388(2008/07-0020)
    • Goffman, L., Smith, A., Heisler, L., & Ho, M. (2008). The breadth of coarticulatory units in children and adults. Journal of Speech, Language, and Hearing Research: JSLHR, 51(6), 1424-1437. doi:10.1044/1092-4388(2008/07-0020)
    • (2008) Journal of Speech, Language, and Hearing Research: JSLHR , vol.51 , Issue.6 , pp. 1424-1437
    • Goffman, L.1    Smith, A.2    Heisler, L.3    Ho, M.4
  • 13
    • 0036711819
    • A quasiarticulatory approach to controlling acoustic source parameters in a Klatt-type formant synthesizer using HLsyn
    • doi:10.1121/1.1498851
    • Hanson, H. M., & Stevens, K. N. (2002). A quasiarticulatory approach to controlling acoustic source parameters in a Klatt-type formant synthesizer using HLsyn. The Journal of the Acoustical Society of America, 112(3), 1158-1182. doi:10.1121/1.1498851
    • (2002) The Journal of the Acoustical Society of America , vol.112 , Issue.3 , pp. 1158-1182
    • Hanson, H.M.1    Stevens, K.N.2
  • 14
    • 0001074490
    • From phoneme to morpheme
    • doi:10.2307/411036
    • Harris, Z. S. (1955). From phoneme to morpheme. Language, 31(2), 190-222. doi:10.2307/411036
    • (1955) Language , vol.31 , Issue.2 , pp. 190-222
    • Harris, Z.S.1
  • 15
    • 44949180195
    • A nucleus-based timing model applied to multi-dialect speech synthesis by rule
    • Hertz, S. R., & Huffman, M. K. (1992). A nucleus-based timing model applied to multi-dialect speech synthesis by rule. ICSLP-1992, 1171-1174.
    • (1992) ICSLP-1992 , pp. 1171-1174
    • Hertz, S.R.1    Huffman, M.K.2
  • 16
    • 84899161284
    • Research on speech synthesis carried out during a visit to the Royal Institute of Technology, Stockholm, from November 1960 to March 1961
    • Holmes, J. N. (1961). Research on Speech Synthesis Carried out during a Visit to the Royal Institute of Technology, Stockholm, from November 1960 to March 1961. Joint Speech Research Unit Report JU 11.4, British Post Office, Eastcote, England.
    • (1961) Joint Speech Research Unit Report JU 11.4, British Post Office, Eastcote, England
    • Holmes, J.N.1
  • 17
    • 0015699693
    • The influence of the glottal waveform on the naturalness of speech from a parallel formant synthesizer
    • Holmes, J. N. (1973). The influence of the glottal waveform on the naturalness of speech from a parallel formant synthesizer. IEEE Trans., AU-21, 298-305.
    • (1973) IEEE Trans., AU-21 , pp. 298-305
    • Holmes, J.N.1
  • 18
    • 72249121867
    • VocaliD: Personalizing text-to-speech synthesis for individuals with severe speech impairment
    • Jreige, C., Patel, R., & Bunnell, H. T. (2009). VocaliD: Personalizing Text-to-Speech Synthesis for Individuals with Severe Speech Impairment. In Proceedings of ASSETS 2009.
    • (2009) Proceedings of ASSETS 2009
    • Jreige, C.1    Patel, R.2    Bunnell, H.T.3
  • 19
    • 0018986665
    • Software for a cascade/parallel formant synthesizer
    • doi:10.1121/1.383940
    • Klatt, D. H. (1980). Software for a cascade/parallel formant synthesizer. The Journal of the Acoustical Society of America, 67(3), 971-995. doi:10.1121/1.383940
    • (1980) The Journal of the Acoustical Society of America , vol.67 , Issue.3 , pp. 971-995
    • Klatt, D.H.1
  • 20
    • 0023407575
    • Review of text-to-speech conversion for English
    • doi:10.1121/1.395275
    • Klatt, D. H. (1987). Review of text-to-speech conversion for English. The Journal of the Acoustical Society of America, 82(3), 737-793. doi:10.1121/1.395275
    • (1987) The Journal of the Acoustical Society of America , vol.82 , Issue.3 , pp. 737-793
    • Klatt, D.H.1
  • 21
    • 0026206653
    • Comparing discrimination and recognition of unfamiliar voices
    • doi:10.1016/0167-6393(91)90016-M
    • Kreiman, J., & Papcun, G. (1991). Comparing discrimination and recognition of unfamiliar voices. Speech Communication, 10(3), 265-275. doi:10.1016/0167-6393(91)90016-M
    • (1991) Speech Communication , vol.10 , Issue.3 , pp. 265-275
    • Kreiman, J.1    Papcun, G.2
  • 22
    • 0015404068
    • On the perception of coarticulation effects in English VCV syllables
    • Lehiste, I., & Shockey, L. (1972). On the perception of coarticulation effects in English VCV syllables. Journal of Speech and Hearing Research, 15(3), 500-506.
    • (1972) Journal of Speech and Hearing Research , vol.15 , Issue.3 , pp. 500-506
    • Lehiste, I.1    Shockey, L.2
  • 23
    • 0024344665
    • Segmental intelligibility of synthetic speech produced by rule
    • doi:10.1121/1.398236
    • Logan, J. S., Greene, B. G., & Pisoni, D. B. (1989). Segmental intelligibility of synthetic speech produced by rule. The Journal of the Acoustical Society of America, 86(2), 566-581. doi:10.1121/1.398236
    • (1989) The Journal of the Acoustical Society of America , vol.86 , Issue.2 , pp. 566-581
    • Logan, J.S.1    Greene, B.G.2    Pisoni, D.B.3
  • 25
    • 0019531333
    • Perception of anticipatory coarticulation effects
    • doi:10.1121/1.385484
    • Martin, J. G., & Bunnell, H. T. (1981). Perception of anticipatory coarticulation effects. The Journal of the Acoustical Society of America, 69(2), 559-567. doi:10.1121/1.385484
    • (1981) The Journal of the Acoustical Society of America , vol.69 , Issue.2 , pp. 559-567
    • Martin, J.G.1    Bunnell, H.T.2
  • 26
    • 0020145847
    • Perception of anticipatory coarticulation effects in vowel-stop consonant-vowel sequences
    • doi:10.1037/0096-1523.8.3.473
    • Martin, J. G., & Bunnell, H. T. (1982). Perception of anticipatory coarticulation effects in vowel-stop consonant-vowel sequences. Journal of Experimental Psychology. Human Perception and Performance, 8(3), 473-488. doi:10.1037/0096-1523.8.3.473
    • (1982) Journal of Experimental Psychology. Human Perception and Performance , vol.8 , Issue.3 , pp. 473-488
    • Martin, J.G.1    Bunnell, H.T.2
  • 27
    • 0015613574
    • Articulatory model for the study of speech production
    • doi:10.1121/1.1913427
    • Mermelstein, P. (1973). Articulatory model for the study of speech production. The Journal of the Acoustical Society of America, 53(4), 1070-1082. doi:10.1121/1.1913427
    • (1973) The Journal of the Acoustical Society of America , vol.53 , Issue.4 , pp. 1070-1082
    • Mermelstein, P.1
  • 28
    • 0025543906
    • Pitch-synchronous wave-form processing techniques for text-to-speech synthesis using diphones
    • doi:10.1016/0167-6393(90)90021-Z
    • Moulines, E., & Charpentier, F. (1990). Pitch-synchronous wave-form processing techniques for Text-to-Speech synthesis using diphones. Speech Communication, 9(5-6), 453-467. doi:10.1016/0167-6393(90)90021-Z
    • (1990) Speech Communication , vol.9 , Issue.5-6 , pp. 453-467
    • Moulines, E.1    Charpentier, F.2
  • 29
    • 0026660215
    • The influence of talker differences on vowel identification by normal-hearing and hearing-impaired listeners
    • doi:10.1121/1.403973
    • Nabelek, A. K., Czyzewski, Z., & Krishnan, L. A. (1992). The influence of talker differences on vowel identification by normal-hearing and hearing-impaired listeners. The Journal of the Acoustical Society of America, 92(3), 1228-1246. doi:10.1121/1.403973
    • (1992) The Journal of the Acoustical Society of America , vol.92 , Issue.3 , pp. 1228-1246
    • Nabelek, A.K.1    Czyzewski, Z.2    Krishnan, L.A.3
  • 31
    • 0013871855
    • Coarticulation in VCV utterances: Spectrographic measurements
    • doi:10.1121/1.1909864
    • Öhman, S. E. G. (1966). Coarticulation in VCV Utterances: Spectrographic Measurements. The Journal of the Acoustical Society of America, 39(1), 151-168. doi:10.1121/1.1909864
    • (1966) The Journal of the Acoustical Society of America , vol.39 , Issue.1 , pp. 151-168
    • Öhman, S.E.G.1
  • 34
    • 84878395202
    • Data-driven approach to rapid prototyping Xhosa speech synthesis
    • Roux, J. C., & Visagie, A. S. (2007). Data-driven approach to rapid prototyping Xhosa speech synthesis. SSW6-2007, 143-147.
    • (2007) SSW6-2007 , pp. 143-147
    • Roux, J.C.1    Visagie, A.S.2
  • 35
    • 0023756465
    • Speech synthesis by rule using an optimal selection of non-uniform synthesis units
    • Sagisaka, Y. (1988). Speech synthesis by rule using an optimal selection of non-uniform synthesis units. IEEE ICASSP1988, 679-682.
    • (1988) IEEE ICASSP1988 , pp. 679-682
    • Sagisaka, Y.1
  • 37
    • 84964193368
    • Segment inventories for speech synthesis
    • Sivertsen, E. (1961). Segment inventories for speech synthesis. Language and Speech, 4(1), 27-90.
    • (1961) Language and Speech , vol.4 , Issue.1 , pp. 27-90
    • Sivertsen, E.1
  • 38
    • 84912906590
    • Constraints among parameters simplify control of Klatt formant synthesizer
    • Stevens, K. N., & Bickley, C. A. (1991). Constraints among parameters simplify control of Klatt formant synthesizer. Journal of Phonetics, 19, 161-174.
    • (1991) Journal of Phonetics , vol.19 , pp. 161-174
    • Stevens, K.N.1    Bickley, C.A.2
  • 39
    • 84955022381
    • Development of a quantitative description of vowel articulation
    • doi:10.1121/1.1907943
    • Stevens, K. N., & House, A. S. (1955). Development of a quantitative description of vowel articulation. The Journal of the Acoustical Society of America, 27(3), 484-493. doi:10.1121/1.1907943
    • (1955) The Journal of the Acoustical Society of America , vol.27 , Issue.3 , pp. 484-493
    • Stevens, K.N.1    House, A.S.2
  • 40
    • 0003058857
    • On the basic scheme and algorithms in non-uniform unit speech synthesis
    • In G. Bailly, C. Benoît & T. R. Sawallis (Eds.), Talking machines: Theories, models, and designs. Amsterdam, The Netherlands: North-Holland Publishing Co
    • Takeda, K., Abe, K., & Sagisaka, Y. (1992). On the basic scheme and algorithms in non-uniform unit speech synthesis. In G. Bailly, C. Benoît & T. R. Sawallis (Eds.), Talking machines: Theories, models, and designs (pp. 93-105). Amsterdam, The Netherlands: North-Holland Publishing Co.
    • (1992) Talking Machines: Theories, Models, and Designs , pp. 93-105
    • Takeda, K.1    Abe, K.2    Sagisaka, Y.3
  • 41
    • 6344264628
    • Deriving text-to-speech durations from natural speech
    • In G. Bailly, C. Benoît & T. R. Sawallis (Eds.), Talking machines: Theories, models, and designs. Amsterdam, The Netherlands: North-Holland Publishing Co
    • van Santen, J. P. H. (1992). Deriving text-to-speech durations from natural speech. In G. Bailly, C. Benoît & T. R. Sawallis (Eds.), Talking machines: Theories, models, and designs (pp. 275-285). Amsterdam, The Netherlands: North-Holland Publishing Co.
    • (1992) Talking Machines: Theories, Models, and Designs , pp. 275-285
    • van Santen, J.P.H.1
  • 45
    • 0002489485
    • Context-sensitive coding, associative memory, and serial order in (speech) behavior
    • doi:10.1037/h0026823
    • Wickelgren, W. A. (1969). Context-sensitive coding, associative memory, and serial order in (speech) behavior. Psychological Review, 76, 1-15. doi:10.1037/h0026823
    • (1969) Psychological Review , vol.76 , pp. 1-15
    • Wickelgren, W.A.1
  • 46
    • 0348153016
    • Robust automatic extraction of diphones with variable boundaries
    • Yarrington, D., Bunnell, H. T., & Ball, G. (1995). Robust automatic extraction of diphones with variable boundaries. EUROSPEECH, 95, 1845-1848.
    • (1995) EUROSPEECH , vol.95 , pp. 1845-1848
    • Yarrington, D.1    Bunnell, H.T.2    Ball, G.3
  • 47
    • 67651002140
    • Statistical parametric speech synthesis
    • doi:10.1016/j.specom.2009.04.004
    • Zen, H., Tokuda, K., & Black, A. W. (2009). Statistical parametric speech synthesis. Speech Communication, 51(11), 1039-1064. doi:10.1016/j.specom.2009.04.004
    • (2009) Speech Communication , vol.51 , Issue.11 , pp. 1039-1064
    • Zen, H.1    Tokuda, K.2    Black, A.W.3
  • 48
    • 33749573927
    • Reformulating the HMM as a trajectory model by imposing explicit relationships between static and dynamic feature vector sequences
    • doi:10.1016/j.csl.2006.01.002
    • Zen, H., Tokuda, K., & Kitamura, T. (2007). Reformulating the HMM as a trajectory model by imposing explicit relationships between static and dynamic feature vector sequences. Computer Speech & Language, 21(1), 153-173. doi:10.1016/j.csl.2006.01.002
    • (2007) Computer Speech & Language , vol.21 , Issue.1 , pp. 153-173
    • Zen, H.1    Tokuda, K.2    Kitamura, T.3


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.