



Volume 53, Issue 3, 2011, Pages 311-326

Listeners' weighting of acoustic cues to synthetic speech naturalness: A multidimensional scaling analysis

Author keywords

Acoustic cue weighting; Evaluation; Multidimensional scaling; Speech perception; Speech synthesis

Indexed keywords

ACOUSTIC CHARACTERISTIC; ACOUSTIC CUES; CUE WEIGHTING; EVALUATION; EVALUATION METHOD; MULTIDIMENSIONAL SCALING; MULTIDIMENSIONAL SCALING ANALYSIS; MULTIDIMENSIONAL SCALING TECHNIQUES; ONE DIMENSION; PAIR-WISE COMPARISON; PERCEPTUAL EVALUATION; SPEECH PERCEPTION; SPEECH SYNTHESIS SYSTEM; SYNTHETIC SPEECH; SYNTHETIC SPEECH NATURALNESS;

EID: 79551495380     PISSN: 01676393     EISSN: None     Source Type: Journal    
DOI: 10.1016/j.specom.2010.10.003     Document Type: Article
Times cited: 31

References (60)
  • 1
    • 0030825818
    • Multidimensional scaling of complex sounds by school-aged children and adults
    • P. Allen, and C. Bond Multidimensional scaling of complex sounds by school-aged children and adults J. Acoust. Soc. Amer. 102 4 1997 2255 2263
    • (1997) J. Acoust. Soc. Amer. , vol.102 , Issue.4 , pp. 2255-2263
    • Allen, P.1    Bond, C.2
  • 2
    • 0036303925
    • Stimulus set effects in the similarity ratings of unfamiliar complex sounds
    • P. Allen, and S. Scollie Stimulus set effects in the similarity ratings of unfamiliar complex sounds J. Acoust. Soc. Amer. 112 1 2002 211 218
    • (2002) J. Acoust. Soc. Amer. , vol.112 , Issue.1 , pp. 211-218
    • Allen, P.1    Scollie, S.2
  • 4
    • 0019536668
    • Perceptual equivalence of acoustic cues in speech and non-speech perception
    • C.T. Best, B. Morrongiello, and R. Robson Perceptual equivalence of acoustic cues in speech and non-speech perception Percept. Psychophys. 29 3 1981 191 211
    • (1981) Percept. Psychophys. , vol.29 , Issue.3 , pp. 191-211
    • Best, C.T.1    Morrongiello, B.2    Robson, R.3
  • 6
    • 0033070303
    • Effects of talker, rate, and amplitude variation on recognition memory for spoken words
    • A.R. Bradlow, L.C. Nygaard, and D.B. Pisoni Effects of talker, rate, and amplitude variation on recognition memory for spoken words Percept. Psychophys. 61 1999 206 219
    • (1999) Percept. Psychophys. , vol.61 , pp. 206-219
    • Bradlow, A.R.1    Nygaard, L.C.2    Pisoni, D.B.3
  • 7
    • 84864626246
    • An evaluation of synthetic speech using the PESQ measure
    • Budapest
    • Cerňak, M.; Rusko, M.; 2005. An evaluation of synthetic speech using the PESQ measure. In: Proc. Forum Acusticum, Budapest, pp. 2725-2728.
    • (2005) Proc. Forum Acusticum , pp. 2725-2728
    • Cerňak, M.1    Rusko, M.2
  • 10
    • 0030770940
    • Identification of multidimensional stimuli containing speech cues and the effects of training
    • L.A. Christensen, and L.E. Humes Identification of multidimensional stimuli containing speech cues and the effects of training J. Acoust. Soc. Amer. 102 4 1997 2297 2310
    • (1997) J. Acoust. Soc. Amer. , vol.102 , Issue.4 , pp. 2297-2310
    • Christensen, L.A.1    Humes, L.E.2
  • 11
    • 79551504711
    • Modelling pitch accents for concept-to-speech synthesis
    • Barcelona, Spain
    • Clark, R.A.J.; 2003. Modelling pitch accents for concept-to-speech synthesis. In: Internat. Congress of Phonetic Sciences, Barcelona, Spain, pp. 1141-1144.
    • (2003) Internat. Congress of Phonetic Sciences , pp. 1141-1144
    • Clark, R.A.J.1
  • 13
    • 34047123652
    • Multisyn: Open-domain unit selection for the Festival speech synthesis system
    • R.A.J. Clark, K. Richmond, and S. King Multisyn: Open-domain unit selection for the Festival speech synthesis system Speech Commun. 49 4 2007 317 330
    • (2007) Speech Commun. , vol.49 , Issue.4 , pp. 317-330
    • Clark, R.A.J.1    Richmond, K.2    King, S.3
  • 14
    • 0000904893
    • Mora or phoneme? Further evidence for language-specific listening
    • A. Cutler, and T. Otake Mora or phoneme? Further evidence for language-specific listening J. Memory Lang. 33 1994 824 844
    • (1994) J. Memory Lang. , vol.33 , pp. 824-844
    • Cutler, A.1    Otake, T.2
  • 16
    • 79551503541
    • Improving instrumental quality prediction performance for the Blizzard Challenge
    • Brisbane, Australia
    • Falk, T.H.; Moeller, S.; Karaiskos, V.; King, S.; 2008. Improving instrumental quality prediction performance for the Blizzard Challenge. In: Proc. Blizzard Workshop, Brisbane, Australia.
    • (2008) Proc. Blizzard Workshop
    • Falk, T.H.1    Moeller, S.2    Karaiskos, V.3    King, S.4
  • 18
    • 49249100734
    • Cue-specific effects of categorization training on the relative weighting of acoustic cues to consonant voicing in English
    • A.L. Francis, N. Kaganovich, and C. Driscoll-Huber Cue-specific effects of categorization training on the relative weighting of acoustic cues to consonant voicing in English J. Acoust. Soc. Amer. 124 2008 1234 1251
    • (2008) J. Acoust. Soc. Amer. , vol.124 , pp. 1234-1251
    • Francis, A.L.1    Kaganovich, N.2    Driscoll-Huber, C.3
  • 20
    • 0027351117
    • Attentional modulation of the phonetic significance of acoustic cues
    • P.C. Gordon, J.L. Eberhardt, and J.G. Rueckl Attentional modulation of the phonetic significance of acoustic cues Cogn. Psychol. 25 1993 1 42
    • (1993) Cogn. Psychol. , vol.25 , pp. 1-42
    • Gordon, P.C.1    Eberhardt, J.L.2    Rueckl, J.G.3
  • 21
    • 0034794134
    • Application of multidimensional scaling to subjective evaluation of coded speech
    • J.L. Hall Application of multidimensional scaling to subjective evaluation of coded speech J. Acoust. Soc. Amer. 110 2001 2167 2182
    • (2001) J. Acoust. Soc. Amer. , vol.110 , pp. 2167-2182
    • Hall, J.L.1
  • 22
    • 0001244975
    • The development of phonemic categorisation in children aged 6-12
    • V. Hazan, and S. Barrett The development of phonemic categorisation in children aged 6-12 J. Phonetics 28 2000 377 396
    • (2000) J. Phonetics , vol.28 , pp. 377-396
    • Hazan, V.1    Barrett, S.2
  • 23
    • 0032090027
    • The effect of cue-enhancement on the intelligibility of nonsense word and sentence materials presented in noise
    • V. Hazan, and A. Simpson The effect of cue-enhancement on the intelligibility of nonsense word and sentence materials presented in noise Speech Commun. 24 1998 211 226
    • (1998) Speech Commun. , vol.24 , pp. 211-226
    • Hazan, V.1    Simpson, A.2
  • 24
    • 33645486754
    • Enhancement techniques to improve the intelligibility of consonants in noise: Speaker and listener effects
    • Sydney, Australia
    • Hazan, V.; Simpson, A.; Huckvale, M.; 1998. Enhancement techniques to improve the intelligibility of consonants in noise: Speaker and listener effects. In: ICSLP, Sydney, Australia, pp. 2163-2167.
    • (1998) ICSLP , pp. 2163-2167
    • Hazan, V.1    Simpson, A.2    Huckvale, M.3
  • 25
    • 84966352089
    • Comparison of subjective evaluation and an objective evaluation metric for prosody in text-to-speech synthesis
    • Jenolan Caves, Blue Mountains, Australia
    • Hirst, D.; Rilliard, A.; Aubergé, V.; 1998. Comparison of subjective evaluation and an objective evaluation metric for prosody in text-to-speech synthesis. In: Proc. ESCA/COCOSDA Workshop on Speech Synthesis'98, Jenolan Caves, Blue Mountains, Australia.
    • (1998) Proc. ESCA/COCOSDA Workshop on Speech Synthesis'98
    • Hirst, D.1    Rilliard, A.2    Aubergé, V.3
  • 26
    • 0012392720
    • A method for subjective performance assessment of the quality of speech output devices
    • ITU-T Recommendation P.85
    • ITU-T Recommendation P.85, 1994. A method for subjective performance assessment of the quality of speech output devices. International Telecommunications Union publication.
    • (1994) International Telecommunications Union Publication
  • 28
    • 27744591319
    • Phonetic training with acoustic cue manipulations: A comparison of methods for teaching English /r/-/l/ to Japanese adults
    • P. Iverson, V. Hazan, and K. Bannister Phonetic training with acoustic cue manipulations: A comparison of methods for teaching English /r/-/l/ to Japanese adults J. Acoust. Soc. Amer. 118 2005 3267 3278
    • (2005) J. Acoust. Soc. Amer. , vol.118 , pp. 3267-3278
    • Iverson, P.1    Hazan, V.2    Bannister, K.3
  • 29
    • 33745236124
    • Exploration of different types of intonational deviations in foreign-accented and synthesized speech
    • Lisbon, Portugal
    • Jilka, M.; 2005. Exploration of different types of intonational deviations in foreign-accented and synthesized speech. In: Proc. Interspeech-2005, Lisbon, Portugal, pp. 2393-2396.
    • (2005) Proc. Interspeech-2005 , pp. 2393-2396
    • Jilka, M.1
  • 34
    • 0031718672
    • Validity of rating scale measures of voice quality
    • J. Kreiman, and B.R. Gerratt Validity of rating scale measures of voice quality J. Acoust. Soc. Amer. 104 1998 1598 1608
    • (1998) J. Acoust. Soc. Amer. , vol.104 , pp. 1598-1608
    • Kreiman, J.1    Gerratt, B.R.2
  • 35
    • 0033770063
    • Sources of listener disagreement in voice quality assessment
    • J. Kreiman, and B.R. Gerratt Sources of listener disagreement in voice quality assessment J. Acoust. Soc. Amer. 108 2000 1867 1876
    • (2000) J. Acoust. Soc. Amer. , vol.108 , pp. 1867-1876
    • Kreiman, J.1    Gerratt, B.R.2
  • 36
    • 33745192875
    • Perceptual relevance of source spectral slope measures
    • J. Kreiman, and B.R. Gerratt Perceptual relevance of source spectral slope measures J. Acoust. Soc. Amer. 115 2004 2609
    • (2004) J. Acoust. Soc. Amer. , vol.115 , pp. 2609
    • Kreiman, J.1    Gerratt, B.R.2
  • 37
    • 34848903222
    • When and why listeners disagree in voice quality assessment
    • J. Kreiman, B.R. Gerratt, and M. Ito When and why listeners disagree in voice quality assessment J. Acoust. Soc. Amer. 122 4 2007 2354 2364
    • (2007) J. Acoust. Soc. Amer. , vol.122 , Issue.4 , pp. 2354-2364
    • Kreiman, J.1    Gerratt, B.R.2    Ito, M.3
  • 39
    • 70450133645
    • Speech database development: Design and analysis of the acoustic-phonetic corpus
    • Noordwijkerhout, The Netherlands
    • Lamel, L.F.; Kassel, R.H.; Seneff, S.; 1989. Speech database development: Design and analysis of the acoustic-phonetic corpus. In: Proc. Speech I/O Assessment and Speech Databases, Noordwijkerhout, The Netherlands, pp. 2161-2170.
    • (1989) Proc. Speech I/O Assessment and Speech Databases , pp. 2161-2170
    • Lamel, L.F.1    Kassel, R.H.2    Seneff, S.3
  • 41
    • 2942744397
    • Adult-child differences in acoustic cue weighting are influenced by segmental context: Children are not always perceptually biased toward transitions
    • C. Mayo, and A. Turk Adult-child differences in acoustic cue weighting are influenced by segmental context: Children are not always perceptually biased toward transitions J. Acoust. Soc. Amer. 115 2004 3184 3194
    • (2004) J. Acoust. Soc. Amer. , vol.115 , pp. 3184-3194
    • Mayo, C.1    Turk, A.2
  • 42
    • 24944527359
    • The influence of spectral distinctiveness on acoustic cue weighting in children's and adults' speech perception
    • C. Mayo, and A. Turk The influence of spectral distinctiveness on acoustic cue weighting in children's and adults' speech perception J. Acoust. Soc. Amer. 118 2005 1730 1741
    • (2005) J. Acoust. Soc. Amer. , vol.118 , pp. 1730-1741
    • Mayo, C.1    Turk, A.2
  • 43
    • 44949218931
    • Multidimensional scaling of listener responses to synthetic speech
    • Lisbon, Portugal
    • Mayo, C.; Clark, R.A.J.; King, S.; 2005. Multidimensional scaling of listener responses to synthetic speech. In: Proc. Interspeech 2005, Lisbon, Portugal.
    • (2005) Proc. Interspeech 2005
    • Mayo, C.1    Clark, R.A.J.2    King, S.3
  • 44
    • 84555207437
    • Quality prediction for synthesized speech: Comparison of approaches
    • Rotterdam
    • Möller, S.; Falk, T.H.; 2009. Quality prediction for synthesized speech: Comparison of approaches. In: Proc. NAG/DAGA 2009, Rotterdam, pp. 1168-1171.
    • (2009) Proc. NAG/DAGA 2009 , pp. 1168-1171
    • Möller, S.1    Falk, T.H.2
  • 45
    • 1842582666
    • The role of temporal and dynamic signal components in the perception of syllable-final stop voicing
    • S. Nittrouer The role of temporal and dynamic signal components in the perception of syllable-final stop voicing J. Acoust. Soc. Amer. 115 2004 1777 1790
    • (2004) J. Acoust. Soc. Amer. , vol.115 , pp. 1777-1790
    • Nittrouer, S.1
  • 46
    • 84947259106
    • Which is more important in a concatenative text to speech system - Pitch, duration, or spectral discontinuity?
    • Jenolan Caves, Blue Mountains, Australia
    • Plumpe, M.; Meredith, S.; 1998. Which is more important in a concatenative text to speech system - pitch, duration, or spectral discontinuity? In: Proc. ESCA/COCOSDA Workshop on Speech Synthesis'98, Jenolan Caves, Blue Mountains, Australia.
    • (1998) Proc. ESCA/COCOSDA Workshop on Speech Synthesis'98
    • Plumpe, M.1    Meredith, S.2
  • 47
    • 0028857561
    • Comparing reliability of perceptual ratings of roughness and acoustic measures of jitter
    • C.R. Rabinov, J. Kreiman, B.R. Gerratt, and S. Bielamowicz Comparing reliability of perceptual ratings of roughness and acoustic measures of jitter J. Speech Hear. Res. 38 1995 26 32
    • (1995) J. Speech Hear. Res. , vol.38 , pp. 26-32
    • Rabinov, C.R.1    Kreiman, J.2    Gerratt, B.R.3    Bielamowicz, S.4
  • 51
    • 21844431547
    • Acceptability of variations in question intonation in natural and synthesised American English
    • A. Syrdal, and M. Jilka Acceptability of variations in question intonation in natural and synthesised American English J. Acoust. Soc. Amer. 115 5 2004 2543(A)
    • (2004) J. Acoust. Soc. Amer. , vol.115 , Issue.5
    • Syrdal, A.1    Jilka, M.2
  • 52
    • 33847292093
    • Acoustic segment durations in prosodic research: A practical guide
    • A. Turk, S. Nakai, and M. Sugahara Acoustic segment durations in prosodic research: A practical guide S. Sudhoff, D. Lenertova, R. Meyer, S. Pappert, P. Augurzky, I. Mleinek, N. Richter, J. Schliesser, Methods in Empirical Prosody Research 2006 De Gruyter Berlin 1 28
    • (2006) Methods in Empirical Prosody Research , pp. 1-28
    • Turk, A.1    Nakai, S.2    Sugahara, M.3
  • 55
    • 0000908449
    • On the status of temporal cues to phonetic categories: Preceding vowel duration as a cue to voicing in final stop consonants
    • C. Wardrip-Fruin On the status of temporal cues to phonetic categories: Preceding vowel duration as a cue to voicing in final stop consonants J. Acoust. Soc. Amer. 71 1982 187 195
    • (1982) J. Acoust. Soc. Amer. , vol.71 , pp. 187-195
    • Wardrip-Fruin, C.1
  • 56
    • 0021812701
    • The effect of signal degradation on the status of cues to voicing in utterance-final stop consonants
    • C. Wardrip-Fruin The effect of signal degradation on the status of cues to voicing in utterance-final stop consonants J. Acoust. Soc. Amer. 77 5 1985 1907 1912
    • (1985) J. Acoust. Soc. Amer. , vol.77 , Issue.5 , pp. 1907-1912
    • Wardrip-Fruin, C.1
  • 60
    • 33749573927
    • Reformulating the HMM as a trajectory model by imposing explicit relationships between static and dynamic feature vector sequences
    • H. Zen, K. Tokuda, and T. Kitamura Reformulating the HMM as a trajectory model by imposing explicit relationships between static and dynamic feature vector sequences Comput. Speech Lang. 21 1 2007 153 173
    • (2007) Comput. Speech Lang. , vol.21 , Issue.1 , pp. 153-173
    • Zen, H.1    Tokuda, K.2    Kitamura, T.3


* This information was extracted by KISTI through analysis of Elsevier's SCOPUS database.