Volume 132, Issue 6, 2012, Pages 3980-3989

A procedure for estimating gestural scores from speech acoustics

Author keywords

[No Author keywords available]

Indexed keywords

ACOUSTIC DISTANCE; ANALYSIS BY SYNTHESIS; DATA SETS; ITERATIVE APPROACH; NATURAL SPEECH; SPEECH ACOUSTICS; TEMPORAL PATTERNING; VOCAL-TRACTS;

EID: 84870896880     PISSN: 00014966     EISSN: None     Source Type: Journal    
DOI: 10.1121/1.4763545     Document Type: Article
Times cited: 33

References (48)
  • 1
    • 0020602364
    • Efficient coding of LPC parameters by temporal decomposition
    • Boston, MA
    • Atal, B. S. (1983). "Efficient coding of LPC parameters by temporal decomposition," in Proceedings of ICASSP, Boston, MA, pp. 81-84.
    • (1983) Proceedings of ICASSP , pp. 81-84
    • Atal, B.S.1
  • 2
    • 0027024362
    • Articulatory phonology: An overview
    • 10.1159/000261913
    • Browman, C., and Goldstein, L. (1992). "Articulatory phonology: An overview," Phonetica 49, 155-180. doi:10.1159/000261913
    • (1992) Phonetica , vol.49 , pp. 155-180
    • Browman, C.1    Goldstein, L.2
  • 3
    • 84955535347
    • Gestural specification using dynamically-defined articulatory structures
    • Browman, C. P., and Goldstein, L. (1990). "Gestural specification using dynamically-defined articulatory structures," J. Phonetics 18 (3), 299-320.
    • (1990) J. Phonetics , vol.18 , Issue.3 , pp. 299-320
    • Browman, C.P.1    Goldstein, L.2
  • 4
    • 0002956896
    • Gestural syllable position effects in American English
    • edited by F. Bell-Berti and L. J. Raphael (AIP Press, Woodbury, NY)
    • Browman, C. P., and Goldstein, L. (1995). "Gestural syllable position effects in American English," in Producing Speech: Contemporary Issues (for Katherine Safford Harris), edited by F. Bell-Berti and L. J. Raphael (AIP Press, Woodbury, NY), pp. 19-33.
    • (1995) Producing Speech: Contemporary Issues (For Katherine Safford Harris) , pp. 19-33
    • Browman, C.P.1    Goldstein, L.2
  • 5
    • 0011499143
    • C-centers revisited
    • 10.1159/000262183
    • Byrd, D. (1995). "C-centers revisited," Phonetica 52, 285-306. doi:10.1159/000262183
    • (1995) Phonetica , vol.52 , pp. 285-306
    • Byrd, D.1
  • 6
    • 0032039952
    • Intragestural dynamics of multiple phrasal boundaries
    • 10.1006/jpho.1998.0071
    • Byrd, D., and Saltzman, E. (1998). "Intragestural dynamics of multiple phrasal boundaries," J. Phonetics 26, 173-199. doi:10.1006/jpho.1998.0071
    • (1998) J. Phonetics , vol.26 , pp. 173-199
    • Byrd, D.1    Saltzman, E.2
  • 7
    • 0037949203
    • The elastic phrase: Modeling the dynamics of boundary-adjacent lengthening
    • 10.1016/S0095-4470(02)00085-2
    • Byrd, D., and Saltzman, E. (2003). "The elastic phrase: Modeling the dynamics of boundary-adjacent lengthening," J. Phonetics 31 (2), 149-180. doi:10.1016/S0095-4470(02)00085-2
    • (2003) J. Phonetics , vol.31 , Issue.2 , pp. 149-180
    • Byrd, D.1    Saltzman, E.2
  • 8
    • 60649116249
    • Timing effects of syllable structure and stress on nasals: A real-time MRI examination
    • 10.1016/j.wocn.2008.10.002
    • Byrd, D., Tobin, S., Bresch, E., and Narayanan, S. (2009). "Timing effects of syllable structure and stress on nasals: A real-time MRI examination," J. Phonetics 37, 97-110. doi:10.1016/j.wocn.2008.10.002
    • (2009) J. Phonetics , vol.37 , pp. 97-110
    • Byrd, D.1    Tobin, S.2    Bresch, E.3    Narayanan, S.4
  • 9
    • 20444417436
    • Prosodic strengthening and featural enhancement: Evidence from acoustic and articulatory realizations of /ɑ,i/ in English
    • 10.1121/1.1861893
    • Cho, T. (2005). "Prosodic strengthening and featural enhancement: Evidence from acoustic and articulatory realizations of /ɑ,i/ in English," J. Acoust. Soc. Am. 117 (6), 3867-3878. doi:10.1121/1.1861893
    • (2005) J. Acoust. Soc. Am. , vol.117 , Issue.6 , pp. 3867-3878
    • Cho, T.1
  • 10
    • 34249884665
    • Manifestation of prosodic structure in articulatory variation: Evidence from lip movement kinematics in English
    • edited by L. Goldstein, D. H. Whalen, and C. Best (Walter de Gruyter, Berlin)
    • Cho, T. (2006). "Manifestation of prosodic structure in articulatory variation: Evidence from lip movement kinematics in English," in Laboratory Phonology 8 (Phonology and Phonetics), edited by L. Goldstein, D. H. Whalen, and C. Best (Walter de Gruyter, Berlin), pp. 519-548.
    • (2006) Laboratory Phonology 8 (Phonology and Phonetics) , pp. 519-548
    • Cho, T.1
  • 11
    • 0028795574
    • The supraglottal articulation of prominence in English: Linguistic stress as localized hyperarticulation
    • 10.1121/1.412275
    • de Jong, K. J. (1995). "The supraglottal articulation of prominence in English: Linguistic stress as localized hyperarticulation," J. Acoust. Soc. Am. 97 (1), 491-504. doi:10.1121/1.412275
    • (1995) J. Acoust. Soc. Am. , vol.97 , Issue.1 , pp. 491-504
    • De Jong, K.J.1
  • 12
    • 0031009252
    • Articulatory strengthening at edges of prosodic domains
    • 10.1121/1.418332
    • Fougeron, C., and Keating, P. A. (1997). "Articulatory strengthening at edges of prosodic domains," J. Acoust. Soc. Am. 101, 3728-3740. doi:10.1121/1.418332
    • (1997) J. Acoust. Soc. Am. , vol.101 , pp. 3728-3740
    • Fougeron, C.1    Keating, P.A.2
  • 13
    • 0001109477
    • Coarticulation and theories of extrinsic timing
    • Fowler, C. (1980). "Coarticulation and theories of extrinsic timing," J. Phonetics 8, 113-133.
    • (1980) J. Phonetics , vol.8 , pp. 113-133
    • Fowler, C.1
  • 14
    • 0022548705
    • On the role of speech transition for speech perception
    • 10.1121/1.393842
    • Furui, S. (1986). "On the role of speech transition for speech perception," J. Acoust. Soc. Am. 80 (4), 1016-1025. doi:10.1121/1.393842
    • (1986) J. Acoust. Soc. Am. , vol.80 , Issue.4 , pp. 1016-1025
    • Furui, S.1
  • 15
    • 84928280004
    • The role of vocal tract gestural action units in understanding the evolution of phonology
    • edited by M. Arbib (Cambridge University, Cambridge)
    • Goldstein, L., Byrd, D., and Saltzman, E. (2006). "The role of vocal tract gestural action units in understanding the evolution of phonology," in From Action to Language: The Mirror Neuron System, edited by M. Arbib (Cambridge University, Cambridge), pp. 215-249.
    • (2006) From Action to Language: The Mirror Neuron System , pp. 215-249
    • Goldstein, L.1    Byrd, D.2    Saltzman, E.3
  • 17
    • 0036711819
    • A quasiarticulatory approach to controlling acoustic source parameters in a Klatt-type formant synthesizer using HLsyn
    • 10.1121/1.1498851
    • Hanson, H. M., and Stevens, K. N. (2002). "A quasiarticulatory approach to controlling acoustic source parameters in a Klatt-type formant synthesizer using HLsyn," J. Acoust. Soc. Am. 112 (3), 1158-1182. doi:10.1121/1.1498851
    • (2002) J. Acoust. Soc. Am. , vol.112 , Issue.3 , pp. 1158-1182
    • Hanson, H.M.1    Stevens, K.N.2
  • 19
    • 79959812754
    • FSM-based pronunciation modeling using articulatory phonological code
    • Hu, C., Zhuang, X., and Hasegawa-Johnson, M. (2010). "FSM-based pronunciation modeling using articulatory phonological code," in Proceedings of Interspeech, pp. 2274-2277.
    • (2010) Proceedings of Interspeech , pp. 2274-2277
    • Hu, C.1    Zhuang, X.2    Hasegawa-Johnson, M.3
  • 20
    • 38849200444
    • Probabilistic landmark detection for automatic speech recognition using acoustic-phonetic information
    • 10.1121/1.2823754
    • Juneja, A., and Espy-Wilson, C. (2008). "Probabilistic landmark detection for automatic speech recognition using acoustic-phonetic information," J. Acoust. Soc. Am. 123 (2), 1154-1168. doi:10.1121/1.2823754
    • (2008) J. Acoust. Soc. Am. , vol.123 , Issue.2 , pp. 1154-1168
    • Juneja, A.1    Espy-Wilson, C.2
  • 21
    • 0029753859
    • Deriving gestural scores from articulator-movement records using weighted temporal decomposition
    • 10.1109/TSA.1996.481448
    • Jung, T. P., Krishnamurthy, A. K., Ahalt, S. C., Beckman, M. E., and Lee, S. H. (1996). "Deriving gestural scores from articulator-movement records using weighted temporal decomposition," IEEE Trans. Speech Audio Process. 4 (1), 2-18. doi:10.1109/TSA.1996.481448
    • (1996) IEEE Trans. Speech Audio Process. , vol.4 , Issue.1 , pp. 2-18
    • Jung, T.P.1    Krishnamurthy, A.K.2    Ahalt, S.C.3    Beckman, M.E.4    Lee, S.H.5
  • 22
    • 84953656527
    • From general to language-specific capacities: The WRAPSA model of how speech perception develops
    • Jusczyk, P. (1993). "From general to language-specific capacities: The WRAPSA model of how speech perception develops," J. Phonetics 21, 3-28.
    • (1993) J. Phonetics , vol.21 , pp. 3-28
    • Jusczyk, P.1
  • 23
    • 33745746741
    • Syllable position effects and gestural organization: Articulatory evidence from Russian
    • edited by L. Goldstein, D. Whalen, and C. Best (Mouton de Gruyter, Berlin)
    • Kochetov, A. (2006). "Syllable position effects and gestural organization: Articulatory evidence from Russian," in Papers in Laboratory Phonology 8, edited by L. Goldstein, D. Whalen, and C. Best (Mouton de Gruyter, Berlin), pp. 565-588.
    • (2006) Papers in Laboratory Phonology 8 , pp. 565-588
    • Kochetov, A.1
  • 24
    • 0002635113
    • Physiological organization of syllables: A review
    • 10.1006/jpho.1999.0089
    • Krakow, R. A. (1999). "Physiological organization of syllables: A review," J. Phonetics 27, 23-54. doi:10.1006/jpho.1999.0089
    • (1999) J. Phonetics , vol.27 , pp. 23-54
    • Krakow, R.A.1
  • 26
    • 0016567060
    • Automatic segmentation of speech into syllabic units
    • 10.1121/1.380738
    • Mermelstein, P. (1975). "Automatic segmentation of speech into syllabic units," J. Acoust. Soc. Am. 58, 880-883. doi:10.1121/1.380738
    • (1975) J. Acoust. Soc. Am. , vol.58 , pp. 880-883
    • Mermelstein, P.1
  • 27
    • 78649390043
    • Retrieving tract variables from acoustics: A comparison of different machine learning strategies
    • 10.1109/JSTSP.2010.2076013
    • Mitra, V., Nam, H., Espy-Wilson, C., Saltzman, E., and Goldstein, L. (2010a). "Retrieving tract variables from acoustics: A comparison of different machine learning strategies," IEEE J. Sel. Top. Signal Process. 4 (6), 1027-1045. doi:10.1109/JSTSP.2010.2076013
    • (2010) IEEE J. Sel. Top. Signal Process. , vol.4 , Issue.6 , pp. 1027-1045
    • Mitra, V.1    Nam, H.2    Espy-Wilson, C.3    Saltzman, E.4    Goldstein, L.5
  • 28
    • 79959813685
    • Robust word recognition using articulatory trajectories and gestures
    • Makuhari, Japan
    • Mitra, V., Nam, H., Espy-Wilson, C., Saltzman, E., and Goldstein, L. (2010b). "Robust word recognition using articulatory trajectories and gestures," in Proceedings of Interspeech, Makuhari, Japan, pp. 2038-2041.
    • (2010) Proceedings of Interspeech , pp. 2038-2041
    • Mitra, V.1    Nam, H.2    Espy-Wilson, C.3    Saltzman, E.4    Goldstein, L.5
  • 29
    • 80051649631
    • Gesture-based dynamic Bayesian network for noise robust speech recognition
    • Prague, Czech Republic
    • Mitra, V., Nam, H., Espy-Wilson, C., Saltzman, E., and Goldstein, L. (2011). "Gesture-based dynamic Bayesian network for noise robust speech recognition," in Proceedings of ICASSP, Prague, Czech Republic, pp. 5172-5175.
    • (2011) Proceedings of ICASSP , pp. 5172-5175
    • Mitra, V.1    Nam, H.2    Espy-Wilson, C.3    Saltzman, E.4    Goldstein, L.5
  • 30
    • 84859977994
    • Temporal planning in speech: Syllable structure as coupling graph
    • 10.1016/j.wocn.2012.02.002
    • Mooshammer, C., Goldstein, L., Nam, H., McClure, S., Saltzman, E., and Tiede, M. (2012). "Temporal planning in speech: Syllable structure as coupling graph," J. Phonetics 40 (3), 374-389. doi:10.1016/j.wocn.2012.02.002
    • (2012) J. Phonetics , vol.40 , Issue.3 , pp. 374-389
    • Mooshammer, C.1    Goldstein, L.2    Nam, H.3    McClure, S.4    Saltzman, E.5    Tiede, M.6
  • 31
    • 39149100388
    • Syllable-level intergestural timing model: Split-gesture dynamics focusing on positional asymmetry and moraic structure
    • edited by J. Cole and J. I. Hualde (Walter de Gruyter, Berlin)
    • Nam, H. (2007). "Syllable-level intergestural timing model: Split-gesture dynamics focusing on positional asymmetry and moraic structure," in Laboratory Phonology 9 (Phonology and Phonetics), edited by J. Cole and J. I. Hualde (Walter de Gruyter, Berlin), pp. 483-506.
    • (2007) Laboratory Phonology 9 (Phonology and Phonetics) , pp. 483-506
    • Nam, H.1
  • 32
    • 70349207706
    • TADA: An enhanced, portable task dynamics model in Matlab
    • Nam, H., Goldstein, L., Saltzman, E., and Byrd, D. (2004). "TADA: An enhanced, portable task dynamics model in Matlab," J. Acoust. Soc. Am. 115 (5), 2430.
    • (2004) J. Acoust. Soc. Am. , vol.115 , Issue.5 , pp. 2430
    • Nam, H.1    Goldstein, L.2    Saltzman, E.3    Byrd, D.4
  • 34
    • 78049363406
    • Reconstructing the full tongue contour from EMA/X-Ray microbeam
    • Qin, C., and Carreira-Perpiñán, M. (2010). "Reconstructing the full tongue contour from EMA/X-Ray microbeam," in Proceedings of ICASSP, pp. 4190-4193.
    • (2010) Proceedings of ICASSP , pp. 4190-4193
    • Qin, C.1    Carreira-Perpiñán, M.2
  • 36
    • 0017930815
    • Dynamic programming algorithm optimization for spoken word recognition
    • 10.1109/TASSP.1978.1163055
    • Sakoe, H., and Chiba, S. (1978). "Dynamic programming algorithm optimization for spoken word recognition," IEEE Trans. Acoust., Speech, Signal Process. 26 (1), 43-49. doi:10.1109/TASSP.1978.1163055
    • (1978) IEEE Trans. Acoust., Speech, Signal Process. , vol.26 , Issue.1 , pp. 43-49
    • Sakoe, H.1    Chiba, S.2
  • 37
    • 77956779481
    • A dynamical approach to gestural patterning in speech production
    • 10.1207/s15326969eco0104_2
    • Saltzman, E., and Munhall, K. (1989). "A dynamical approach to gestural patterning in speech production," Ecological Psychol. 1 (4), 332-382. doi:10.1207/s15326969eco0104_2
    • (1989) Ecological Psychol. , vol.1 , Issue.4 , pp. 332-382
    • Saltzman, E.1    Munhall, K.2
  • 38
    • 84902687217
    • A task-dynamic toolkit for modeling the effects of prosodic structure on articulation
    • Campinas, Brazil
    • Saltzman, E., Nam, H., Krivokapic, J., and Goldstein, L. (2008). "A task-dynamic toolkit for modeling the effects of prosodic structure on articulation," in Proceedings of Speech Prosody, Campinas, Brazil, pp. 175-184.
    • (2008) Proceedings of Speech Prosody , pp. 175-184
    • Saltzman, E.1    Nam, H.2    Krivokapic, J.3    Goldstein, L.4
  • 39
    • 84933380756
    • A simplified derivation of linear least square smoothing and prediction theory
    • 10.1109/JRPROC.1950.231821
    • Shannon, C. E., and Bode, H. (1950). "A simplified derivation of linear least square smoothing and prediction theory," Proc. IRE 38, 417-425. doi:10.1109/JRPROC.1950.231821
    • (1950) Proc. IRE , vol.38 , pp. 417-425
    • Shannon, C.E.1    Bode, H.2
  • 40
    • 0003806548
    • Evidence for the role of acoustic boundaries in the perception of speech sounds
    • edited by V. A. Fromkin (Academic Press, Orlando, FL)
    • Stevens, K. N. (1985). "Evidence for the role of acoustic boundaries in the perception of speech sounds," in Phonetic Linguistics: Essays in Honor of Peter Ladefoged, edited by V. A. Fromkin (Academic Press, Orlando, FL), pp. 243-255.
    • (1985) Phonetic Linguistics: Essays in Honor of Peter Ladefoged , pp. 243-255
    • Stevens, K.N.1
  • 41
    • 85135109310
    • Implementation of a model for lexical access based on features
    • International Speech Communication Association, Banff, Alberta
    • Stevens, K. N., Manuel, S. Y., Shattuck-Hufnagel, S., and Liu, S. (1992). "Implementation of a model for lexical access based on features," in International Conference on Spoken Language Processing, International Speech Communication Association, Banff, Alberta, pp. 499-502.
    • (1992) International Conference on Spoken Language Processing , pp. 499-502
    • Stevens, K.N.1    Manuel, S.Y.2    Shattuck-Hufnagel, S.3    Liu, S.4
  • 42
    • 0036165806
    • An overlapping-feature-based phonological model incorporating linguistic constraints: Applications to speech recognition
    • 10.1121/1.1420380
    • Sun, J. P., and Deng, L. (2002). "An overlapping-feature-based phonological model incorporating linguistic constraints: Applications to speech recognition," J. Acoust. Soc. Am. 111 (2), 1086-1101. doi:10.1121/1.1420380
    • (2002) J. Acoust. Soc. Am. , vol.111 , Issue.2 , pp. 1086-1101
    • Sun, J.P.1    Deng, L.2
  • 43
    • 78649348268
    • Annotation and use of speech production corpus for building language universal speech recognizers
    • Beijing, China, Vol. 3
    • Sun, J. P., Jing, X., and Deng, L. (2000). "Annotation and use of speech production corpus for building language universal speech recognizers," in Proceedings of International Symposium on Chinese Spoken Language Processing, Beijing, China, Vol. 3, pp. 31-34.
    • (2000) Proceedings of International Symposium on Chinese Spoken Language Processing , vol.3 , pp. 31-34
    • Sun, J.P.1    Jing, X.2    Deng, L.3
  • 45
    • 0030642436
    • The domain of accentual lengthening in American English
    • 10.1006/jpho.1996.0032
    • Turk, A. E., and Sawusch, J. R. (1997). "The domain of accentual lengthening in American English," J. Phonetics 25, 25-41. doi:10.1006/jpho.1996.0032
    • (1997) J. Phonetics , vol.25 , pp. 25-41
    • Turk, A.E.1    Sawusch, J.R.2
  • 47
    • 84858956763
    • Speaker identification on the SCOTUS corpus
    • 10.1121/1.2935783
    • Yuan, J., and Liberman, M. (2008). "Speaker identification on the SCOTUS corpus," J. Acoust. Soc. Am. 123 (5), 3878. doi:10.1121/1.2935783
    • (2008) J. Acoust. Soc. Am. , vol.123 , Issue.5 , pp. 3878
    • Yuan, J.1    Liberman, M.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.