Volume 4, Issue 6, 2010, Pages 1027-1045

Retrieving tract variables from acoustics: A comparison of different machine learning strategies

Author keywords

Articulatory phonology; articulatory speech recognition (ASR); artificial neural networks (ANNs); coarticulation; distal supervised learning; mixture density networks; speech inversion; task dynamic and applications model; vocal tract variables

Indexed keywords

ARTICULATORY PHONOLOGY; ARTICULATORY SPEECH RECOGNITION (ASR); ARTIFICIAL NEURAL NETWORKS; CO-ARTICULATION; DISTAL SUPERVISED LEARNING; MIXTURE DENSITY NETWORKS; SPEECH INVERSION; TASK DYNAMIC AND APPLICATIONS MODEL; VOCAL-TRACTS

EID: 78649390043     PISSN: 19324553     EISSN: None     Source Type: Journal    
DOI: 10.1109/JSTSP.2010.2076013     Document Type: Article
Times cited : (49)

References (123)
  • 1
    • 0017968519
    • Inversion of articulatory-to-acoustic transformation in the vocal tract by a computer sorting technique
    • B. S. Atal, J. J. Chang, M. V. Mathews, and J. W. Tukey, "Inversion of articulatory-to-acoustic transformation in the vocal tract by a computer sorting technique," J. Acoust. Soc. Amer., vol. 63, pp. 1535-1555, 1978.
    • (1978) J. Acoust. Soc. Amer. , vol.63 , pp. 1535-1555
    • Atal, B.S.1    Chang, J.J.2    Mathews, M.V.3    Tukey, J.W.4
  • 2
    • 0020602364
    • Efficient coding of LPC parameters by temporal decomposition
    • B. S. Atal, "Efficient coding of LPC parameters by temporal decomposition," in Proc. ICASSP, 1983, pp. 81-84.
    • (1983) Proc. ICASSP , pp. 81-84
    • Atal, B.S.1
  • 3
    • 0004113976
    • Mixture density networks
    • Dept., Comput. Sci., Aston Univ., Birmingham, U.K., Tech. Rep. NCRG/4288
    • C. Bishop, "Mixture density networks," Neural Computing Research Group, Dept., Comput. Sci., Aston Univ., Birmingham, U.K., Tech. Rep. NCRG/4288.
    • Neural Computing Research Group
    • Bishop, C.1
  • 4
    • 0000523613
    • Towards an articulatory phonology
    • C. P. Browman and L. Goldstein, "Towards an articulatory phonology," Phonol. Yearbook, vol. 85, pp. 219-252, 1986.
    • (1986) Phonol. Yearbook , vol.85 , pp. 219-252
    • Browman, C.P.1    Goldstein, L.2
  • 5
    • 0024150474
    • Some notes on syllable structure in articulatory phonology
    • C. P. Browman and L. Goldstein, "Some notes on syllable structure in articulatory phonology," Phonetica, vol. 45, pp. 140-155, 1988.
    • (1988) Phonetica , vol.45 , pp. 140-155
    • Browman, C.P.1    Goldstein, L.2
  • 6
    • 84971737266
    • Articulatory gestures as phonological units
    • C. P. Browman and L. Goldstein, "Articulatory gestures as phonological units," Phonol., vol. 6, pp. 201-251, 1989.
    • (1989) Phonol. , vol.6 , pp. 201-251
    • Browman, C.P.1    Goldstein, L.2
  • 7
    • 84955535347
    • Gestural specification using dynamically-defined articulatory structures
    • C. P. Browman and L. Goldstein, "Gestural specification using dynamically-defined articulatory structures," J. Phonetics, vol. 18, no. 3, pp. 299-320, 1990.
    • (1990) J. Phonetics , vol.18 , Issue.3 , pp. 299-320
    • Browman, C.P.1    Goldstein, L.2
  • 8
    • 0006080506
    • Representation and reality: Physical systems and phonological structure
    • C. P. Browman and L. Goldstein, "Representation and reality: Physical systems and phonological structure," J. Phonetics, vol. 18, pp. 411-424, 1990.
    • (1990) J. Phonetics , vol.18 , pp. 411-424
    • Browman, C.P.1    Goldstein, L.2
  • 9
    • 0001577222
    • Tiers in articulatory phonology, with some implications for casual speech
    • J. Kingston and M. E. Beckman, Eds. Cambridge, U.K.: Cambridge Univ. Press
    • C. P. Browman and L. Goldstein, "Tiers in articulatory phonology, with some implications for casual speech," in Papers in Lab. Phon. I: Between the Grammar and the Physics of Speech, J. Kingston and M. E. Beckman, Eds. Cambridge, U.K.: Cambridge Univ. Press, 1991, pp. 341-376.
    • (1991) Papers in Lab. Phon. I: Between the Grammar and the Physics of Speech , pp. 341-376
    • Browman, C.P.1    Goldstein, L.2
  • 10
    • 0027024362
    • Articulatory phonology: An overview
    • C. P. Browman and L. Goldstein, "Articulatory phonology: An overview," Phonetica, vol. 49, pp. 155-180, 1992.
    • (1992) Phonetica , vol.49 , pp. 155-180
    • Browman, C.P.1    Goldstein, L.2
  • 11
    • 0037949203
    • The elastic phrase: Modeling the dynamics of boundary-adjacent lengthening
    • D. Byrd and E. Saltzman, "The elastic phrase: Modeling the dynamics of boundary-adjacent lengthening," J. Phonetics, vol. 31, no. 2, pp. 149-180, 2003.
    • (2003) J. Phonetics , vol.31 , Issue.2 , pp. 149-180
    • Byrd, D.1    Saltzman, E.2
  • 13
    • 26444619785
    • An elitist approach to automatic articulatory-acoustic feature classification for phonetic characterization of spoken language
    • Nov.
    • S. Chang, M. Wester, and S. Greenberg, "An elitist approach to automatic articulatory-acoustic feature classification for phonetic characterization of spoken language," Speech Commun., vol. 47, no. 3, pp. 290-311, Nov. 2005.
    • (2005) Speech Commun. , vol.47 , Issue.3 , pp. 290-311
    • Chang, S.1    Wester, M.2    Greenberg, S.3
  • 14
    • 85009064164
    • Place of articulation cues for voiced and voiceless plosives and fricatives in syllable-initial position
    • S. Chen and A. Alwan, "Place of articulation cues for voiced and voiceless plosives and fricatives in syllable-initial position," in Proc. ICSLP, 2000, vol. 4, pp. 113-116.
    • (2000) Proc. ICSLP , vol.4 , pp. 113-116
    • Chen, S.1    Alwan, A.2
  • 17
    • 0000707529
    • The internal organization of speech sounds
    • J. A. Goldsmith, Ed. Cambridge, U.K.: Blackwell
    • G. N. Clements and E. V. Hume, "The internal organization of speech sounds," in Handbook of Phonological Theory, J. A. Goldsmith, Ed. Cambridge, U.K.: Blackwell, 1995.
    • (1995) Handbook of Phonological Theory
    • Clements, G.N.1    Hume, E.V.2
  • 18
    • 0001887625
    • Performing fine phonetic distinctions: Templates versus features
    • J. S. Perkell and D. Klatt, Eds. Hillsdale, NJ: Lawrence Erlbaum Assoc. ch. 15
    • R. Cole, R. M. Stern, and M. J. Lasry, "Performing fine phonetic distinctions: Templates versus features," in Invariance and Variability of Speech Processes, J. S. Perkell and D. Klatt, Eds. Hillsdale, NJ: Lawrence Erlbaum Assoc., 1986, ch. 15, pp. 325-345.
    • (1986) Invariance and Variability of Speech Processes , pp. 325-345
    • Cole, R.1    Stern, R.M.2    Lasry, M.J.3
  • 20
    • 0026372938
    • Microstructural speech units and their HMM representations for discrete utterance speech recognition
    • L. Deng and K. Erler, "Microstructural speech units and their HMM representations for discrete utterance speech recognition," in Proc. ICASSP, 1991, pp. 193-196.
    • (1991) Proc. ICASSP , pp. 193-196
    • Deng, L.1    Erler, K.2
  • 21
    • 0026854213
    • A generalized hidden Markov model with state-conditioned trend functions of time for the speech signal
    • L. Deng, "A generalized hidden Markov model with state-conditioned trend functions of time for the speech signal," Signal Process., vol. 27, no. 1, pp. 65-78, 1992.
    • (1992) Signal Process. , vol.27 , Issue.1 , pp. 65-78
    • Deng, L.1
  • 22
    • 0028234947
    • A statistical approach to ASR using atomic units constructed from overlapping articulatory features
    • L. Deng and D. Sun, "A statistical approach to ASR using atomic units constructed from overlapping articulatory features," J. Acoust. Soc. Amer., vol. 95, pp. 2702-2719, 1994.
    • (1994) J. Acoust. Soc. Amer. , vol.95 , pp. 2702-2719
    • Deng, L.1    Sun, D.2
  • 23
    • 85079090910
    • Phonetic classification and recognition using HMM representation of overlapping articulator features for all classes of English sounds
    • L. Deng and D. Sun, "Phonetic classification and recognition using HMM representation of overlapping articulator features for all classes of English sounds," in Proc. ICASSP, 1994, pp. 45-47.
    • (1994) Proc. ICASSP , pp. 45-47
    • Deng, L.1    Sun, D.2
  • 24
    • 0031198059
    • Production models as a structural basis for automatic speech recognition
    • L. Deng, G. Ramsay, and D. Sun, "Production models as a structural basis for automatic speech recognition," Spec. Iss. Speech Prod. Modeling, Speech Commun., vol. 22, no. 2, pp. 93-112, 1997.
    • (1997) Spec. Iss. Speech Prod. Modeling, Speech Commun. , vol.22 , Issue.2 , pp. 93-112
    • Deng, L.1    Ramsay, G.2    Sun, D.3
  • 25
    • 0032119268
    • A dynamic, feature-based approach to the interface between phonology and phonetics for speech modeling and recognition
    • L. Deng, "A dynamic, feature-based approach to the interface between phonology and phonetics for speech modeling and recognition," Speech Commun., vol. 24, no. 4, pp. 299-323, 1998.
    • (1998) Speech Commun. , vol.24 , Issue.4 , pp. 299-323
    • Deng, L.1
  • 26
    • 0033623527
    • Spontaneous speech recognition using a statistical coarticulatory model for the hidden vocal-tract-resonance dynamics
    • L. Deng and J. Ma, "Spontaneous speech recognition using a statistical coarticulatory model for the hidden vocal-tract-resonance dynamics," J. Acoust. Soc. Amer., vol. 108, no. 6, pp. 3036-3048, 2000.
    • (2000) J. Acoust. Soc. Amer. , vol.108 , Issue.6 , pp. 3036-3048
    • Deng, L.1    Ma, J.2
  • 27
    • 4544323815
    • A structured speech model with continuous hidden dynamics and prediction-residual training for tracking vocal tract resonances
    • L. Deng, L. Lee, H. Attias, and A. Acero, "A structured speech model with continuous hidden dynamics and prediction-residual training for tracking vocal tract resonances," in Proc. ICASSP, 2004, pp. I557-I560.
    • (2004) Proc. ICASSP
    • Deng, L.1    Lee, L.2    Attias, H.3    Acero, A.4
  • 29
    • 27644525945
    • Use of temporal information: Detection of the periodicity and aperiodicity profile of speech
    • Sep.
    • O. Deshmukh, C. Espy-Wilson, A. Salomon, and J. Singh, "Use of temporal information: Detection of the periodicity and aperiodicity profile of speech," IEEE Trans. Speech Audio Process., vol. 13, no. 5, pp. 776-786, Sep. 2005.
    • (2005) IEEE Trans. Speech Audio Process. , vol.13 , Issue.5 , pp. 776-786
    • Deshmukh, O.1    Espy-Wilson, C.2    Salomon, A.3    Singh, J.4
  • 31
    • 0006036234
    • Phoneme recognition with an artificial neural network
    • K. Elenius and G. Takács, "Phoneme recognition with an artificial neural network," in Proc. Eurospeech, 1991, pp. 121-124.
    • (1991) Proc. Eurospeech , pp. 121-124
    • Elenius, K.1    Takács, G.2
  • 32
    • 0037518143
    • Comparing phoneme and feature based speech recognition using artificial neural networks
    • K. Elenius and M. Blomberg, "Comparing phoneme and feature based speech recognition using artificial neural networks," in Proc. ICSLP, 1992, pp. 1279-1282.
    • (1992) Proc. ICSLP , pp. 1279-1282
    • Elenius, K.1    Blomberg, M.2
  • 33
    • 0027627252
    • Hidden Markov model representation of quantized articulatory features for speech recognition
    • K. Erler and L. Deng, "Hidden Markov model representation of quantized articulatory features for speech recognition," Comput., Speech, Lang., vol. 7, pp. 265-282, 1993.
    • (1993) Comput., Speech, Lang. , vol.7 , pp. 265-282
    • Erler, K.1    Deng, L.2
  • 34
    • 33646663971
    • The relevance of F4 in distinguishing between different articulatory configurations of American English /r/
    • C. Y. Espy-Wilson and S. E. Boyce, "The relevance of F4 in distinguishing between different articulatory configurations of American English /r/," J. Acoust. Soc. Amer., vol. 105, no. 2, p. 1400, 1999.
    • (1999) J. Acoust. Soc. Amer. , vol.105 , Issue.2 , pp. 1400
    • Espy-Wilson, C.Y.1    Boyce, S.E.2
  • 36
    • 0013631878
    • Coordination and coarticulation in speech production
    • C. A. Fowler and E. Saltzman, "Coordination and coarticulation in speech production," Lang. Speech, vol. 36, pp. 171-195, 1993.
    • (1993) Lang. Speech , vol.36 , pp. 171-195
    • Fowler, C.A.1    Saltzman, E.2
  • 37
    • 0002441991
    • Coarticulation resistance of American English consonants and its effects on transconsonantal vowel-to-vowel coarticulation
    • C. A. Fowler and L. Brancazio, "Coarticulation resistance of American English consonants and its effects on transconsonantal vowel-to-vowel coarticulation," Lang. Speech, vol. 43, pp. 1-42, 2000.
    • (2000) Lang. Speech , vol.43 , pp. 1-42
    • Fowler, C.A.1    Brancazio, L.2
  • 38
    • 20444400371
    • Speech production and perception
    • A. Healy and R. Proctor, Eds. New York: Wiley, Experimental Psychology
    • C. A. Fowler, "Speech production and perception," in Handbook of Psychology, A. Healy and R. Proctor, Eds. New York: Wiley, 2003, vol. 4, Experimental Psychology, pp. 237-266.
    • (2003) Handbook of Psychology , vol.4 , pp. 237-266
    • Fowler, C.A.1
  • 39
    • 84994254645
    • An automatic speech recognition system using neural networks and linear dynamic models to recover and model articulatory traces
    • J. Frankel, K. Richmond, S. King, and P. Taylor, "An automatic speech recognition system using neural networks and linear dynamic models to recover and model articulatory traces," in Proc. ICSLP, 2000, vol. 4, pp. 254-257.
    • (2000) Proc. ICSLP , vol.4 , pp. 254-257
    • Frankel, J.1    Richmond, K.2    King, S.3    Taylor, P.4
  • 40
    • 58849145971
    • ASR - Articulatory speech recognition
    • J. Frankel and S. King, "ASR - Articulatory speech recognition," in Proc. Eurospeech, Denmark, 2001, pp. 599-602.
    • (2001) Proc. Eurospeech, Denmark , pp. 599-602
    • Frankel, J.1    King, S.2
  • 41
    • 85009088992
    • Articulatory feature recognition using dynamic Bayesian networks
    • Korea
    • J. Frankel, M. Wester, and S. King, "Articulatory feature recognition using dynamic Bayesian networks," in Proc. Int. Conf. Spoken Lang. Process., Korea, 2004, pp. 1202-1205.
    • (2004) Proc. Int. Conf. Spoken Lang. Process. , pp. 1202-1205
    • Frankel, J.1    Wester, M.2    King, S.3
  • 42
    • 33745225408
    • A hybrid ANN/DBN approach to articulatory feature recognition
    • J. Frankel and S. King, "A hybrid ANN/DBN approach to articulatory feature recognition," in Proc. Eurospeech, Interspeech, 2005, pp. 3045-3048.
    • (2005) Proc. Eurospeech, Interspeech , pp. 3045-3048
    • Frankel, J.1    King, S.2
  • 43
    • 0015712358
    • Computer controlled radiography for observation of movements of articulatory and other human organs
    • O. Fujimura, S. Kiritani, and H. Ishida, "Computer controlled radiography for observation of movements of articulatory and other human organs," Comput. Biol. Med., vol. 3, pp. 371-384, 1973.
    • (1973) Comput. Biol. Med. , vol.3 , pp. 371-384
    • Fujimura, O.1    Kiritani, S.2    Ishida, H.3
  • 44
    • 0000154329
    • Relative invariance of articulatory movements: An iceberg model
    • J. S. Perkell and D. Klatt, Eds. Mahwah, NJ: Lawrence Erlbaum Assoc. ch. 11
    • O. Fujimura, "Relative invariance of articulatory movements: An iceberg model," in Invariance & Variability of Speech Processes, J. S. Perkell and D. Klatt, Eds. Mahwah, NJ: Lawrence Erlbaum Assoc., 1986, ch. 11, pp. 226-242.
    • (1986) Invariance & Variability of Speech Processes , pp. 226-242
    • Fujimura, O.1
  • 45
    • 0032627247
    • Development of rules for controlling the HLsyn speech synthesizer
    • H. M. Hanson, R. S. McGowan, K. N. Stevens, and R. E. Beaudoin, "Development of rules for controlling the HLsyn speech synthesizer," in Proc. ICASSP, 1999, vol. 1, pp. 85-88.
    • (1999) Proc. ICASSP , vol.1 , pp. 85-88
    • Hanson, H.M.1    McGowan, R.S.2    Stevens, K.N.3    Beaudoin, R.E.4
  • 46
    • 0036711819
    • A quasiarticulatory approach to controlling acoustic source parameters in a Klatt-type formant synthesizer using HLsyn
    • H. M. Hanson and K. N. Stevens, "A quasiarticulatory approach to controlling acoustic source parameters in a Klatt-type formant synthesizer using HLsyn," J. Acoust. Soc. Amer., vol. 112, no. 3, pp. 1158-1182, 2002.
    • (2002) J. Acoust. Soc. Amer. , vol.112 , Issue.3 , pp. 1158-1182
    • Hanson, H.M.1    Stevens, K.N.2
  • 48
    • 78649376063
    • Audiovisual speech recognition with articulator positions as hidden variables
    • Germany
    • M. Hasegawa-Johnson, K. Livescu, P. Lal, and K. Saenko, "Audiovisual speech recognition with articulator positions as hidden variables," in Proc. ICPhS, Saarbrücken, Germany, 2007, pp. 297-302.
    • (2007) Proc. ICPhS, Saarbrücken , pp. 297-302
    • Hasegawa-Johnson, M.1    Livescu, K.2    Lal, P.3    Saenko, K.4
  • 50
    • 33745805403
    • A fast learning algorithm for deep belief nets
    • G. E. Hinton, S. Osindero, and Y. Teh, "A fast learning algorithm for deep belief nets," Neural Comput., vol. 18, pp. 1527-1554, 2006.
    • (2006) Neural Comput. , vol.18 , pp. 1527-1554
    • Hinton, G.E.1    Osindero, S.2    Teh, Y.3
  • 51
    • 0029843107
    • Accurate recovery of articulator positions from acoustics: New conclusions based on human data
    • J. Hogden, A. Löfqvist, V. Gracco, I. Zlokarnik, P. Rubin, and E. Saltzman, "Accurate recovery of articulator positions from acoustics: New conclusions based on human data," J. Acoust. Soc. Amer., vol. 100, no. 3, pp. 1819-1834, 1996.
    • (1996) J. Acoust. Soc. Amer. , vol.100 , Issue.3 , pp. 1819-1834
    • Hogden, J.1    Löfqvist, A.2    Gracco, V.3    Zlokarnik, I.4    Rubin, P.5    Saltzman, E.6
  • 52
    • 33846700692
    • An articulatorily constrained, maximum likelihood approach to speech recognition
    • Tech. Rep. LA-UR-
    • J. Hogden, D. Nix, and P. Valdez, "An articulatorily constrained, maximum likelihood approach to speech recognition," Los Alamos National Laboratory, Los Alamos, NM, 1998, Tech. Rep. LA-UR-96-3945.
    • (1998) Los Alamos National Laboratory, Los Alamos, NM , pp. 96-3945
    • Hogden, J.1    Nix, D.2    Valdez, P.3
  • 53
    • 34247647975
    • Inverting mappings from smooth paths through R^n to paths through R^m. A technique applied to recovering articulation from acoustics
    • J. Hogden, P. Rubin, E. McDermott, S. Katagiri, and L. Goldstein, "Inverting mappings from smooth paths through R^n to paths through R^m. A technique applied to recovering articulation from acoustics," Speech Commun., vol. 49, no. 5, pp. 361-383, 2007.
    • (2007) Speech Commun. , vol.49 , Issue.5 , pp. 361-383
    • Hogden, J.1    Rubin, P.2    McDermott, E.3    Katagiri, S.4    Goldstein, L.5
  • 54
    • 0036289950
    • Triphone based unit selection for concatenative visual speech synthesis
    • F. J. Huang, E. Cosatto, and H. P. Graf, "Triphone based unit selection for concatenative visual speech synthesis," in Proc. ICASSP, Orlando, FL, 2002, vol. 2, pp. 2037-2040.
    • (2002) Proc. ICASSP, Orlando, FL , vol.2 , pp. 2037-2040
    • Huang, F.J.1    Cosatto, E.2    Graf, H.P.3
  • 55
    • 44049116478
    • Forward models: Supervised learning with a distal teacher
    • M. I. Jordan and D. E. Rumelhart, "Forward models: Supervised learning with a distal teacher," Cogn. Sci., vol. 16, pp. 307-354, 1992.
    • (1992) Cogn. Sci. , vol.16 , pp. 307-354
    • Jordan, M.I.1    Rumelhart, D.E.2
  • 57
    • 0029753859
    • Deriving gestural scores from articulator-movement records using weighted temporal decomposition
    • T. P. Jung, A. K. Krishnamurthy, S. C. Ahalt, M. E. Beckman, and S. H. Lee, "Deriving gestural scores from articulator-movement records using weighted temporal decomposition," IEEE Trans. Speech Audio Process., vol. 4, no. 1, pp. 2-18, 1996.
    • (1996) IEEE Trans. Speech Audio Process , vol.4 , Issue.1 , pp. 2-18
    • Jung, T.P.1    Krishnamurthy, A.K.2    Ahalt, S.C.3    Beckman, M.E.4    Lee, S.H.5
  • 58
    • 0034853397
    • What kind of pronunciation variation is hard for triphones to model?
    • D. Jurafsky, W. Ward, Z. Jianping, K. Herold, Y. Xiuyang, and Z. Sen, "What kind of pronunciation variation is hard for triphones to model?," in Proc. ICASSP, 2001, vol. 1, pp. 577-580.
    • (2001) Proc. ICASSP , vol.1 , pp. 577-580
    • Jurafsky, D.1    Ward, W.2    Jianping, Z.3    Herold, K.4    Xiuyang, Y.5    Sen, Z.6
  • 59
    • 70350574658
    • Face active appearance modeling and speech acoustic information to recover articulation
    • Mar.
    • A. Katsamanis, G. Papandreou, and P. Maragos, "Face active appearance modeling and speech acoustic information to recover articulation," IEEE Trans. Audio, Speech, Lang. Process., vol. 17, no. 3, pp. 411-422, Mar. 2009.
    • (2009) IEEE Trans. Audio, Speech, Lang. Process. , vol.17 , Issue.3 , pp. 411-422
    • Katsamanis, A.1    Papandreou, G.2    Maragos, P.3
  • 60
    • 0034297586
    • Detection of phonological features in continuous speech using neural networks
    • S. King and P. Taylor, "Detection of phonological features in continuous speech using neural networks," Comput., Speech, Lang., vol. 14, no. 4, pp. 333-353, 2000.
    • (2000) Comput., Speech, Lang. , vol.14 , Issue.4 , pp. 333-353
    • King, S.1    Taylor, P.2
  • 61
    • 33745198184
    • SVitchboard 1: Small vocabulary tasks from Switchboard 1
    • S. King, C. Bartels, and J. Bilmes, "SVitchboard 1: Small vocabulary tasks from Switchboard 1," in Proc. Interspeech, 2005, pp. 3385-3388.
    • (2005) Proc. Interspeech , pp. 3385-3388
    • King, S.1    Bartels, C.2    Bilmes, J.3
  • 63
    • 0036642567
    • Combining acoustic and articulatory feature information for robust speech recognition
    • K. Kirchhoff, G. A. Fink, and G. Sagerer, "Combining acoustic and articulatory feature information for robust speech recognition," Speech Commun., vol. 37, pp. 303-319, 2002.
    • (2002) Speech Commun. , vol.37 , pp. 303-319
    • Kirchhoff, K.1    Fink, G.A.2    Sagerer, G.3
  • 64
    • 78649382951
    • Application of Neural networks to articulatory motion estimation
    • T. Kobayashi, M. Yagyu, and K. Shirai, "Application of Neural networks to articulatory motion estimation," in Proc. ICASSP, 1985, pp. 1001-1104.
    • (1985) Proc. ICASSP , pp. 1001-1104
    • Kobayashi, T.1    Yagyu, M.2    Shirai, K.3
  • 67
    • 0020300423
    • Acoustic-phonetic analysis based on an articulatory model
    • J. P. Haton, Ed. Dordrecht, The Netherlands: D. Reidel
    • B. Lochschmidt, "Acoustic-phonetic analysis based on an articulatory model," in Automatic Speech Analysis and Recognition, J. P. Haton, Ed. Dordrecht, The Netherlands: D. Reidel, 1982, pp. 139-152.
    • (1982) Automatic Speech Analysis and Recognition , pp. 139-152
    • Lochschmidt, B.1
  • 69
    • 0001523807
    • A path-stack algorithm for optimizing dynamic regimes in a statistical hidden dynamic model of speech
    • J. Ma and L. Deng, "A path-stack algorithm for optimizing dynamic regimes in a statistical hidden dynamic model of speech," Comput., Speech, Lang., vol. 14, pp. 101-104, 2000.
    • (2000) Comput., Speech, Lang. , vol.14 , pp. 101-104
    • Ma, J.1    Deng, L.2
  • 70
    • 84916524147
    • Universal and language particular aspects of vowel-to-vowel coarticulation
    • Stat. Rep. Speech Res. SR-77/78
    • S. Y. Manuel and R. A. Krakow, "Universal and language particular aspects of vowel-to-vowel coarticulation," Haskins Lab. Stat. Rep. Speech Res. SR-77/78, pp. 69-78, 1984.
    • (1984) Haskins Lab. , pp. 69-78
    • Manuel, S.Y.1    Krakow, R.A.2
  • 71
    • 0025162662
    • The role of contrast in limiting vowel-to-vowel coarticulation in different languages
    • S. Y. Manuel, "The role of contrast in limiting vowel-to-vowel coarticulation in different languages," J. Acoust. Soc. Amer., vol. 88, pp. 1286-1298, 1990.
    • (1990) J. Acoust. Soc. Amer. , vol.88 , pp. 1286-1298
    • Manuel, S.Y.1
  • 72
    • 29444436962
    • Integration of articulatory and spectrum features based on the hybrid HMM/BN modeling framework
    • DOI 10.1016/j.specom.2005.07.003, PII S0167639305001731
    • K. Markov, J. Dang, and S. Nakamura, "Integration of articulatory and spectrum features based on the hybrid HMM/BN modeling framework," Speech Commun., vol. 48, pp. 161-175, 2006. (Pubitemid 43012029)
    • (2006) Speech Communication , vol.48 , Issue.2 , pp. 161-175
    • Markov, K.1    Dang, J.2    Nakamura, S.3
  • 73
    • 63149189029
    • Phonetics and linguistic evolution
    • B. Malmberg, Ed. Amsterdam, The Netherlands: North-Holland
    • A. Martinet, "Phonetics and linguistic evolution," in Manual of Phon., B. Malmberg, Ed. Amsterdam, The Netherlands: North-Holland, 1957, pp. 252-272.
    • (1957) Manual of Phon. , pp. 252-272
    • Martinet, A.1
  • 74
    • 0028375762
    • Recovering articulatory movement from formant frequency trajectories using task dynamics and a genetic algorithm: Preliminary model tests
    • R. S. McGowan, "Recovering articulatory movement from formant frequency trajectories using task dynamics and a genetic algorithm: Preliminary model tests," Speech Commun., vol. 14, no. 1, pp. 19-48, 1994.
    • (1994) Speech Commun. , vol.14 , Issue.1 , pp. 19-48
    • McGowan, R.S.1
  • 75
    • 85009240321
    • A flexible stream architecture for ASR using articulatory features
    • F. Metze and A. Waibel, "A flexible stream architecture for ASR using articulatory features," in Proc. ICSLP, 2002, pp. 2133-2136.
    • (2002) Proc. ICSLP , pp. 2133-2136
    • Metze, F.1    Waibel, A.2
  • 76
    • 0031624622
    • Improved phone recognition using Bayesian Triphone Models
    • J. Ming and F. J. Smith, "Improved phone recognition using Bayesian Triphone Models," in Proc. ICASSP, 1998, pp. 409-412.
    • (1998) Proc. ICASSP , pp. 409-412
    • Ming, J.1    Smith, F.J.2
  • 78
  • 79
    • 78649390028
    • A step in the realization of a speech recognition system based on gestural phonology and landmarks
    • Portland J. Acoust. Soc. Amer.
    • V. Mitra, H. Nam, and C. Espy-Wilson, "A step in the realization of a speech recognition system based on gestural phonology and landmarks," in Proc. 157th Meeting ASA, Portland, 2009, vol. 125, J. Acoust. Soc. Amer., p. 2530.
    • (2009) Proc. 157th Meeting ASA , vol.125 , pp. 2530
    • Mitra, V.1    Nam, H.2    Espy-Wilson, C.3
  • 82
    • 0027205884
    • A scaled conjugate gradient algorithm for fast supervised learning
    • M. F. Møller, "A scaled conjugate gradient algorithm for fast supervised learning," Neural Netw., vol. 6, pp. 525-533, 1993.
    • (1993) Neural Netw. , vol.6 , pp. 525-533
    • Møller, M.F.1
  • 83
    • 0017007706
    • Automatic detection and description of syllabic features in continuous speech
    • Oct.
    • R. D. Mori, P. Laface, and E. Piccolo, "Automatic detection and description of syllabic features in continuous speech," IEEE Trans. Acoust., Speech Signal Process., vol. 24, no. 5, pp. 365-379, Oct. 1976.
    • (1976) IEEE Trans. Acoust., Speech Signal Process. , vol.24 , Issue.5 , pp. 365-379
    • Mori, R.D.1    Laface, P.2    Piccolo, E.3
  • 84
    • 70349207706
    • Tada: An enhanced, portable task dynamics model in Matlab
    • 2
    • H. Nam, L. Goldstein, E. Saltzman, and D. Byrd, "Tada: An enhanced, portable task dynamics model in Matlab," J. Acoust. Soc. Amer., vol. 115, no. 5-2, p. 2430, 2004.
    • (2004) J. Acoust. Soc. Amer. , vol.115 , Issue.5 , pp. 2430
    • Nam, H.1    Goldstein, L.2    Saltzman, E.3    Byrd, D.4
  • 85
    • 84867222549
    • The acoustic to articulation mapping: Non-linear or Non-unique?
    • D. Neiberg, G. Ananthakrishnan, and O. Engwall, "The acoustic to articulation mapping: Non-linear or Non-unique?," in Proc. Interspeech, 2008, pp. 1485-1488.
    • (2008) Proc. Interspeech , pp. 1485-1488
    • Neiberg, D.1    Ananthakrishnan, G.2    Engwall, O.3
  • 86
    • 0013871855
    • Coarticulation in VCV utterances: Spectrographic measurements
    • S. E. G. Öhman, "Coarticulation in VCV utterances: Spectrographic measurements," J. Acoust. Soc. Amer., vol. 39, pp. 151-168, 1966.
    • (1966) J. Acoust. Soc. Amer. , vol.39 , pp. 151-168
    • Öhman, S.E.G.1
  • 87
    • 0010505818
    • Recovery of articulatory movements from acoustics with phonemic information
    • Bavaria, Germany
    • T. Okadome, S. Suzuki, and M. Honda, "Recovery of articulatory movements from acoustics with phonemic information," in Proc. 5th Seminar Speech Production, Bavaria, Germany, 2000, pp. 229-232.
    • (2000) Proc. 5th Seminar Speech Production , pp. 229-232
    • Okadome, T.1    Suzuki, S.2    Honda, M.3
  • 88
    • 0036298107
    • Maximum mutual information based acoustic features representation of phonological features for speech recognition
    • M. K. Omar and M. Hasegawa-Johnson, "Maximum mutual information based acoustic features representation of phonological features for speech recognition," in Proc. ICASSP, 2002, vol. 1, pp. 81-84.
    • (2002) Proc. ICASSP , vol.1 , pp. 81-84
    • Omar, M.K.1    Hasegawa-Johnson, M.2
  • 90
    • 0026675669
    • Inferring articulation and recognizing gestures from acoustics with a neural network trained on X-ray microbeam data
    • G. Papcun, J. Hochberg, T. R. Thomas, F. Laroche, J. Zachs, and S. Levy, "Inferring articulation and recognizing gestures from acoustics with a neural network trained on X-ray microbeam data," J. Acoust. Soc. Amer., vol. 92, no. 2, pp. 688-700, 1992.
    • (1992) J. Acoust. Soc. Amer. , vol.92 , Issue.2 , pp. 688-700
    • Papcun, G.1    Hochberg, J.2    Thomas, T.R.3    Laroche, F.4    Zachs, J.5    Levy, S.6
  • 91
    • 51449098747
    • An empirical investigation of the nonuniqueness in the acoustic-to-articulatory mapping
    • C. Qin and M. Á. Carreira-Perpiñán, "An empirical investigation of the nonuniqueness in the acoustic-to-articulatory mapping," in Proc. Interspeech, 2007, pp. 74-77.
    • (2007) Proc. Interspeech , pp. 74-77
    • Qin, C.1    Carreira-Perpiñán, M.Á.2
  • 92
    • 0026396339
    • Acoustic-to-articulatory parameter mapping using an assembly of neural networks
    • M. G. Rahim, W. B. Kleijn, J. Schroeter, and C. C. Goodyear, "Acoustic-to-articulatory parameter mapping using an assembly of neural networks," in Proc. ICASSP, 1991, pp. 485-488.
    • (1991) Proc. ICASSP , pp. 485-488
    • Rahim, M.G.1    Kleijn, W.B.2    Schroeter, J.3    Goodyear, C.C.4
  • 94
    • 0021687548
    • Timing constraints and coarticulation: Alveolo-palatals and sequences of alveolar + [j] in Catalan
    • D. Recasens, "Timing constraints and coarticulation: Alveolo-palatals and sequences of alveolar + [j] in Catalan," Phonetica, vol. 41, pp. 125-139, 1984.
    • (1984) Phonetica , vol.41 , pp. 125-139
    • Recasens, D.1
  • 96
    • 38549178971
    • Trajectory mixture density networks with multiple mixtures for acoustic-articulatory inversion
    • K. Richmond, "Trajectory mixture density networks with multiple mixtures for acoustic-articulatory inversion," Lecture Notes in Comput. Sci., vol. 4885/2007, pp. 263-272, 2007.
    • (2007) Lecture Notes in Comput. Sci. , vol.2007-4885 , pp. 263-272
    • Richmond, K.1
  • 98
    • 77956779481
    • A dynamical approach to gestural patterning in speech production
    • E. Saltzman and K. Munhall, "A dynamical approach to gestural patterning in speech production," Ecol. Psychol., vol. 1, no. 4, pp. 332-382, 1989.
    • (1989) Ecol. Psychol. , vol.1 , Issue.4 , pp. 332-382
    • Saltzman, E.1    Munhall, K.2
  • 99
    • 0024906981
    • Robust statistic modelling of systematic variabilities in continuous speech incorporating acoustic-articulatory relations
    • O. Schmidbauer, "Robust statistic modelling of systematic variabilities in continuous speech incorporating acoustic-articulatory relations," in Proc. ICASSP, 1989, pp. 616-619.
    • (1989) Proc. ICASSP , pp. 616-619
    • Schmidbauer, O.1
  • 101
    • 0008499181
    • Estimating articulatory motion from speech wave
    • K. Shirai and T. Kobayashi, "Estimating articulatory motion from speech wave," Speech Commun., vol. 5, pp. 159-170, 1986.
    • (1986) Speech Commun. , vol.5 , pp. 159-170
    • Shirai, K.1    Kobayashi, T.2
  • 102
    • 4043137356
    • A tutorial on support vector regression
    • A. Smola and B. Schölkopf, "A tutorial on support vector regression," Statist. Comput., vol. 14, no. 3, pp. 199-222, 2004.
    • (2004) Statist. Comput. , vol.14 , Issue.3 , pp. 199-222
    • Smola, A.1    Schölkopf, B.2
  • 103
    • 84939672029
    • Toward a model for speech recognition
    • K. N. Stevens, "Toward a model for speech recognition," J. Acoust. Soc. Amer., vol. 32, pp. 47-55, 1960.
    • (1960) J. Acoust. Soc. Amer. , vol.32 , pp. 47-55
    • Stevens, K.N.1
  • 104
    • 0008796094
    • Revisiting place of articulation measures for stop consonants: Implications for models of consonant production
    • K. N. Stevens, S. Manuel, and M. Matthies, "Revisiting place of articulation measures for stop consonants: Implications for models of consonant production," in Proc. Int. Cong. Phon. Sci., 1999, vol. 2, pp. 1117-1120.
    • (1999) Proc. Int. Cong. Phon. Sci. , vol.2 , pp. 1117-1120
    • Stevens, K.N.1    Manuel, S.2    Matthies, M.3
  • 106
    • 0036219864
    • Toward a model for lexical access based on acoustic landmarks and distinctive features
    • K. N. Stevens, "Toward a model for lexical access based on acoustic landmarks and distinctive features," J. Acoust. Soc. Amer., vol. 111, no. 4, pp. 1872-1891, 2002.
    • (2002) J. Acoust. Soc. Amer. , vol.111 , Issue.4 , pp. 1872-1891
    • Stevens, K.N.1
  • 107
    • 78649348268
    • Annotation and use of speech production corpus for building language-universal speech recognizers
    • Beijing, China Oct.
    • J. Sun and L. Deng, "Annotation and use of speech production corpus for building language-universal speech recognizers," in Proc. 2nd Int. Symp. Chinese Spoken Lang. Process. (ISCSLP), Beijing, China, Oct. 2000, vol. 3, pp. 31-34.
    • (2000) Proc. 2nd Int. Symp. Chinese Spoken Lang. Process. (ISCSLP) , vol.3 , pp. 31-34
    • Sun, J.1    Deng, L.2
  • 108
    • 0036165806
    • An overlapping-feature-based phonological model incorporating linguistic constraints: Applications to speech recognition
    • J. Sun and L. Deng, "An overlapping-feature-based phonological model incorporating linguistic constraints: Applications to speech recognition," J. Acoust. Soc. Amer., vol. 111, no. 2, pp. 1086-1101, 2002.
    • (2002) J. Acoust. Soc. Amer. , vol.111 , Issue.2 , pp. 1086-1101
    • Sun, J.1    Deng, L.2
  • 109
    • 57749193836
    • Voice conversion based on maximum likelihood estimation of speech parameter trajectory
    • Nov.
    • T. Toda, A. W. Black, and K. Tokuda, "Voice conversion based on maximum likelihood estimation of speech parameter trajectory," IEEE Trans. Audio, Speech, Lang. Process., vol. 15, no. 8, pp. 2222-2235, Nov. 2007.
    • (2007) IEEE Trans. Audio, Speech, Lang. Process. , vol.15 , Issue.8 , pp. 2222-2235
    • Toda, T.1    Black, A.W.2    Tokuda, K.3
  • 110
    • 0344443787
    • Joint state and parameter estimation for a target-directed nonlinear dynamic system model
    • Dec.
    • R. Togneri and L. Deng, "Joint state and parameter estimation for a target-directed nonlinear dynamic system model," IEEE Trans. Signal Process., vol. 51, no. 12, pp. 3061-3070, Dec. 2003.
    • (2003) IEEE Trans. Signal Process. , vol.51 , Issue.12 , pp. 3061-3070
    • Togneri, R.1    Deng, L.2
  • 111
    • 0033708106
    • Speech parameter generation algorithms for HMM-based speech synthesis
    • Jun.
    • K. Tokuda, T. Yoshimura, T. Masuko, T. Kobayashi, and T. Kitamura, "Speech parameter generation algorithms for HMM-based speech synthesis," in Proc. ICASSP, Jun. 2000, vol. 3, pp. 1315-1318.
    • (2000) Proc. ICASSP , vol.3 , pp. 1315-1318
    • Tokuda, K.1    Yoshimura, T.2    Masuko, T.3    Kobayashi, T.4    Kitamura, T.5
  • 112
    • 33745288610
    • A support vector approach to the acoustic-to-articulatory mapping
    • A. Toutios and K. Margaritis, "A support vector approach to the acoustic-to-articulatory mapping," in Proc. Interspeech, 2005, pp. 3221-3224.
    • (2005) Proc. Interspeech , pp. 3221-3224
    • Toutios, A.1    Margaritis, K.2
  • 113
    • 78649366518
    • Learning articulation from cepstral coefficients
    • A. Toutios and K. Margaritis, "Learning articulation from cepstral coefficients," in Proc. SPECOM, 2005.
    • (2005) Proc. SPECOM
    • Toutios, A.1    Margaritis, K.2
  • 116
    • 26444603266
    • A Dutch treatment of an elitist approach to articulatory-acoustic feature classification
    • M. Wester, S. Greenberg, and S. Chang, "A Dutch treatment of an elitist approach to articulatory-acoustic feature classification," in Proc. Eurospeech, 2001, pp. 1729-1732.
    • (2001) Proc. Eurospeech , pp. 1729-1732
    • Wester, M.1    Greenberg, S.2    Chang, S.3
  • 119
    • 78649366959
    • A probabilistic framework for word recognition using phonetic features
    • C. Windheuser, F. Bimbot, and P. Haffner, "A probabilistic framework for word recognition using phonetic features," in Proc. ICSLP, 1994, pp. 287-290.
    • (1994) Proc. ICSLP , pp. 287-290
    • Windheuser, C.1    Bimbot, F.2    Haffner, P.3
  • 121
    • 0028464701
    • A new neural network for articulatory speech recognition and its application to vowel identification
    • J. Zachs and T. R. Thomas, "A new neural network for articulatory speech recognition and its application to vowel identification," Comput. Speech, Lang., vol. 8, pp. 189-209, 1994.
    • (1994) Comput. Speech, Lang. , vol.8 , pp. 189-209
    • Zachs, J.1    Thomas, T.R.2
  • 122
    • 84867193584
    • The entropy of articulatory phonological code: Recognizing gestures from tract variables
    • X. Zhuang, H. Nam, M. Hasegawa-Johnson, L. Goldstein, and E. Saltzman, "The entropy of articulatory phonological code: Recognizing gestures from tract variables," in Proc. Interspeech, 2008, pp. 1489-1492.
    • (2008) Proc. Interspeech , pp. 1489-1492
    • Zhuang, X.1    Nam, H.2    Hasegawa-Johnson, M.3    Goldstein, L.4    Saltzman, E.5


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.