SCOPUS Information Retrieval Platform

Journal of the Acoustical Society of America

Volume 131, Issue 3, 2012, Pages 2270-2287

Recognizing articulatory gestures from speech for robust speech recognition

(5) Mitra, Vikramjit a; Nam, Hosung b; Espy-Wilson, Carol c; Saltzman, Elliot d; Goldstein, Louis e

a SRI INTERNATIONAL (United States)

b HASKINS LABORATORIES (United States)

c UNIVERSITY OF MARYLAND (United States)

d BOSTON UNIVERSITY (United States)

e UNIVERSITY OF SOUTHERN CALIFORNIA (United States)

Author keywords

[No Author keywords available]

Indexed keywords

ARTICULATORY GESTURES; AUTOMATIC SPEECH RECOGNITION SYSTEM; DATA SETS; DIGIT RECOGNITION; DYNAMIC BAYESIAN NETWORK; DYNAMIC PARAMETERS; NATURAL SPEECH; RECOGNITION PERFORMANCE; RECOGNITION RATES; ROBUST SPEECH RECOGNITION; SPEECH RECOGNITION SYSTEMS; SPEECH SIGNALS; SYNTHETIC SPEECH; THREE STAGES; WORD RECOGNITION;

GESTURE RECOGNITION; NETWORK ARCHITECTURE; SPEECH SYNTHESIS; VOCABULARY CONTROL;

SPEECH RECOGNITION;

ARTICLE; AUTOMATIC SPEECH RECOGNITION; BAYES THEOREM; GESTURE; HUMAN; LINGUISTICS; PHONETICS; PHYSIOLOGY; SPEECH; SPEECH PERCEPTION;

BAYES THEOREM; GESTURES; HUMANS; PHONETICS; SPEECH; SPEECH ACOUSTICS; SPEECH PERCEPTION; SPEECH RECOGNITION SOFTWARE; VOCABULARY;

EID: 84858976368 PISSN: 00014966 EISSN: None Source Type: Journal
DOI: 10.1121/1.3682038 Document Type: Article

Times cited : (27)

References (63)

1
- 0020602364
- Efficient coding of LPC parameters by temporal decomposition
- Boston, MA
- Atal, B. S. (1983), Efficient coding of LPC parameters by temporal decomposition., Proceedings of ICASSP, Boston, MA, pp. 81-84.
- (1983) Proceedings of ICASSP , pp. 81-84
- Atal, B.S.¹

2
- 34547975052
- Scaling learning algorithms toward AI
- edited by L. Bottou, O. Chapelle, D. DeCoste, and J. Weston (MIT Press, Cambridge, MA)
- Bengio, Y., and Le Cun, Y. (2007), Scaling learning algorithms toward AI., in Large Scale Kernel Machines, edited by L. Bottou, O. Chapelle, D. DeCoste, and J. Weston (MIT Press, Cambridge, MA), pp. 321-360.
- (2007) Large Scale Kernel Machines , pp. 321-360
- Bengio, Y.¹ Le Cun, Y.²

3
- 0036293559
- The graphical models Toolkit: An open source software system for speech and time-series processing
- Orlando, FL
- Bilmes, J., and Zweig, G. (2002), The graphical models Toolkit: An open source software system for speech and time-series processing., Proceedings of ICASSP, Orlando, FL, Vol. 4, pp. 3916-3919.
- (2002) Proceedings of ICASSP , vol.4 , pp. 3916-3919
- Bilmes, J.¹ Zweig, G.²

4
- 84971737266
- Articulatory gestures as phonological units
- 10.1017/S0952675700001019
- Browman, C., and Goldstein, L. (1989), Articulatory gestures as phonological units., Phonology 6, 201-251. 10.1017/S0952675700001019
- (1989) Phonology , vol.6 , pp. 201-251
- Browman, C.¹ Goldstein, L.²

5
- 0027024362
- Articulatory phonology: An overview
- 10.1159/000261913
- Browman, C., and Goldstein, L. (1992), Articulatory phonology: An overview., Phonetica 49, 155-180. 10.1159/000261913
- (1992) Phonetica , vol.49 , pp. 155-180
- Browman, C.¹ Goldstein, L.²

6
- 42549139762
- MVA processing of speech features
- 10.1109/TASL.2006.876717
- Chen, C., and Bilmes, J. (2007), MVA processing of speech features., IEEE Trans. Audio Speech Lang. Processing 15 (1), 257-270. 10.1109/TASL.2006.876717
- (2007) IEEE Trans. Audio Speech Lang. Processing , vol.15 , Issue.1 , pp. 257-270
- Chen, C.¹ Bilmes, J.²

7
- 0004119259
- (MIT Press, Cambridge, MA)
- Chomsky, N., and Halle, M. (1968), The Sound Pattern of English (MIT Press, Cambridge, MA), 484 pp.
- (1968) The Sound Pattern of English , pp. 484
- Chomsky, N.¹ Halle, M.²

8
- 33750368310
- An audio-visual corpus for speech perception and automatic speech recognition
- DOI 10.1121/1.2229005
- Cooke, M., Barker, J., Cunningham, S., and Shao, X. (2006), An audio-visual corpus for speech perception and automatic speech recognition., J. Acoust. Soc. Am. 120, 2421-2424. 10.1121/1.2229005 (Pubitemid 44631681)
- (2006) Journal of the Acoustical Society of America , vol.120 , Issue.5 , pp. 2421-2424
- Cooke, M.¹ Barker, J.² Cunningham, S.³ Shao, X.⁴

9
- 27744539597
- Noise robust speech recognition using feature compensation based on polynomial regression of utterance SNR
- DOI 10.1109/TSA.2005.853002
- Cui, X., and Alwan, A. (2005), Noise robust speech recognition using feature compensation based on polynomial regression of utterance SNR., IEEE Trans. Speech Audio Processing 13 (6), 1161-1172. 10.1109/TSA.2005.853002 (Pubitemid 41605019)
- (2005) IEEE Transactions on Speech and Audio Processing , vol.13 , Issue.6 , pp. 1161-1172
- Cui, X.¹ Alwan, A.²

10
- 61449175648
- The motor somatotopy of speech perception
- 10.1016/j.cub.2009.01.017
- D'Ausilio, A., Pulvermüller, F., Salmas, P., Bufalari, I., Begliomini, C., and Fadiga, C. (2009), The motor somatotopy of speech perception., Curr. Biol. 19, 381-385. 10.1016/j.cub.2009.01.017
- (2009) Curr. Biol. , vol.19 , pp. 381-385
- D'Ausilio, A.¹ Pulvermüller, F.² Salmas, P.³ Bufalari, I.⁴ Begliomini, C.⁵ Fadiga, C.⁶

11
- 0003396255
- The MathWorks Inc., Natick, MA. (Last viewed June 28, 2010)
- Demuth, H., Beale, M., and Hagan, M. (2008), Neural Network Toolbox™ 6, User's Guide., The MathWorks Inc., Natick, MA. www.mathworks.com/access/helpdesk/help/pdf-doc/nnet/nnet.pdf (Last viewed June 28, 2010).
- (2008) Neural Network Toolbox™ 6, User's Guide
- Demuth, H.¹ Beale, M.² Hagan, M.³

12
- 0028234947
- A statistical approach to automatic speech recognition using the atomic speech units constructed from overlapping articulatory features
- DOI 10.1121/1.409839
- Deng, L., and Sun, D. (1994), A statistical approach to automatic speech recognition using atomic units constructed from overlapping articulatory features., J. Acoust. Soc. Am. 95 (5), 2702-2719. 10.1121/1.409839 (Pubitemid 24152864)
- (1994) Journal of the Acoustical Society of America , vol.95 , Issue.5 , pp. 2702-2719
- Deng, L.¹ Sun, D.X.²

13
- 27644525945
- Use of temporal information: Detection of periodicity, aperiodicity, and pitch in speech
- DOI 10.1109/TSA.2005.851910
- Deshmukh, O., Espy-Wilson, C., Salomon, A., and Singh, J. (2005), Use of temporal information: Detection of the periodicity and aperiodicity profile of speech., IEEE Trans. Speech Audio Process. 13 (5), 776-786. 10.1109/TSA.2005.851910 (Pubitemid 41558894)
- (2005) IEEE Transactions on Speech and Audio Processing , vol.13 , Issue.5 , pp. 776-786
- Deshmukh, O.¹ Espy-Wilson, C.Y.² Salomon, A.³ Singh, J.⁴

14
- 0442317754
- European Telecommunications Standards Institute-Advanced Front End , , ES 202 050 Ver. 1.1.5
- European Telecommunications Standards Institute-Advanced Front End (2007), Speech processing, transmission and quality aspects (STQ); Distributed speech recognition; Adv. frontend feature extraction algorithm; Compression algorithms., ES 202 050 Ver. 1.1.5
- (2007) Speech Processing, Transmission and Quality Aspects (STQ); Distributed Speech Recognition; Adv. Frontend Feature Extraction Algorithm; Compression Algorithms

15
- 52949093125
- Combined speech enhancement and auditory modelling for robust distributed speech recognition
- 10.1016/j.specom.2008.05.004
- Flynn, R., and Jones, E. (2008), Combined speech enhancement and auditory modelling for robust distributed speech recognition., Speech Comm. 50, 797-809. 10.1016/j.specom.2008.05.004
- (2008) Speech Comm. , vol.50 , pp. 797-809
- Flynn, R.¹ Jones, E.²

16
- 58849145971
- ASR-Articulatory speech recognition
- Aalborg, Denmark
- Frankel, J., and King, S. (2001), ASR-Articulatory speech recognition., Proceedings of Eurospeech, Aalborg, Denmark, pp. 599-602.
- (2001) Proceedings of Eurospeech , pp. 599-602
- Frankel, J.¹ King, S.²

17
- 33745225408
- A hybrid ANN/DBN approach to articulatory feature recognition
- Lisbon, Portugal
- Frankel, J., and King, S. (2005), A hybrid ANN/DBN approach to articulatory feature recognition., Proc. of Eurospeech, Interspeech, Lisbon, Portugal, pp. 3045-3048.
- (2005) Proc. of Eurospeech, Interspeech , pp. 3045-3048
- Frankel, J.¹ King, S.²

18
- 85009088992
- Articulatory feature recognition using dynamic Bayesian networks
- Jeju, Korea
- Frankel, J., Wester, M., and King, S. (2004), Articulatory feature recognition using dynamic Bayesian networks., Proc. of ICSLP, Jeju, Korea, pp. 1202-1205.
- (2004) Proc. of ICSLP , pp. 1202-1205
- Frankel, J.¹ Wester, M.² King, S.³

19
- 0002049440
- Learning Dynamic Bayesian Networks
- Adaptive Processing of Sequences and Data Structures
- Ghahramani, Z. (1998), Learning dynamic Bayesian networks., in Adaptive Processing of Temporal Information, edited by C. L. Giles and M. Gori (Springer-Verlag, Berlin), pp. 168-197. (Pubitemid 128056031)
- (1998) Lecture Notes in Computer Science , Issue.1387 , pp. 168-197
- Ghahramani, Z.¹

20
- 70450214496
- Estimation of articulatory gesture patterns from speech acoustics
- Brighton, UK
- Ghosh, P., Narayanan, S., Divenyi, P., Goldstein, L., and Saltzman, E. (2009), Estimation of articulatory gesture patterns from speech acoustics., Proceedings of Interspeech, Brighton, UK, pp. 2803-2806.
- (2009) Proceedings of Interspeech , pp. 2803-2806
- Ghosh, P.¹ Narayanan, S.² Divenyi, P.³ Goldstein, L.⁴ Saltzman, E.⁵

21
- 0024909979
- Some statistical issues in the comparison of speech recognition algorithms
- Gillick, L., and Cox, S. J. (1989), Some statistical issues in the comparison of speech recognition algorithms., Proceedings of ICASSP, pp. 532-535. (Pubitemid 20604171)
- (1989) ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings , vol.1 , pp. 532-535
- Gillick, L.¹ Cox, S.J.²

22
- 0036711819
- A quasiarticulatory approach to controlling acoustic source parameters in a Klatt-type formant synthesizer using HLsyn
- DOI 10.1121/1.1498851
- Hanson, H. M., and Stevens, K. N. (2002), A quasiarticulatory approach to controlling acoustic source parameters in a Klatt-type formant synthesizer using HLsyn., J. Acoust. Soc. Am. 112 (3), 1158-1182. 10.1121/1.1498851 (Pubitemid 35006671)
- (2002) Journal of the Acoustical Society of America , vol.112 , Issue.3 , pp. 1158-1182
- Hanson, H.M.¹ Stevens, K.N.²

23
- 0028517164
- RASTA processing of speech
- 10.1109/89.326616
- Hermansky, H., and Morgan, N. (1994), RASTA processing of speech., IEEE Trans. Speech Audio Process., 2, 578-589. 10.1109/89.326616
- (1994) IEEE Trans. Speech Audio Process. , vol.2 , pp. 578-589
- Hermansky, H.¹ Morgan, N.²

24
- 0033709098
- Tandem connectionist feature stream extraction for conventional HMM systems
- Istanbul, Turkey
- Hermansky, H., Ellis, D., and Sharma, S. (2000), Tandem connectionist feature stream extraction for conventional HMM systems., Proceedings of ICASSP, Istanbul, Turkey, pp. 1635-1638.
- (2000) Proceedings of ICASSP , pp. 1635-1638
- Hermansky, H.¹ Ellis, D.² Sharma, S.³

25
- 33846700692
- Tech. Report, LA-UR-96-3945 (Los Alamos National Laboratory, Los Alamos, NM)
- Hogden, J., Nix, D., and Valdez, P. (1998), An articulatorily constrained, maximum likelihood approach to speech recognition., Tech. Report, LA-UR-96-3945 (Los Alamos National Laboratory, Los Alamos, NM).
- (1998) An Articulatorily Constrained, Maximum Likelihood Approach to Speech Recognition
- Hogden, J.¹ Nix, D.² Valdez, P.³

26
- 79959812754
- FSM-based pronunciation modeling using articulatory phonological code
- Hu, C., Zhuang, X., and Hasegawa-Johnson, M. (2010), FSM-based pronunciation modeling using articulatory phonological code., Proceedings of Interspeech, pp. 2274-2277.
- (2010) Proceedings of Interspeech , pp. 2274-2277
- Hu, C.¹ Zhuang, X.² Hasegawa-Johnson, M.³

27
- 0004149277
- Preliminaries to speech analysis: The distinctive features and their correlates
- (MIT Press, Cambridge, MA)
- Jakobson, R., Fant, C. G. M., and Halle, M. (1952), Preliminaries to speech analysis: The distinctive features and their correlates., MIT Acoustics Laboratory Technical Report 13 (MIT Press, Cambridge, MA).
- (1952) MIT Acoustics Laboratory Technical Report 13
- Jakobson, R.¹ Fant, C.G.M.² Halle, M.³

28
- 44049116478
- Forward models-supervised learning with a distal teacher
- 10.1207/s15516709cog1603-1
- Jordan, M. I., and Rumelhart, D. E. (1992), Forward models-supervised learning with a distal teacher., Cogn. Sci. 16, 307-354. 10.1207/s15516709cog1603-1
- (1992) Cogn. Sci. , vol.16 , pp. 307-354
- Jordan, M.I.¹ Rumelhart, D.E.²

29
- 33750725541
- Ph.D. thesis, University of Maryland, College Park, MD
- Juneja, A., (2004), Speech recognition based on phonetic features and acoustic landmarks., Ph.D. thesis, University of Maryland, College Park, MD.
- (2004) Speech Recognition Based on Phonetic Features and Acoustic Landmarks
- Juneja, A.¹

30
- 0029753859
- Deriving gestural scores from articulator-movement records using weighted temporal decomposition
- 10.1109/TSA.1996.481448
- Jung, T. P., Krishnamurthy, A. K., Ahalt, S. C., Beckman, M. E., and Lee, S. H. (1996), Deriving gestural scores from articulator-movement records using weighted temporal decomposition., IEEE Trans. Speech Audio Process. 4 (1), 2-18. 10.1109/TSA.1996.481448
- (1996) IEEE Trans. Speech Audio Process. , vol.4 , Issue.1 , pp. 2-18
- Jung, T.P.¹ Krishnamurthy, A.K.² Ahalt, S.C.³ Beckman, M.E.⁴ Lee, S.H.⁵

31
- 0034853397
- What kind of pronunciation variation is hard for triphones to model?
- Jurafsky, D., Ward, W., Jianping, Z., Herold, K., Xiuyang, Y., and Sen, Z. (2001), What kind of pronunciation variation is hard for triphones to model?, Proceedings of ICASSP, Utah, Vol. 1, pp. 577-580. (Pubitemid 32839316)
- (2001) ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings , vol.1 , pp. 577-580
- Jurafsky, D.¹ Ward, W.² Zhang, J.³ Herold, K.⁴ Yu, X.⁵ Zhang, S.⁶

32
- 33846680938
- Speech production knowledge in automatic speech recognition
- DOI 10.1121/1.2404622
- King, S., Frankel, J., Livescu, K., McDermott, E., Richmond, K., and Wester, M. (2007), Speech production knowledge in automatic speech recognition., J. Acoust. Soc. Am. 121 (2), 723-742. 10.1121/1.2404622 (Pubitemid 46192674)
- (2007) Journal of the Acoustical Society of America , vol.121 , Issue.2 , pp. 723-742
- King, S.¹ Frankel, J.² Livescu, K.³ McDermott, E.⁴ Richmond, K.⁵ Wester, M.⁶

33
- 0003424928
- Ph.D. Thesis, University of Bielefeld, Germany
- Kirchhoff, K. (1999), Robust speech recognition using articulatory information., Ph.D. Thesis, University of Bielefeld, Germany.
- (1999) Robust Speech Recognition Using Articulatory Information
- Kirchhoff, K.¹

34
- 82955170227
- Technical Report LA-UR-88-418, Los Alamos National Laboratory, Los Alamos, NM
- Lapedes, A., and Farber, R. (1988), How neural networks work., Technical Report LA-UR-88-418, Los Alamos National Laboratory, Los Alamos, NM.
- (1988) How Neural Networks Work
- Lapedes, A.¹ Farber, R.²

35
- 0031187171
- Speech recognition by machines and humans
- PII S0167639397000216
- Lippmann, R. (1997), Speech recognition by machines and humans., Speech Comm. 22, 1-15. 10.1016/S0167-6393(97)00021-6 (Pubitemid 127403436)
- (1997) Speech Communication , vol.22 , Issue.1 , pp. 1-15
- Lippmann, R.P.¹

36
- 29444436962
- Integration of articulatory and spectrum features based on the hybrid HMM/BN modeling framework
- DOI 10.1016/j.specom.2005.07.003, PII S0167639305001731
- Markov, K., Dang, J., and Nakamura, S. (2006), Integration of articulatory and spectrum features based on the hybrid HMM/BN modeling framework., Speech Comm. 48, 161-175. 10.1016/j.specom.2005.07.003 (Pubitemid 43012029)
- (2006) Speech Communication , vol.48 , Issue.2 , pp. 161-175
- Markov, K.¹ Dang, J.² Nakamura, S.³

37
- 0028375762
- Recovering articulatory movement from formant frequency trajectories using task dynamics and a genetic algorithm: Preliminary model tests
- 10.1016/0167-6393(94)90055-8
- McGowan, R. S. (1994), Recovering articulatory movement from formant frequency trajectories using task dynamics and a genetic algorithm: Preliminary model tests., Speech Comm. 14 (1), 19-48. 10.1016/0167-6393(94)90055-8
- (1994) Speech Comm. , vol.14 , Issue.1 , pp. 19-48
- McGowan, R.S.¹

38
- 34848837678
- The Essential Role of Premotor Cortex in Speech Perception
- DOI 10.1016/j.cub.2007.08.064, PII S0960982207019690
- Meister, I. G., Wilson, S. M., Deblieck, C., Wu, A. D., and Iacoboni, M. (2007), The essential role of premotor cortex in speech perception., Curr. Biol. 17, 1692-1696. 10.1016/j.cub.2007.08.064 (Pubitemid 47503812)
- (2007) Current Biology , vol.17 , Issue.19 , pp. 1692-1696
- Meister, I.G.¹ Wilson, S.M.² Deblieck, C.³ Wu, A.D.⁴ Iacoboni, M.⁵

39
- 70349207706
- TaDA: An enhanced, portable task dynamics model in MATLAB
- Nam, H., Goldstein, L., Saltzman, E., and Byrd, D. (2004), TaDA: An enhanced, portable task dynamics model in MATLAB., J. Acoust. Soc. Am. 115 (5), pp. 2430.
- (2004) J. Acoust. Soc. Am. , vol.115 , Issue.5 , pp. 2430
- Nam, H.¹ Goldstein, L.² Saltzman, E.³ Byrd, D.⁴

40
- 79959846806
- A procedure for estimating gestural scores from natural speech
- Makuhari, Japan
- Nam, H., Mitra, V., Tiede, M., Saltzman, E., Goldstein, L., Espy-Wilson, C., and Hasegawa-Johnson, M. (2010), A procedure for estimating gestural scores from natural speech., Proceedings of Interspeech, Makuhari, Japan, pp. 30-33.
- (2010) Proceedings of Interspeech , pp. 30-33
- Nam, H.¹ Mitra, V.² Tiede, M.³ Saltzman, E.⁴ Goldstein, L.⁵ Espy-Wilson, C.⁶ Hasegawa-Johnson, M.⁷

41
- 78649390043
- Retrieving tract variables from acoustics: A comparison of different machine learning strategies
- 10.1109/JSTSP.2010.2076013
- Mitra, V., Nam, H., Espy-Wilson, C., Saltzman, E., and Goldstein, L. (2010), Retrieving tract variables from acoustics: A comparison of different machine learning strategies., IEEE J. Selected Topics Signal Process. 4, 1027-1045. 10.1109/JSTSP.2010.2076013
- (2010) IEEE J. Selected Topics Signal Process. , vol.4 , pp. 1027-1045
- Mitra, V.¹ Nam, H.² Espy-Wilson, C.³ Saltzman, E.⁴ Goldstein, L.⁵

42
- 80051617129
- Speech inversion: Benefits of tract variables over pellet trajectories
- Prague, Czech Republic
- Mitra, V., Nam, H., Espy-Wilson, C., Saltzman, E., and Goldstein, L. (2011), Speech inversion: Benefits of tract variables over pellet trajectories., Proceedings of International Conference on Acoustics, Speech and Signal Processing, Prague, Czech Republic, pp. 5188-5191.
- (2011) Proceedings of International Conference on Acoustics, Speech and Signal Processing , pp. 5188-5191
- Mitra, V.¹ Nam, H.² Espy-Wilson, C.³ Saltzman, E.⁴ Goldstein, L.⁵

43
- 70349213974
- From acoustics to vocal tract time functions
- Mitra, V., Özbek, I., Nam, H., Zhou, X., and Espy-Wilson, C. (2009), From acoustics to vocal tract time functions., Proceedings of ICASSP, pp. 4497-4500.
- (2009) Proceedings of ICASSP , pp. 4497-4500
- Mitra, V.¹ Özbek, I.² Nam, H.³ Zhou, X.⁴ Espy-Wilson, C.⁵

44
- 84867222549
- The acoustic to articulation mapping: Non-linear or non-unique?
- Brisbane, Australia
- Neiberg, D., Ananthakrishnan, G., and Engwall, O. (2008), The acoustic to articulation mapping: Non-linear or non-unique?, Proceedings of Interspeech, Brisbane, Australia, pp. 1485-1488.
- (2008) Proceedings of Interspeech , pp. 1485-1488
- Neiberg, D.¹ Ananthakrishnan, G.² Engwall, O.³

45
- 4544293504
- Moving beyond the 'beads-on-a-string' model of speech
- Colorado
- Ostendorf, M. (1999), Moving beyond the 'beads-on-a-string' model of speech., in Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, Colorado, Vol. 1, pp. 79-83.
- (1999) Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop , vol.1 , pp. 79-83
- Ostendorf, M.¹

46
- 84987702417
- The aurora experimental framework for the performance evaluation of speech recognition systems under noisy conditions
- Beijing, China
- Pearce, D., and Hirsch, H. G. (2000), The aurora experimental framework for the performance evaluation of speech recognition systems under noisy conditions., Proceedings of ICSLP, ASR, Beijing, China, pp. 181-188.
- (2000) Proceedings of ICSLP, ASR , pp. 181-188
- Pearce, D.¹ Hirsch, H.G.²

47
- 85051666064
- Ph.D. Thesis, University of Maryland, College Park, MD
- Pruthi, T. (2007), Analysis, vocal-tract modeling and automatic detection of vowel nasalization., Ph.D. Thesis, University of Maryland, College Park, MD.
- (2007) Analysis, Vocal-tract Modeling and Automatic Detection of Vowel Nasalization
- Pruthi, T.¹

48
- 51449098747
- An empirical investigation of the nonuniqueness in the acoustic-to-articulatory mapping
- Antwerp, Belgium
- Qin, C., and Carreira-Perpiñán, M. (2007), An empirical investigation of the nonuniqueness in the acoustic-to-articulatory mapping., Proceedings of Interspeech, Antwerp, Belgium, pp. 74-77.
- (2007) Proceedings of Interspeech , pp. 74-77
- Qin, C.¹ Carreira-Perpiñán, M.²

49
- 70449345905
- Analysis of pausing behavior in spontaneous speech using real-time magnetic resonance imaging of articulation
- 10.1121/1.3213452
- Ramanarayanan, V., Bresch, E., Byrd, D., Goldstein, L., and Narayanan, S. (2009), Analysis of pausing behavior in spontaneous speech using real-time magnetic resonance imaging of articulation., J. Acoust. Soc. Am. 126 (5), EL160-EL165. 10.1121/1.3213452
- (2009) J. Acoust. Soc. Am. , vol.126 , Issue.5
- Ramanarayanan, V.¹ Bresch, E.² Byrd, D.³ Goldstein, L.⁴ Narayanan, S.⁵

50
- 0037697284
- Hidden-articulator Markov models for speech recognition
- 10.1016/S0167-6393(03)00031-1
- Richardson, M., Bilmes, J., and Diorio, C. (2003), Hidden-articulator Markov models for speech recognition., Speech Comm. 41 (2-3), 511-529. 10.1016/S0167-6393(03)00031-1
- (2003) Speech Comm. , vol.41 , Issue.2-3 , pp. 511-529
- Richardson, M.¹ Bilmes, J.² Diorio, C.³

51
- 4243714433
- Ph.D. Thesis, University of Edinburgh, UK
- Richmond, K. (2001), Estimating articulatory parameters from the acoustic speech signal., Ph.D. Thesis, University of Edinburgh, UK.
- (2001) Estimating Articulatory Parameters from the Acoustic Speech Signal
- Richmond, K.¹

52
- 67650105018
- Trajectory mixture density network with multiple mixtures for acoustic-articulatory inversion
- Paris, France
- Richmond, K. (2007), Trajectory mixture density network with multiple mixtures for acoustic-articulatory inversion., ITRW on Non-Linear Speech Processing, NOLISP-07, Paris, France, pp. 67-70.
- (2007) ITRW on Non-Linear Speech Processing, NOLISP-07 , pp. 67-70
- Richmond, K.¹

53
- 0019606728
- An articulatory synthesizer for perceptual research
- DOI 10.1121/1.386780
- Rubin, P. E., Baer, T., and Mermelstein, P. (1981), An articulatory synthesizer for perceptual research., J. Acoust. Soc. Am. 70, 321-328. 10.1121/1.386780 (Pubitemid 11011801)
- (1981) Journal of the Acoustical Society of America , vol.70 , Issue.2 , pp. 321-328
- Rubin, P.¹ Baer, T.² Mermelstein, P.³

54
- 77956779481
- A dynamical approach to gestural patterning in speech production
- 10.1207/s15326969eco0104-2
- Saltzman, E., and Munhall, K. (1989), A dynamical approach to gestural patterning in speech production., Ecol. Psychol. 1 (4), 332-382. 10.1207/s15326969eco0104-2
- (1989) Ecol. Psychol. , vol.1 , Issue.4 , pp. 332-382
- Saltzman, E.¹ Munhall, K.²

55
- 0024906981
- Robust statistic modelling of systematic variabilities in continuous speech incorporating acoustic-articulatory relations
- Schmidbauer, O. (1989), Robust statistic modelling of systematic variabilities in continuous speech incorporating acoustic-articulatory relations., Proceedings of ICASSP, pp. 616-619. (Pubitemid 20604192)
- (1989) ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings , vol.1 , pp. 616-619
- Schmidbauer Otto¹

56
- 84939672029
- Toward a model for speech recognition
- 10.1121/1.1907874
- Stevens, K. (1960), Toward a model for speech recognition., J. Acoust. Soc. Am. 32, 47-55. 10.1121/1.1907874
- (1960) J. Acoust. Soc. Am. , vol.32 , pp. 47-55
- Stevens, K.¹

57
- 0036165806
- An overlapping-feature-based phonological model incorporating linguistic constraints: Applications to speech recognition
- DOI 10.1121/1.1420380
- Sun, J. P., and Deng, L. (2002), An overlapping-feature-based phonological model incorporating linguistic constraints: Applications to speech recognition., J. Acoust. Soc. Am. 111 (2), 1086-1101. 10.1121/1.1420380 (Pubitemid 34127489)
- (2002) Journal of the Acoustical Society of America , vol.111 , Issue.2 , pp. 1086-1101
- Sun, J.¹ Deng, L.²

58
- 33745288610
- A support vector approach to the acoustic-to-articulatory mapping
- Lisbon, Portugal
- Toutios, A., and Margaritis, K. (2005), A support vector approach to the acoustic-to-articulatory mapping., Proceedings of Interspeech, Lisbon, Portugal, pp. 3221-3224.
- (2005) Proceedings of Interspeech , pp. 3221-3224
- Toutios, A.¹ Margaritis, K.²

59
- 0033097443
- Single channel speech enhancement based on masking properties of the human auditory system
- 10.1109/89.748118
- Virag, N. (1999), Single channel speech enhancement based on masking properties of the human auditory system., IEEE Trans. Speech Audio Process. 7 (2), 126-137. 10.1109/89.748118
- (1999) IEEE Trans. Speech Audio Process. , vol.7 , Issue.2 , pp. 126-137
- Virag, N.¹

60
- 0003652255
- University of Wisconsin, Madison
- Westbury, J. (1994), X-ray microbeam speech production database user's handbook., University of Wisconsin, Madison.
- (1994) X-ray Microbeam Speech Production Database User's Handbook
- Westbury, J.¹

61
- 0037503670
- A multichannel articulatory database and its application for automatic speech recognition
- Bavaria, Germany
- Wrench, A. A., and Hardcastle, W. J. (2000), A multichannel articulatory database and its application for automatic speech recognition., in 5th Seminar on Speech Production: Models and Data, Bavaria, Germany, pp. 305-308.
- (2000) 5th Seminar on Speech Production: Models and Data , pp. 305-308
- Wrench, A.A.¹ Hardcastle, W.J.²

62
- 77955810460
- A study on the generalization capability of acoustic models for robust speech recognition
- 10.1109/TASL.2009.2031236
- Xiao, X., Li, J., Chng, E. S., Li, H., and Lee, C. (2010), A study on the generalization capability of acoustic models for robust speech recognition., IEEE Trans. Audio Speech Lang. Process. 18 (6), 1158-1169. 10.1109/TASL.2009.2031236
- (2010) IEEE Trans. Audio Speech Lang. Process. , vol.18 , Issue.6 , pp. 1158-1169
- Xiao, X.¹ Li, J.² Chng, E.S.³ Li, H.⁴ Lee, C.⁵

63
- 70450174439
- Articulatory phonological code for word classification
- Brighton, UK
- Zhuang, X., Nam, H., Hasegawa-Johnson, M., Goldstein, L., and Saltzman, E. (2009), Articulatory phonological code for word classification., Proceedings of Interspeech, Brighton, UK, pp. 2763-2766.
- (2009) Proceedings of Interspeech , pp. 2763-2766
- Zhuang, X.¹ Nam, H.² Hasegawa-Johnson, M.³ Goldstein, L.⁴ Saltzman, E.⁵

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.