Volume 19, Issue 7, 2011, Pages 1913-1924

Articulatory information for noise robust speech recognition

Author keywords

Articulatory phonology; articulatory speech recognition; artificial neural networks (ANNs); noise robust speech recognition; speech inversion; task dynamic model; vocal tract variables

Indexed keywords

ARTICULATORY PHONOLOGY; ARTICULATORY SPEECH RECOGNITION; ARTIFICIAL NEURAL NETWORKS (ANNS); NOISE-ROBUST SPEECH RECOGNITION; SPEECH INVERSION; TASK DYNAMIC MODEL; VOCAL-TRACTS;

EID: 79960545035     PISSN: 15587916     EISSN: None     Source Type: Journal    
DOI: 10.1109/TASL.2010.2103058     Document Type: Article
Times cited: 56

References (67)
  • 1
    • 0345843991
    • Experiments with a non linear spectral subtractor (NSS), hidden Markov models and the projection, for robust speech recognition in cars
    • P. Lockwood and J. Boudy, "Experiments with a non linear spectral subtractor (NSS), hidden Markov models and the projection, for robust speech recognition in cars," in Proc. Eurospeech, 1991, pp. 79-82.
    • (1991) Proc. Eurospeech , pp. 79-82
    • Lockwood, P.1    Boudy, J.2
  • 2
    • 56249136428
    • Transforming binary uncertainties for robust speech recognition
    • S. Srinivasan and D. L. Wang, "Transforming binary uncertainties for robust speech recognition," IEEE Trans. Audio, Speech, Lang. Process., vol. 15, no. 7, pp. 2130-2140, Sep. 2007.
    • (2007) IEEE Trans. Audio, Speech, Lang. Process. , vol.15 , Issue.7 , pp. 2130-2140
    • Srinivasan, S.1    Wang, D.L.2
  • 4
    • 52949093125
    • Combined speech enhancement and auditory modelling for robust distributed speech recognition
    • R. Flynn and E. Jones, "Combined speech enhancement and auditory modelling for robust distributed speech recognition," Speech Commun., vol. 50, pp. 797-809, 2008.
    • (2008) Speech Commun. , vol.50 , pp. 797-809
    • Flynn, R.1    Jones, E.2
  • 6
  • 8
    • 4544286862
    • Entropy-based variable frame rate analysis of speech signals and its application to ASR
    • H. You, Q. Zhu, and A. Alwan, "Entropy-based variable frame rate analysis of speech signals and its application to ASR," in Proc. ICASSP, 2004, pp. 549-552.
    • (2004) Proc. ICASSP , pp. 549-552
    • You, H.1    Zhu, Q.2    Alwan, A.3
  • 9
    • 0031238095
    • A model of dynamic auditory perception and its application to robust word recognition
    • PII S1063667697063906
    • B. Strope and A. Alwan, "A model of dynamic auditory perception and its application to robust word recognition," IEEE Trans. Speech Audio Process., vol. 5, no. 5, pp. 451-464, Sep. 1997. (Pubitemid 127746017)
    • (1997) IEEE Transactions on Speech and Audio Processing , vol.5 , Issue.5 , pp. 451-464
    • Strope, B.1    Alwan, A.2
  • 12
    • 17344389852
    • Robust speech recognition in noisy environments: The 2001 IBM spin evaluation system
    • B. Kingsbury, G. Saon, L. Mangu, M. Padmanabhan, and R. Sarikaya, "Robust speech recognition in noisy environments: The 2001 IBM spin evaluation system," in Proc. ICASSP, 2002, vol. 1, pp. I-53-I-56.
    • (2002) Proc. ICASSP , vol.1
    • Kingsbury, B.1    Saon, G.2    Mangu, L.3    Padmanabhan, M.4    Sarikaya, R.5
  • 13
    • 0030245128
    • Robust continuous speech recognition using parallel model combination
    • PII S1063667696067120
    • M. J. F. Gales and S. J. Young, "Robust continuous speech recognition using parallel model combination," IEEE Trans. Speech Audio Process., vol. 4, no. 5, pp. 352-359, Sep. 1996. (Pubitemid 126753023)
    • (1996) IEEE Transactions on Speech and Audio Processing , vol.4 , Issue.5 , pp. 352-359
    • Gales, M.J.F.1    Young, S.J.2
  • 14
    • 0029288633
    • Maximum likelihood linear regression for speaker adaptation of continuous density hidden Markov models
    • C. Leggetter and P. Woodland, "Maximum likelihood linear regression for speaker adaptation of continuous density hidden Markov models," Comput., Speech Lang., vol. 9, pp. 171-185, 1995.
    • (1995) Comput., Speech Lang. , vol.9 , pp. 171-185
    • Leggetter, C.1    Woodland, P.2
  • 15
    • 0347899508
    • Piecewise-linear transformation-based HMM adaptation for noisy speech
    • Z. Zhang and S. Furui, "Piecewise-linear transformation-based HMM adaptation for noisy speech," Speech Commun., vol. 42, no. 1, pp. 43-58, Jan. 2004.
    • (2004) Speech Commun. , vol.42 , Issue.1 , pp. 43-58
    • Zhang, Z.1    Furui, S.2
  • 16
    • 0035342414
    • Robust automatic speech recognition with missing and unreliable acoustic data
    • DOI 10.1016/S0167-6393(00)00034-0, PII S0167639300000340
    • M. Cooke, P. Green, L. Josifovski, and A. Vizinho, "Robust automatic speech recognition with missing and unreliable acoustic data," Speech Commun., vol. 34, pp. 267-285, 2001. (Pubitemid 32284867)
    • (2001) Speech Communication , vol.34 , Issue.3 , pp. 267-285
    • Cooke, M.1    Green, P.2    Josifovski, L.3    Vizinho, A.4
  • 18
    • 0037841203
    • State based imputation of missing data for robust speech recognition and speech enhancement
    • L. Josifovski, M. Cooke, P. Green, and A. Vizinho, "State based imputation of missing data for robust speech recognition and speech enhancement," in Proc. Eurospeech, 1999, vol. 6, pp. 2833-2836.
    • (1999) Proc. Eurospeech , vol.6 , pp. 2833-2836
    • Josifovski, L.1    Cooke, M.2    Green, P.3    Vizinho, A.4
  • 20
    • 0036165806
    • An overlapping-feature-based phonological model incorporating linguistic constraints: Applications to speech recognition
    • DOI 10.1121/1.1420380
    • J. Sun and L. Deng, "An overlapping-feature-based phonological model incorporating linguistic constraints: Applications to speech recognition," J. Acoust. Soc. Amer., vol. 111, no. 2, pp. 1086-1101, Feb. 2002. (Pubitemid 34127489)
    • (2002) Journal of the Acoustical Society of America , vol.111 , Issue.2 , pp. 1086-1101
    • Sun, J.1    Deng, L.2
  • 22
    • 84939672029
    • Toward a model for speech recognition
    • K. N. Stevens, "Toward a model for speech recognition," J. Acoust. Soc. Amer., vol. 32, pp. 47-55, 1960.
    • (1960) J. Acoust. Soc. Amer. , vol.32 , pp. 47-55
    • Stevens, K.N.1
  • 23
    • 0001887625
    • Performing fine phonetic distinctions: Templates versus features
    • R. Cole, R. M. Stern, and M. J. Lasry, "Performing fine phonetic distinctions: Templates versus features," in Invariance and Variability of Speech Processes, J. S. Perkell and D. Klatt, Eds. Hillsdale, NJ: Lawrence Erlbaum Assoc., 1986, ch. 15, pp. 325-345.
    • (1986) Invariance and Variability of Speech Processes , pp. 325-345
    • Cole, R.1    Stern, R.M.2    Lasry, M.J.3    Perkell, J.S.4    Klatt, D.5
  • 24
    • 0020300423
    • Acoustic-phonetic analysis based on an articulatory model
    • B. Lochschmidt, "Acoustic-phonetic analysis based on an articulatory model," in Automatic Speech Analysis and Recognition, J. P. Hayton, Ed. Dordrecht, The Netherlands: D. Reidel, 1982, pp. 139-152.
    • (1982) Automatic Speech Analysis and Recognition , pp. 139-152
    • Lochschmidt, B.1
  • 25
    • 0017007706
    • Automatic detection and description of syllabic features in continuous speech
    • R. D. Mori, P. Laface, and E. Piccolo, "Automatic detection and description of syllabic features in continuous speech," IEEE Trans. Acoust., Speech, Signal Process., vol. ASSP-24, no. 5, pp. 365-379, Oct. 1976.
    • (1976) IEEE Trans. Acoust., Speech, Signal Process. , vol.ASSP-24 , Issue.5 , pp. 365-379
    • Mori, R.D.1    Laface, P.2    Piccolo, E.3
  • 27
    • 0026854213
    • A generalized hidden Markov model with state-conditioned trend functions of time for the speech signal
    • L. Deng, "A generalized hidden Markov model with state-conditioned trend functions of time for the speech signal," Signal Process., vol. 27, no. 1, pp. 65-78, 1992.
    • (1992) Signal Process. , vol.27 , Issue.1 , pp. 65-78
    • Deng, L.1
  • 28
    • 0028234947
    • A statistical approach to automatic speech recognition using the atomic speech units constructed from overlapping articulatory features
    • DOI 10.1121/1.409839
    • L. Deng and D. Sun, "A statistical approach to ASR using atomic units constructed from overlapping articulatory features," J. Acoust. Soc. Amer., vol. 95, pp. 2702-2719, 1994. (Pubitemid 24152864)
    • (1994) Journal of the Acoustical Society of America , vol.95 , Issue.5 , pp. 2702-2719
    • Deng, L.1    Sun, D.X.2
  • 29
    • 0027627252
    • Hidden Markov model representation of quantized articulatory features for speech recognition
    • DOI 10.1006/csla.1993.1014
    • K. Erler and L. Deng, "Hidden Markov model representation of quantized articulatory features for speech recognition," Comput., Speech Lang., vol. 7, pp. 265-282, 1993. (Pubitemid 23705305)
    • (1993) Computer Speech and Language , vol.7 , Issue.3 , pp. 265-282
    • Erler, K.1    Deng, L.2
  • 30
    • 0034297586
    • Detection of phonological features in continuous speech using neural networks
    • S. King and P. Taylor, "Detection of phonological features in continuous speech using neural networks," Comput. Speech Lang., vol. 14, no. 4, pp. 333-353, Oct. 2000.
    • (2000) Comput. Speech Lang. , vol.14 , Issue.4 , pp. 333-353
    • King, S.1    Taylor, P.2
  • 36
    • 0017968519
    • Inversion of articulatory-to-acoustic transformation in the vocal tract by a computer-sorting technique
    • B. S. Atal, J. J. Chang, M. V. Mathews, and J. W. Tukey, "Inversion of articulatory-to-acoustic transformation in the vocal tract by a computer sorting technique," J. Acoust. Soc. Amer., vol. 63, pp. 1535-1555, 1978. (Pubitemid 8346208)
    • (1978) Journal of the Acoustical Society of America , vol.63 , Issue.5 , pp. 1535-1555
    • Atal, B.S.1    Chang, J.J.2    Mathews, M.V.3    Tukey, J.W.4
  • 38
    • 0026675669
    • Inferring articulation and recognizing gestures from acoustics with a neural network trained on x-ray microbeam data
    • G. Papcun, J. Hochberg, T. R. Thomas, F. Laroche, J. Zachs, and S. Levy, "Inferring articulation and recognizing gestures from acoustics with a neural network trained on x-ray microbeam data," J. Acoust. Soc. Amer., vol. 92, no. 2, pp. 688-700, 1992.
    • (1992) J. Acoust. Soc. Amer. , vol.92 , Issue.2 , pp. 688-700
    • Papcun, G.1    Hochberg, J.2    Thomas, T.R.3    Laroche, F.4    Zachs, J.5    Levy, S.6
  • 41
    • 0010505818
    • Recovery of articulatory movements from acoustics with phonemic information
    • T. Okadome, S. Suzuki, and M. Honda, "Recovery of articulatory movements from acoustics with phonemic information," in Proc. 5th Seminar Speech Prod., Bavaria, Germany, 2000, pp. 229-232.
    • (2000) Proc. 5th Seminar Speech Prod. , pp. 229-232
    • Okadome, T.1    Suzuki, S.2    Honda, M.3
  • 42
    • 51449098747
    • An empirical investigation of the nonuniqueness in the acoustic-to-articulatory mapping
    • C. Qin and M. Á. Carreira-Perpiñán, "An empirical investigation of the nonuniqueness in the acoustic-to-articulatory mapping," in Proc. Interspeech, 2007, pp. 74-77.
    • (2007) Proc. Interspeech , pp. 74-77
    • Qin, C.1    Carreira-Perpiñán, M.Á.2
  • 43
    • 84867222549
    • The acoustic to articulation mapping: Non-Linear or non-unique?
    • D. Neiberg, G. Ananthakrishnan, and O. Engwall, "The acoustic to articulation mapping: Non-Linear or non-unique?," in Proc. Interspeech, 2008, pp. 1485-1488.
    • (2008) Proc. Interspeech , pp. 1485-1488
    • Neiberg, D.1    Ananthakrishnan, G.2    Engwall, O.3
  • 44
    • 58849145971
    • ASR-Articulatory speech recognition
    • J. Frankel and S. King, "ASR-Articulatory speech recognition," in Proc. Eurospeech, 2001, pp. 599-602.
    • (2001) Proc. Eurospeech , pp. 599-602
    • Frankel, J.1    King, S.2
  • 45
    • 84994254645
    • An automatic speech recognition system using neural networks and linear dynamic models to recover and model articulatory traces
    • J. Frankel, K. Richmond, S. King, and P. Taylor, "An automatic speech recognition system using neural networks and linear dynamic models to recover and model articulatory traces," in Proc. ICSLP, 2000, vol. 4, pp. 254-257.
    • (2000) Proc. ICSLP , vol.4 , pp. 254-257
    • Frankel, J.1    Richmond, K.2    King, S.3    Taylor, P.4
  • 46
    • 0001622923
    • On defining coarticulation
    • R. Daniloff and R. Hammarberg, "On defining coarticulation," J. Phon., vol. 1, pp. 239-248, 1973.
    • (1973) J. Phon. , vol.1 , pp. 239-248
    • Daniloff, R.1    Hammarberg, R.2
  • 47
    • 0000523613
    • Towards an articulatory phonology
    • C. P. Browman and L. Goldstein, "Towards an articulatory phonology," Phonol. Yearbook, vol. 85, pp. 219-252, 1986.
    • (1986) Phonol. Yearbook , vol.85 , pp. 219-252
    • Browman, C.P.1    Goldstein, L.2
  • 48
  • 49
    • 77956779481
    • A dynamical approach to gestural patterning in speech production
    • E. Saltzman and K. Munhall, "A dynamical approach to gestural patterning in speech production," Ecol. Psychol., vol. 1, no. 4, pp. 332-382, 1989.
    • (1989) Ecol. Psychol. , vol.1 , Issue.4 , pp. 332-382
    • Saltzman, E.1    Munhall, K.2
  • 50
    • 70349207706
    • TADA: An enhanced, portable task dynamics model in Matlab
    • H. Nam, L. Goldstein, E. Saltzman, and D. Byrd, "TADA: An enhanced, portable task dynamics model in Matlab," J. Acoust. Soc. Amer., vol. 115, no. 5, p. 2430, 2004.
    • (2004) J. Acoust. Soc. Amer. , vol.115 , Issue.5 , p. 2430
    • Nam, H.1    Goldstein, L.2    Saltzman, E.3    Byrd, D.4
  • 52
    • 0028375762
    • Recovering articulatory movement from formant frequency trajectories using task dynamics and a genetic algorithm: Preliminary model tests
    • R. S. McGowan, "Recovering articulatory movement from formant frequency trajectories using task dynamics and a genetic algorithm: Preliminary model tests," Speech Commun., vol. 14, no. 1, pp. 19-48, Feb. 1994.
    • (1994) Speech Commun. , vol.14 , Issue.1 , pp. 19-48
    • McGowan, R.S.1
  • 53
    • 78649390043
    • Retrieving tract variables from acoustics: A comparison of different machine learning strategies
    • V. Mitra, H. Nam, C. Espy-Wilson, E. Saltzman, and L. Goldstein, "Retrieving tract variables from acoustics: A comparison of different machine learning strategies," IEEE J. Sel. Topics Signal Process., vol. 4, no. 6, pp. 1027-1045, Dec. 2010.
    • (2010) IEEE J. Sel. Topics Signal Process. , vol.4 , Issue.6 , pp. 1027-1045
    • Mitra, V.1    Nam, H.2    Espy-Wilson, C.3    Saltzman, E.4    Goldstein, L.5
  • 55
    • 0036642567
    • Combining acoustic and articulatory feature information for robust speech recognition
    • DOI 10.1016/S0167-6393(01)00020-6, PII S0167639301000206
    • K. Kirchhoff, G. A. Fink, and G. Sagerer, "Combining acoustic and articulatory feature information for robust speech recognition," Speech Commun., vol. 37, no. 3-4, pp. 303-319, Jul. 2002. (Pubitemid 34524845)
    • (2002) Speech Communication , vol.37 , Issue.3-4 , pp. 303-319
    • Kirchhoff, K.1    Fink, G.A.2    Sagerer, G.3
  • 56
    • 0037697284
    • Hidden-articulator Markov models for speech recognition
    • M. Richardson, J. Bilmes, and C. Diorio, "Hidden-articulator Markov models for speech recognition," Speech Commun., vol. 41, no. 2-3, pp. 511-529, Oct. 2003.
    • (2003) Speech Commun. , vol.41 , Issue.2-3 , pp. 511-529
    • Richardson, M.1    Bilmes, J.2    Diorio, C.3
  • 57
    • 70450200298
    • Noise robustness of tract variables and their application to speech recognition
    • V. Mitra, H. Nam, C. Espy-Wilson, E. Saltzman, and L. Goldstein, "Noise robustness of tract variables and their application to speech recognition," in Proc. Interspeech, 2009, pp. 2759-2762.
    • (2009) Proc. Interspeech , pp. 2759-2762
    • Mitra, V.1    Nam, H.2    Espy-Wilson, C.3    Saltzman, E.4    Goldstein, L.5
  • 58
    • 0036711819
    • A quasiarticulatory approach to controlling acoustic source parameters in a Klatt-type formant synthesizer using HLsyn
    • DOI 10.1121/1.1498851
    • H. M. Hanson and K. N. Stevens, "A quasiarticulatory approach to controlling acoustic source parameters in a Klatt-type formant synthesizer using HLsyn," J. Acoust. Soc. Amer., vol. 112, no. 3, pp. 1158-1182, 2002. (Pubitemid 35006671)
    • (2002) Journal of the Acoustical Society of America , vol.112 , Issue.3 , pp. 1158-1182
    • Hanson, H.M.1    Stevens, K.N.2
  • 59
    • 0038669544
    • The aurora experimental framework for the performance evaluation of speech recognition systems under noisy conditions
    • D. Pearce and H. G. Hirsch, "The aurora experimental framework for the performance evaluation of speech recognition systems under noisy conditions," in Proc. Autom. Speech Recognition: Challenges For New Millenium, ASR-2000, Paris, France, 2000, pp. 181-188.
    • (2000) Proc. Autom. Speech Recognition: Challenges For New Millenium, ASR-2000 , pp. 181-188
    • Pearce, D.1    Hirsch, H.G.2
  • 64
    • 70450212952
    • A noise-type and level-dependent MPO-based speech enhancement architecture with variable frame analysis for noise-robust speech recognition
    • V. Mitra, B. J. Borgstrom, C. Espy-Wilson, and A. Alwan, "A noise-type and level-dependent MPO-based speech enhancement architecture with variable frame analysis for noise-robust speech recognition," in Proc. Interspeech, Brighton, U.K., 2009, pp. 2751-2754.
    • (2009) Proc. Interspeech , pp. 2751-2754
    • Mitra, V.1    Borgstrom, B.J.2    Espy-Wilson, C.3    Alwan, A.4
  • 65
    • 0021892216
    • Speech enhancement using a minimum mean square log-spectral amplitude estimator
    • Y. Ephraim and D. Malah, "Speech enhancement using a minimum mean square log-spectral amplitude estimator," IEEE Trans. Acoust., Speech, Signal Process., vol. ASSP-33, no. 2, pp. 443-445, Apr. 1985.
    • (1985) IEEE Trans. Acoust., Speech, Signal Process. , vol.ASSP-33 , Issue.2 , pp. 443-445
    • Ephraim, Y.1    Malah, D.2
  • 66
    • 77955810460
    • A study on the generalization capability of acoustic models for robust speech recognition
    • X. Xiao, J. Li, E. S. Chng, H. Li, and C. Lee, "A study on the generalization capability of acoustic models for robust speech recognition," IEEE Trans. Audio, Speech, Lang. Process., vol. 18, no. 6, pp. 1158-1169, Aug. 2009.
    • (2009) IEEE Trans. Audio, Speech, Lang. Process. , vol.18 , Issue.6 , pp. 1158-1169
    • Xiao, X.1    Li, J.2    Chng, E.S.3    Li, H.4    Lee, C.5
  • 67
    • 51849099743
    • A study of variable-parameter Gaussian mixture hidden Markov modeling for noisy speech recognition
    • X. Cui and Y. Gong, "A study of variable-parameter Gaussian mixture hidden Markov modeling for noisy speech recognition," IEEE Trans. Audio, Speech, Lang. Process., vol. 15, no. 4, pp. 1366-1376, May 2007.
    • (2007) IEEE Trans. Audio, Speech, Lang. Process. , vol.15 , Issue.4 , pp. 1366-1376
    • Cui, X.1    Gong, Y.2


* This information was extracted by KISTI through analysis of Elsevier's SCOPUS database.