SCOPUS Information Retrieval Platform

ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings

Volume , Issue , 2014, Pages 3017-3021

Articulatory features from deep neural networks and their role in speech recognition

(5) Mitra, Vikramjit (a); Sivaraman, Ganesh (b); Nam, Hosung (c); Espy-Wilson, Carol (b); Saltzman, Elliot (c, d)

a SRI INTERNATIONAL (United States)

b UNIVERSITY OF MARYLAND (United States)

c HASKINS LABORATORIES (United States)

d BOSTON UNIVERSITY (United States)

Author keywords

articulatory trajectories; automatic speech recognition; deep neural networks; vocal tract variables

Indexed keywords

CONTINUOUS SPEECH RECOGNITION; SIGNAL PROCESSING; TRAJECTORIES;

ARTICULATORY FEATURES; ARTICULATORY INFORMATIONS; AUTOMATIC SPEECH RECOGNITION; BASE-LINE PERFORMANCE; DEEP NEURAL NETWORKS; SPEECH RECOGNITION PERFORMANCE; TRAJECTORY ESTIMATION; VOCAL-TRACTS;

SPEECH;

EID: 84905234271 PISSN: 15206149 EISSN: None Source Type: Conference Proceeding
DOI: 10.1109/ICASSP.2014.6854154 Document Type: Conference Paper

Times cited : (45)

References (30)

1
- 0001622923
- On defining coarticulation
- R. Daniloff and R. Hammarberg, "On defining coarticulation", J. of Phonetics, Vol.1, pp. 239-248, 1973.
- (1973) J. of Phonetics , vol.1 , pp. 239-248
- Daniloff, R.¹ Hammarberg, R.²

2
- 84939672029
- Toward a model for speech recognition
- K. N. Stevens, "Toward a model for speech recognition", J. of Acoust. Soc. Am., Vol.32, pp. 47-55, 1960.
- (1960) J. of Acoust. Soc. Am. , vol.32 , pp. 47-55
- Stevens, K.N.¹

3
- 0003424928
- PhD Thesis, University of Bielefeld
- K. Kirchhoff, Robust Speech Recognition Using Articulatory Information, PhD Thesis, University of Bielefeld, 1999.
- (1999) Robust Speech Recognition Using Articulatory Information
- Kirchhoff, K.¹

4
- 58849145971
- ASR-articulatory speech recognition
- Denmark
- J. Frankel and S. King, "ASR-Articulatory Speech Recognition", Proc. of Eurospeech, pp. 599-602, Denmark, 2001.
- (2001) Proc. of Eurospeech , pp. 599-602
- Frankel, J.¹ King, S.²

5
- 0028234947
- A statistical approach to automatic speech recognition using atomic units constructed from overlapping articulatory features
- L. Deng and D. Sun, "A statistical approach to automatic speech recognition using atomic units constructed from overlapping articulatory features", J. of Acoust. Soc. Am., 95(5), pp. 2702-2719, 1994.
- (1994) J. of Acoust. Soc. Am. , vol.95 , Issue.5 , pp. 2702-2719
- Deng, L.¹ Sun, D.²

6
- 33846680938
- Speech production knowledge in automatic speech recognition
- S. King, J. Frankel, K. Livescu, E. McDermott, K. Richmond and M. Wester, "Speech production knowledge in automatic speech recognition", J. Acoust. Soc. of Am., 121(2), pp. 723-742, 2007.
- (2007) J. Acoust. Soc. of Am. , vol.121 , Issue.2 , pp. 723-742
- King, S.¹ Frankel, J.² Livescu, K.³ McDermott, E.⁴ Richmond, K.⁵ Wester, M.⁶

7
- 34547541459
- Articulatory feature-based methods for acoustic and audio-visual speech recognition: Summary from the 2006 JHU Summer Workshop
- K. Livescu, O. Cetin, M. Hasegawa-Johnson, S. King, C. Bartels, N. Borges, A. Kantor, P. Lal, L. Yung, A. Bezman, S. Dawson-Haggerty, B. Woods, J. Frankel, M. Magimai-Doss and K. Saenko, "Articulatory feature-based methods for acoustic and audio-visual speech recognition: Summary from the 2006 JHU Summer Workshop", Proc. of ICASSP, Vol.4, pp.621-624, 2007.
- (2007) Proc. of ICASSP , vol.4 , pp. 621-624
- Livescu, K.¹ Cetin, O.² Hasegawa-Johnson, M.³ King, S.⁴ Bartels, C.⁵ Borges, N.⁶ Kantor, A.⁷ Lal, P.⁸ Yung, L.⁹ Bezman, A.¹⁰ Dawson-Haggerty, S.¹¹ Woods, B.¹² Frankel, J.¹³ Magimai-Doss, M.¹⁴ Saenko, K.¹⁵

8
- 0037697284
- Hidden-articulator Markov models for speech recognition
- M. Richardson, J. Bilmes and C. Diorio, "Hidden-articulator Markov models for speech recognition", Speech Comm., 41(2-3), pp. 511-529, 2003.
- (2003) Speech Comm. , vol.41 , Issue.2-3 , pp. 511-529
- Richardson, M.¹ Bilmes, J.² Diorio, C.³

9
- 79960545035
- Articulatory information for noise robust speech recognition
- Iss. 7
- V. Mitra, H. Nam, C. Espy-Wilson, E. Saltzman and L. Goldstein, "Articulatory information for noise robust speech recognition", IEEE Trans. on Audio, Speech and Language Processing, Vol. 19, Iss. 7, pp. 1913-1924, 2010.
- (2010) IEEE Trans. on Audio, Speech and Language Processing , vol.19 , pp. 1913-1924
- Mitra, V.¹ Nam, H.² Espy-Wilson, C.³ Saltzman, E.⁴ Goldstein, L.⁵

10
- 84858964876
- Robust speech recognition using articulatory gestures in a Dynamic Bayesian Network framework
- Hawaii
- V. Mitra, H. Nam and C. Espy-Wilson, "Robust speech recognition using articulatory gestures in a Dynamic Bayesian Network framework", Proc. of Automatic Speech Recognition & Understanding Workshop, ASRU, pp. 131-136, Hawaii, 2011.
- (2011) Proc. of Automatic Speech Recognition & Understanding Workshop, ASRU , pp. 131-136
- Mitra, V.¹ Nam, H.² Espy-Wilson, C.³

11
- 84890508727
- Articulatory features for large vocabulary speech recognition
- Vancouver, May
- V. Mitra, W. Wang, A. Stolcke, H. Nam, C. Richey, J. Juan, and M. Liberman, "Articulatory features for large vocabulary speech recognition," in Proc. IEEE ICASSP, Vancouver, May 2013.
- (2013) Proc. IEEE ICASSP
- Mitra, V.¹ Wang, W.² Stolcke, A.³ Nam, H.⁴ Richey, C.⁵ Juan, J.⁶ Liberman, M.⁷

12
- 84874282835
- A deep neural network for acoustic-articulatory speech inversion
- B. Uria, S. Renals, and K. Richmond, "A Deep Neural Network for Acoustic-Articulatory Speech Inversion", in NIPS 2011 Workshop on Deep Learning and Unsupervised Feature Learning, 2011.
- (2011) NIPS 2011 Workshop on Deep Learning and Unsupervised Feature Learning
- Uria, B.¹ Renals, S.² Richmond, K.³

13
- 84906219170
- Relevance-weighted reconstruction of articulatory features in deep neural network-based acoustic-to-articulatory mapping
- Canevari, C., Badino, L., Fadiga, L., Metta, G., "Relevance-weighted reconstruction of articulatory features in Deep Neural Network-based Acoustic-to-Articulatory Mapping", in Proc. of Interspeech, 2013.
- (2013) Proc. of Interspeech
- Canevari, C.¹ Badino, L.² Fadiga, L.³ Metta, G.⁴

14
- 80051649631
- Gesture-based dynamic bayesian network for noise robust speech recognition
- V. Mitra, H. Nam, C. Espy-Wilson, E. Saltzman, L. Goldstein, "Gesture-based Dynamic Bayesian Network for Noise robust Speech Recognition," in Proc. of ICASSP, pp. 5172-5175, 2011.
- (2011) Proc. of ICASSP , pp. 5172-5175
- Mitra, V.¹ Nam, H.² Espy-Wilson, C.³ Saltzman, E.⁴ Goldstein, L.⁵

15
- 0028234947
- A statistical approach to ASR using atomic units constructed from overlapping articulatory features
- L. Deng and D. Sun, "A statistical approach to ASR using atomic units constructed from overlapping articulatory features", J. of Acoust. Soc. Am., 95, pp. 2702-2719, 1994.
- (1994) J. of Acoust. Soc. Am. , vol.95 , pp. 2702-2719
- Deng, L.¹ Sun, D.²

16
- 0027627252
- Hidden Markov model representation of quantized articulatory features for speech recognition
- K. Erler and L. Deng, "Hidden Markov model representation of quantized articulatory features for speech recognition", Comp., Speech & Lang., Vol. 7, pp. 265-282, 1993.
- (1993) Comp., Speech & Lang. , vol.7 , pp. 265-282
- Erler, K.¹ Deng, L.²

17
- 70349207706
- Tada: An enhanced, portable task dynamics model in Matlab
- H. Nam, L. Goldstein, E. Saltzman and D. Byrd, "Tada: An enhanced, portable task dynamics model in Matlab", J. of Acoust. Soc. Am., 115(5), pp. 2430, 2004.
- (2004) J. of Acoust. Soc. Am. , vol.115 , Issue.5 , pp. 2430
- Nam, H.¹ Goldstein, L.² Saltzman, E.³ Byrd, D.⁴

18
- 0036711819
- A quasiarticulatory approach to controlling acoustic source parameters in a Klatt-type formant synthesizer using HLsyn
- H. M. Hanson and K. N. Stevens, "A quasiarticulatory approach to controlling acoustic source parameters in a Klatt-type formant synthesizer using HLsyn", J. of Acoust. Soc. Am., 112(3), pp. 1158-1182, 2002.
- (2002) J. of Acoust. Soc. Am. , vol.112 , Issue.3 , pp. 1158-1182
- Hanson, H.M.¹ Stevens, K.N.²

19
- 84955548400
- Towards an articulatory phonology
- C. P. Browman and L. Goldstein, "Towards an Articulatory Phonology", Phonology Yearbook, 85, pp. 219-252, 1986.
- (1986) Phonology Yearbook , vol.85 , pp. 219-252
- Browman, C.P.¹ Goldstein, L.²

20
- 0027024362
- Articulatory phonology: An overview
- C. P. Browman and L. Goldstein, "Articulatory Phonology: An Overview", Phonetica, 49, pp. 155-180, 1992.
- (1992) Phonetica , vol.49 , pp. 155-180
- Browman, C.P.¹ Goldstein, L.²

21
- 77956779481
- A dynamical approach to gestural patterning in speech production
- E. Saltzman and K. Munhall, "A Dynamical Approach to Gestural Patterning in Speech Production", Ecological Psychology, 1(4), pp. 332-382, 1989.
- (1989) Ecological Psychology , vol.1 , Issue.4 , pp. 332-382
- Saltzman, E.¹ Munhall, K.²

22
- 84905246750
- http://www.speech.cs.cmu.edu/cgi-bin/cmudict

23
- 33646677283
- Experimental framework for the performance evaluation of speech recognition front-ends on a large vocabulary task
- June 4
- G. Hirsch, "Experimental framework for the performance evaluation of speech recognition front-ends on a large vocabulary task", ETSI STQ-Aurora DSR Working Group, June 4, 2001.
- (2001) ETSI STQ-Aurora DSR Working Group
- Hirsch, G.¹

24
- 0028517164
- RASTA processing of speech
- H. Hermansky and N. Morgan, "RASTA processing of speech," IEEE Trans. Speech Audio Proc., vol.2, pp.578-589, 1994.
- (1994) IEEE Trans. Speech Audio Proc. , vol.2 , pp. 578-589
- Hermansky, H.¹ Morgan, N.²

25
- 84867589420
- Normalized amplitude modulation features for large vocabulary noise-robust speech recognition
- V. Mitra, H. Franco, M. Graciarena and A. Mandal, "Normalized amplitude modulation features for large vocabulary noise-robust speech recognition", Proc. IEEE ICASSP, pp. 4117-4120, 2012.
- (2012) Proc. IEEE ICASSP , pp. 4117-4120
- Mitra, V.¹ Franco, H.² Graciarena, M.³ Mandal, A.⁴

26
- 84906260861
- Damped oscillator cepstral coefficients for robust speech recognition
- V. Mitra, H. Franco, M. Graciarena, "Damped Oscillator Cepstral Coefficients for Robust Speech Recognition," Proc. Interspeech, pp. 886-890, 2013.
- (2013) Proc. Interspeech , pp. 886-890
- Mitra, V.¹ Franco, H.² Graciarena, M.³

27
- 78649390043
- Retrieving tract variables from acoustics: A comparison of different machine learning strategies
- V. Mitra, H. Nam, C. Espy-Wilson, E. Saltzman and L. Goldstein, "Retrieving tract variables from acoustics: A comparison of different machine learning strategies", IEEE Journal of Selected Topics on Signal Processing, Sp. Iss. on Statistical Learning Methods for Speech and Language Processing, Vol. 4, Iss. 6, pp. 1027-1045, 2010.
- (2010) IEEE Journal of Selected Topics on Signal Processing, Sp. Iss. on Statistical Learning Methods for Speech and Language Processing , vol.4 , Issue.6 , pp. 1027-1045
- Mitra, V.¹ Nam, H.² Espy-Wilson, C.³ Saltzman, E.⁴ Goldstein, L.⁵

28
- 4243714433
- PhD Thesis, Univ. of Edinburgh, UK
- K. Richmond, "Estimating Articulatory parameters from the Acoustic Speech Signal", PhD Thesis, Univ. of Edinburgh, UK, 2001.
- (2001) Estimating Articulatory Parameters from the Acoustic Speech Signal
- Richmond, K.¹

29
- 34047270914
- Recent innovations in speech-to-text transcription at SRI-ICSI-UW
- A. Stolcke, B. Chen, H. Franco, V. R. R. Gadde, M. Graciarena, M.-Y. Hwang, K. Kirchhoff, A. Mandal, N. Morgan, X. Lin, T. Ng, M. Ostendorf, K. Sonmez, A. Venkataraman, D. Vergyri, W. Wang, J. Zheng and Q. Zhu, "Recent Innovations in Speech-to-Text Transcription at SRI-ICSI-UW", IEEE Trans. on Audio, Speech and Language Processing, 14(5), pp. 1729-1744, 2006.
- (2006) IEEE Trans. on Audio, Speech and Language Processing , vol.14 , Issue.5 , pp. 1729-1744
- Stolcke, A.¹ Chen, B.² Franco, H.³ Gadde, V.R.R.⁴ Graciarena, M.⁵ Hwang, M.-Y.⁶ Kirchhoff, K.⁷ Mandal, A.⁸ Morgan, N.⁹ Lin, X.¹⁰ Ng, T.¹¹ Ostendorf, M.¹² Sonmez, K.¹³ Venkataraman, A.¹⁴ Vergyri, D.¹⁵ Wang, W.¹⁶ Zheng, J.¹⁷ Zhu, Q.¹⁸

30
- 53849127143
- Improving robustness of MLLR adaptation with speaker-clustered regression class trees
- ISSN 0885-2308
- A. Mandal, M. Ostendorf and Andreas Stolcke, "Improving robustness of MLLR adaptation with speaker-clustered regression class trees", Computer Speech & Language, 23, pp. 176-199 (2009). ISSN 0885-2308.
- (2009) Computer Speech & Language , vol.23 , pp. 176-199
- Mandal, A.¹ Ostendorf, M.² Stolcke, A.³

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.