Volume 36, 2016, Pages 260-273

Data driven articulatory synthesis with deep neural networks

Author keywords

Articulatory synthesis; Deep learning; Electromagnetic articulography; Gaussian mixture models

Indexed keywords

AERODYNAMICS; COMMUNICATION CHANNELS (INFORMATION THEORY); MAPPING; OBJECT RECOGNITION;

EID: 84949568676     PISSN: 08852308     EISSN: 10958363     Source Type: Journal    
DOI: 10.1016/j.csl.2015.02.003     Document Type: Article
Times cited: 45

References (52)
  • 2
    • 84890540774
    • Multi-view CCA-based acoustic features for phonetic recognition across speakers and domains
    • R. Arora, and K. Livescu Multi-view CCA-based acoustic features for phonetic recognition across speakers and domains Proceedings of ICASSP 2013 7135 7139
    • (2013) Proceedings of ICASSP , pp. 7135-7139
    • Arora, R.1    Livescu, K.2
  • 3
    • 84890528210
    • Articulatory inversion and synthesis: Towards articulatory-based modification of speech
    • S. Aryal, and R. Gutierrez-Osuna Articulatory inversion and synthesis: towards articulatory-based modification of speech Proceedings of ICASSP 2013 7952 7956
    • (2013) Proceedings of ICASSP , pp. 7952-7956
    • Aryal, S.1    Gutierrez-Osuna, R.2
  • 4
    • 84905227268
    • Accent conversion through cross-speaker articulatory synthesis
    • S. Aryal, and R. Gutierrez-Osuna Accent conversion through cross-speaker articulatory synthesis Proceedings of ICASSP 2014 7744 7748
    • (2014) Proceedings of ICASSP , pp. 7744-7748
    • Aryal, S.1    Gutierrez-Osuna, R.2
  • 5
    • 84876477729
    • Investigation on dimensionality reduction of concatenated features with deep neural network for LVCSR systems
    • Y. Bao, H. Jiang, C. Liu, Y. Hu, and L. Dai Investigation on dimensionality reduction of concatenated features with deep neural network for LVCSR systems Proceedings of ICSP 2012 562 566
    • (2012) Proceedings of ICSP , pp. 562-566
    • Bao, Y.1    Jiang, H.2    Liu, C.3    Hu, Y.4    Dai, L.5
  • 6
    • 33947682260
    • Construction and control of a three-dimensional vocal tract model
    • P. Birkholz, D. Jackèl, and B. Kroger Construction and control of a three-dimensional vocal tract model Proceedings of ICASSP 2006 873 876
    • (2006) Proceedings of ICASSP , pp. 873-876
    • Birkholz, P.1    Jackèl, D.2    Kroger, B.3
  • 13
    • 84905265670
    • Normalization of articulatory data through Procrustes transformations and analysis-by-synthesis
    • D. Felps, S. Aryal, and R. Gutierrez-Osuna Normalization of articulatory data through Procrustes transformations and analysis-by-synthesis Proceedings of ICASSP 2014 3051 3055
    • (2014) Proceedings of ICASSP , pp. 3051-3055
    • Felps, D.1    Aryal, S.2    Gutierrez-Osuna, R.3
  • 14
    • 84865392230
    • Foreign accent conversion through concatenative synthesis in the articulatory domain
    • D. Felps, C. Geng, and R. Gutierrez-Osuna Foreign accent conversion through concatenative synthesis in the articulatory domain IEEE Trans. Audio Speech Lang. Process. 20 2012 2301 2312
    • (2012) IEEE Trans. Audio Speech Lang. Process. , vol.20 , pp. 2301-2312
    • Felps, D.1    Geng, C.2    Gutierrez-Osuna, R.3
  • 15
    • 65549151799
    • How to stretch and shrink vowel systems: Results from a vowel normalization procedure
    • C. Geng, and C. Mooshammer How to stretch and shrink vowel systems: results from a vowel normalization procedure J. Acoust. Soc. Am. 125 2009 3278
    • (2009) J. Acoust. Soc. Am. , vol.125 , pp. 3278
    • Geng, C.1    Mooshammer, C.2
  • 16
    • 80051658443
    • A subject-independent acoustic-to-articulatory inversion
    • P.K. Ghosh, and S.S. Narayanan A subject-independent acoustic-to-articulatory inversion Proceedings of ICASSP 2011 4624 4627
    • (2011) Proceedings of ICASSP , pp. 4624-4627
    • Ghosh, P.K.1    Narayanan, S.S.2
  • 17
    • 0024879199
    • The effective second formant F2′ and the vocal tract front-cavity
    • H. Hermansky, and D.J. Broad The effective second formant F2′ and the vocal tract front-cavity Proceedings of ICASSP 1989 480 483
    • (1989) Proceedings of ICASSP , pp. 480-483
    • Hermansky, H.1    Broad, D.J.2
  • 19
    • 84872506495
    • A practical guide to training restricted Boltzmann machines
    • Springer
    • G.E. Hinton A practical guide to training restricted Boltzmann machines Neural Networks: Tricks of the Trade 2012 Springer 599 619
    • (2012) Neural Networks: Tricks of the Trade , pp. 599-619
    • Hinton, G.E.1
  • 20
    • 2142659020
    • Estimation of articulatory movements from speech acoustics using an HMM-based speech production model
    • S. Hiroya, and M. Honda Estimation of articulatory movements from speech acoustics using an HMM-based speech production model IEEE Trans. Speech Audio Process. 12 2004 175 185
    • (2004) IEEE Trans. Speech Audio Process. , vol.12 , pp. 175-185
    • Hiroya, S.1    Honda, M.2
  • 21
    • 84905266520
    • Deep Boltzmann machines based vehicle recognition
    • A. Hu, H. Li, F. Zhang, and W. Zhang Deep Boltzmann machines based vehicle recognition Proceedings of CCDC 2014 3033 3038
    • (2014) Proceedings of CCDC , pp. 3033-3038
    • Hu, A.1    Li, H.2    Zhang, F.3    Zhang, W.4
  • 23
    • 84905263404
    • The electromagnetic articulography Mandarin accented English (EMA-MAE) corpus of acoustic and 3D articulatory kinematic data
    • A. Ji, J. Berry, and M.T. Johnson The electromagnetic articulography Mandarin accented English (EMA-MAE) corpus of acoustic and 3D articulatory kinematic data Proceedings of ICASSP 2014 7769 7773
    • (2014) Proceedings of ICASSP , pp. 7769-7773
    • Ji, A.1    Berry, J.2    Johnson, M.T.3
  • 24
    • 0002728725
    • Determination of the vocal tract spectrum from the articulatory movements based on the search of an articulatory-acoustic database
    • T. Kaburagi, and M. Honda Determination of the vocal tract spectrum from the articulatory movements based on the search of an articulatory-acoustic database Proceedings of ICSLP 1998 433 436
    • (1998) Proceedings of ICSLP , pp. 433-436
    • Kaburagi, T.1    Honda, M.2
  • 25
    • 0030677481
    • Speech representation and transformation using adaptive interpolation of weighted spectrum: Vocoder revisited
    • H. Kawahara Speech representation and transformation using adaptive interpolation of weighted spectrum: vocoder revisited Proceedings of ICASSP 1997 1303 1306
    • (1997) Proceedings of ICASSP , pp. 1303-1306
    • Kawahara, H.1
  • 26
    • 6344254321
    • A neural network model of the articulatory-acoustic forward mapping trained on recordings of articulatory parameters
    • C.T. Kello, and D.C. Plaut A neural network model of the articulatory-acoustic forward mapping trained on recordings of articulatory parameters J. Acoust. Soc. Am. 116 2004 2354 2364
    • (2004) J. Acoust. Soc. Am. , vol.116 , pp. 2354-2364
    • Kello, C.T.1    Plaut, D.C.2
  • 27
    • 0001792343
    • Compensatory articulation during speech: Evidence from the analysis and synthesis of vocal tract shapes using an articulatory model
    • W. Hardcastle, A. Marchal (Eds.), Kluwer Academic Publisher, Amsterdam
    • S. Maeda Compensatory articulation during speech: evidence from the analysis and synthesis of vocal tract shapes using an articulatory model W. Hardcastle, A. Marchal, Speech Production and Speech Modelling 1990 Kluwer Academic Publisher Amsterdam 131 149
    • (1990) Speech Production and Speech Modelling , pp. 131-149
    • Maeda, S.1
  • 28
    • 0015613574
    • Articulatory model for the study of speech production
    • P. Mermelstein Articulatory model for the study of speech production J. Acoust. Soc. Am. 53 1973 1070 1082
    • (1973) J. Acoust. Soc. Am. , vol.53 , pp. 1070-1082
    • Mermelstein, P.1
  • 29
    • 84867211725
    • Low-delay voice conversion based on maximum likelihood estimation of spectral parameter trajectory
    • T. Muramatsu, Y. Ohtani, T. Toda, H. Saruwatari, and K. Shikano Low-delay voice conversion based on maximum likelihood estimation of spectral parameter trajectory Proceedings of INTERSPEECH 2008 1076 1079
    • (2008) Proceedings of INTERSPEECH , pp. 1076-1079
    • Muramatsu, T.1    Ohtani, Y.2    Toda, T.3    Saruwatari, H.4    Shikano, K.5
  • 31
    • 42649146508
    • On the use of phonetic information for mapping from articulatory movements to vocal tract spectrum
    • I-I
    • K. Nakamura, T. Toda, Y. Nankaku, and K. Tokuda On the use of phonetic information for mapping from articulatory movements to vocal tract spectrum Proceedings of ICASSP 2006 I-I
    • (2006) Proceedings of ICASSP
    • Nakamura, K.1    Toda, T.2    Nankaku, Y.3    Tokuda, K.4
  • 35
    • 84858966477
    • A factored conditional random field model for articulatory feature forced transcription
    • R. Prabhavalkar, E. Fosler-Lussier, and K. Livescu A factored conditional random field model for articulatory feature forced transcription Proceedings of ASRU 2011 77 82
    • (2011) Proceedings of ASRU , pp. 77-82
    • Prabhavalkar, R.1    Fosler-Lussier, E.2    Livescu, K.3
  • 36
    • 56149109023
    • An empirical investigation of the nonuniqueness in the acoustic-to-articulatory mapping
    • C. Qin, and M.A. Carreira-Perpinán An empirical investigation of the nonuniqueness in the acoustic-to-articulatory mapping Proceedings of INTERSPEECH 2007 2300 2303
    • (2007) Proceedings of INTERSPEECH , pp. 2300-2303
    • Qin, C.1    Carreira-Perpinán, M.A.2
  • 37
    • 0038359547
    • Modelling the uncertainty in recovering articulation from acoustics
    • K. Richmond, S. King, and P. Taylor Modelling the uncertainty in recovering articulation from acoustics Comput. Speech Lang. 17 2003 153 172
    • (2003) Comput. Speech Lang. , vol.17 , pp. 153-172
    • Richmond, K.1    King, S.2    Taylor, P.3
  • 38
    • 84866095936
    • Adaptive kernel canonical correlation analysis for estimation of task dynamics from acoustics
    • F. Rudzicz Adaptive kernel canonical correlation analysis for estimation of task dynamics from acoustics Proceedings of ICASSP 2010 4198 4201
    • (2010) Proceedings of ICASSP , pp. 4198-4201
    • Rudzicz, F.1
  • 39
    • 0022471098
    • Learning representations by back-propagating errors
    • D. Rumelhart, G. Hinton, and R. Williams Learning representations by back-propagating errors Nature 323 1986 533 536
    • (1986) Nature , vol.323 , pp. 533-536
    • Rumelhart, D.1    Hinton, G.2    Williams, R.3
  • 41
    • 85027459007
    • Mapping from articulatory movements to vocal tract spectrum with Gaussian mixture model for articulatory speech synthesis
    • T. Toda, A.W. Black, and K. Tokuda Mapping from articulatory movements to vocal tract spectrum with Gaussian mixture model for articulatory speech synthesis Proceedings of ISCA 2004 SSW5
    • (2004) Proceedings of ISCA SSW5
    • Toda, T.1    Black, A.W.2    Tokuda, K.3
  • 42
    • 57749193836
    • Voice conversion based on maximum-likelihood estimation of spectral parameter trajectory
    • T. Toda, A.W. Black, and K. Tokuda Voice conversion based on maximum-likelihood estimation of spectral parameter trajectory IEEE Trans. Audio Speech Lang. Process. 15 2007 2222 2235
    • (2007) IEEE Trans. Audio Speech Lang. Process. , vol.15 , pp. 2222-2235
    • Toda, T.1    Black, A.W.2    Tokuda, K.3
  • 43
    • 38649140222
    • Statistical mapping between articulatory movements and acoustic spectrum using a Gaussian mixture model
    • T. Toda, A.W. Black, and K. Tokuda Statistical mapping between articulatory movements and acoustic spectrum using a Gaussian mixture model Speech Commun. 50 2008 215 227
    • (2008) Speech Commun. , vol.50 , pp. 215-227
    • Toda, T.1    Black, A.W.2    Tokuda, K.3
  • 44
    • 84878390910
    • Implementation of computationally efficient real-time voice conversion
    • T. Toda, T. Muramatsu, and H. Banno Implementation of computationally efficient real-time voice conversion Proceedings of INTERSPEECH 2012 94 97
    • (2012) Proceedings of INTERSPEECH , pp. 94-97
    • Toda, T.1    Muramatsu, T.2    Banno, H.3
  • 45
    • 33745217332
    • Cross-speaker articulatory position data for phonetic feature prediction
    • A.R. Toth, and A.W. Black Cross-speaker articulatory position data for phonetic feature prediction Proceedings of INTERSPEECH 2005 2973 2976
    • (2005) Proceedings of INTERSPEECH , pp. 2973-2976
    • Toth, A.R.1    Black, A.W.2
  • 48
    • 0037503670
    • A multichannel articulatory database and its application for automatic speech recognition
    • A.A. Wrench A multichannel articulatory database and its application for automatic speech recognition Proceedings of 5th Seminar of Speech Production 2000 305 308
    • (2000) Proceedings of 5th Seminar of Speech Production , pp. 305-308
    • Wrench, A.A.1
  • 49
    • 84889257121
    • An experimental study on speech enhancement based on deep neural networks
    • Y. Xu, J. Du, L.-R. Dai, and C.-H. Lee An experimental study on speech enhancement based on deep neural networks Signal Process. Lett. IEEE 21 2014 65 68
    • (2014) Signal Process. Lett. IEEE , vol.21 , pp. 65-68
    • Xu, Y.1    Du, J.2    Dai, L.-R.3    Lee, C.-H.4
  • 50
    • 84890474484
    • Investigation of deep Boltzmann machines for phone recognition
    • Z. You, X. Wang, and B. Xu Investigation of deep Boltzmann machines for phone recognition Proceedings of ICASSP 2013 7600 7603
    • (2013) Proceedings of ICASSP , pp. 7600-7603
    • You, Z.1    Wang, X.2    Xu, B.3
  • 51
    • 84890490547
    • Statistical parametric speech synthesis using deep neural networks
    • H. Zen, A. Senior, and M. Schuster Statistical parametric speech synthesis using deep neural networks Proceedings of ICASSP 2013 7962 7966
    • (2013) Proceedings of ICASSP , pp. 7962-7966
    • Zen, H.1    Senior, A.2    Schuster, M.3
  • 52
    • 84867598220
    • Resource configurable spoken query detection using deep Boltzmann machines
    • Y. Zhang, R. Salakhutdinov, H.-A. Chang, and J. Glass Resource configurable spoken query detection using deep Boltzmann machines Proceedings of ICASSP 2012 5161 5164
    • (2012) Proceedings of ICASSP , pp. 5161-5164
    • Zhang, Y.1    Salakhutdinov, R.2    Chang, H.-A.3    Glass, J.4


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.