SCOPUS Information Retrieval Platform

Computer Speech and Language

Volume 36, 2016, Pages 260-273

Data driven articulatory synthesis with deep neural networks

(2) Aryal, Sandesh ᵃ; Gutierrez-Osuna, Ricardo ᵃ

ᵃ Texas A&M University (United States)

Author keywords

Articulatory synthesis; Deep learning; Electromagnetic articulography; Gaussian mixture models

Indexed keywords

AERODYNAMICS; COMMUNICATION CHANNELS (INFORMATION THEORY); MAPPING; OBJECT RECOGNITION;

ARTICULATORY SYNTHESIS; CONVENTIONAL APPROACH; DEEP LEARNING; DEEP NEURAL NETWORKS; ELECTROMAGNETIC ARTICULOGRAPHY; GAUSSIAN MIXTURE MODEL; SUBJECTIVE EVALUATIONS; TRAJECTORY OPTIMIZATION;

TRAJECTORIES;

EID: 84949568676 PISSN: 08852308 EISSN: 10958363 Source Type: Journal
DOI: 10.1016/j.csl.2015.02.003 Document Type: Article

Times cited : (45)

References (52)

1
- 84973405656
- Deep canonical correlation analysis
- G. Andrew, R. Arora, J. Bilmes, and K. Livescu Deep canonical correlation analysis Proceedings of ICML 2013 1247 1255
- (2013) Proceedings of ICML , pp. 1247-1255
- Andrew, G.¹ Arora, R.² Bilmes, J.³ Livescu, K.⁴

2
- 84890540774
- Multi-view CCA-based acoustic features for phonetic recognition across speakers and domains
- R. Arora, and K. Livescu Multi-view CCA-based acoustic features for phonetic recognition across speakers and domains Proceedings of ICASSP 2013 7135 7139
- (2013) Proceedings of ICASSP , pp. 7135-7139
- Arora, R.¹ Livescu, K.²

3
- 84890528210
- Articulatory inversion and synthesis: Towards articulatory-based modification of speech
- S. Aryal, and R. Gutierrez-Osuna Articulatory inversion and synthesis: towards articulatory-based modification of speech Proceedings of ICASSP 2013 7952 7956
- (2013) Proceedings of ICASSP , pp. 7952-7956
- Aryal, S.¹ Gutierrez-Osuna, R.²

4
- 84905227268
- Accent conversion through cross-speaker articulatory synthesis
- S. Aryal, and R. Gutierrez-Osuna Accent conversion through cross-speaker articulatory synthesis Proceedings of ICASSP 2014 7744 7748
- (2014) Proceedings of ICASSP , pp. 7744-7748
- Aryal, S.¹ Gutierrez-Osuna, R.²

5
- 84876477729
- Investigation on dimensionality reduction of concatenated features with deep neural network for LVCSR systems
- Y. Bao, H. Jiang, C. Liu, Y. Hu, and L. Dai Investigation on dimensionality reduction of concatenated features with deep neural network for LVCSR systems Proceedings of ICSP 2012 562 566
- (2012) Proceedings of ICSP , pp. 562-566
- Bao, Y.¹ Jiang, H.² Liu, C.³ Hu, Y.⁴ Dai, L.⁵

6
- 33947682260
- Construction and control of a three-dimensional vocal tract model
- P. Birkholz, D. Jackèl, and B. Kroger Construction and control of a three-dimensional vocal tract model Proceedings of ICASSP 2006 873 876
- (2006) Proceedings of ICASSP , pp. 873-876
- Birkholz, P.¹ Jackèl, D.² Kroger, B.³

7
- 84942611745
- Articulatory synthesis from underlying dynamics
- C.P. Browman, L. Goldstein, J.A.S. Kelso, P. Rubin, and E. Saltzman Articulatory synthesis from underlying dynamics J. Acoust. Soc. Am. 75 1984 S22 S23
- (1984) J. Acoust. Soc. Am. , vol.75 , pp. S22-S23
- Browman, C.P.¹ Goldstein, L.² Kelso, J.A.S.³ Rubin, P.⁴ Saltzman, E.⁵

8
- 84893629649
- arxiv:1301.3468
- K. Cho Boltzmann machines and denoising autoencoders for image denoising 2013 arXiv:1301.3468
- (2013) Boltzmann Machines and Denoising Autoencoders for Image Denoising
- Cho, K.¹

9
- 84949624370
- K.H. Cho Matlab code for restricted/deep Boltzmann machines and autoencoders 2013 https://github.com/kyunghyuncho/deepmat
- (2013) Matlab Code for Restricted/deep Boltzmann Machines and Autoencoders
- Cho, K.H.¹

10
- 84893549229
- Gaussian-Bernoulli deep Boltzmann machine
- K.H. Cho, T. Raiko, and A. Ilin Gaussian-Bernoulli deep Boltzmann machine Proceedings of IJCNN 2013 1 7
- (2013) Proceedings of IJCNN , pp. 1-7
- Cho, K.H.¹ Raiko, T.² Ilin, A.³

11
- 76849116340
- Silent speech interfaces
- B. Denby, T. Schultz, K. Honda, T. Hueber, J.M. Gilbert, and J.S. Brumberg Silent speech interfaces Speech Commun. 52 2010 270 287
- (2010) Speech Commun. , vol.52 , pp. 270-287
- Denby, B.¹ Schultz, T.² Honda, K.³ Hueber, T.⁴ Gilbert, J.M.⁵ Brumberg, J.S.⁶

12
- 77949522811
- Why does unsupervised pre-training help deep learning?
- D. Erhan, Y. Bengio, A. Courville, P.-A. Manzagol, P. Vincent, and S. Bengio Why does unsupervised pre-training help deep learning? J. Mach. Learn. Res. 11 2010 625 660
- (2010) J. Mach. Learn. Res. , vol.11 , pp. 625-660
- Erhan, D.¹ Bengio, Y.² Courville, A.³ Manzagol, P.-A.⁴ Vincent, P.⁵ Bengio, S.⁶

13
- 84905265670
- Normalization of articulatory data through Procrustes transformations and analysis-by-synthesis
- D. Felps, S. Aryal, and R. Gutierrez-Osuna Normalization of articulatory data through Procrustes transformations and analysis-by-synthesis Proceedings of ICASSP 2014 3051 3055
- (2014) Proceedings of ICASSP , pp. 3051-3055
- Felps, D.¹ Aryal, S.² Gutierrez-Osuna, R.³

14
- 84865392230
- Foreign accent conversion through concatenative synthesis in the articulatory domain
- D. Felps, C. Geng, and R. Gutierrez-Osuna Foreign accent conversion through concatenative synthesis in the articulatory domain IEEE Trans. Audio Speech Lang. Process. 20 2012 2301 2312
- (2012) IEEE Trans. Audio Speech Lang. Process. , vol.20 , pp. 2301-2312
- Felps, D.¹ Geng, C.² Gutierrez-Osuna, R.³

15
- 65549151799
- How to stretch and shrink vowel systems: Results from a vowel normalization procedure
- C. Geng, and C. Mooshammer How to stretch and shrink vowel systems: results from a vowel normalization procedure J. Acoust. Soc. Am. 125 2009 3278
- (2009) J. Acoust. Soc. Am. , vol.125 , pp. 3278
- Geng, C.¹ Mooshammer, C.²

16
- 80051658443
- A subject-independent acoustic-to-articulatory inversion
- P.K. Ghosh, and S.S. Narayanan A subject-independent acoustic-to-articulatory inversion Proceedings of ICASSP 2011 4624 4627
- (2011) Proceedings of ICASSP , pp. 4624-4627
- Ghosh, P.K.¹ Narayanan, S.S.²

17
- 0024879199
- The effective second formant F2′ and the vocal tract front-cavity
- H. Hermansky, and D.J. Broad The effective second formant F2′ and the vocal tract front-cavity Proceedings of ICASSP 1989 480 483
- (1989) Proceedings of ICASSP , pp. 480-483
- Hermansky, H.¹ Broad, D.J.²

18
- 85032751458
- Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups
- G. Hinton, L. Deng, D. Yu, G.E. Dahl, A.R. Mohamed, N. Jaitly, A. Senior, V. Vanhoucke, P. Nguyen, and T.N. Sainath Deep neural networks for acoustic modeling in speech recognition: the shared views of four research groups IEEE Signal Process. Mag. 29 2012 82 97
- (2012) IEEE Signal Process. Mag. , vol.29 , pp. 82-97
- Hinton, G.¹ Deng, L.² Yu, D.³ Dahl, G.E.⁴ Mohamed, A.R.⁵ Jaitly, N.⁶ Senior, A.⁷ Vanhoucke, V.⁸ Nguyen, P.⁹ Sainath, T.N.¹⁰

19
- 84872506495
- A practical guide to training restricted Boltzmann machines
- Springer
- G.E. Hinton A practical guide to training restricted Boltzmann machines Neural Networks: Tricks of the Trade 2012 Springer 599 619
- (2012) Neural Networks: Tricks of the Trade , pp. 599-619
- Hinton, G.E.¹

20
- 2142659020
- Estimation of articulatory movements from speech acoustics using an HMM-based speech production model
- S. Hiroya, and M. Honda Estimation of articulatory movements from speech acoustics using an HMM-based speech production model IEEE Trans. Speech Audio Process. 12 2004 175 185
- (2004) IEEE Trans. Speech Audio Process. , vol.12 , pp. 175-185
- Hiroya, S.¹ Honda, M.²

21
- 84905266520
- Deep Boltzmann machines based vehicle recognition
- A. Hu, H. Li, F. Zhang, and W. Zhang Deep Boltzmann machines based vehicle recognition Proceedings of CCDC 2014 3033 3038
- (2014) Proceedings of CCDC , pp. 3033-3038
- Hu, A.¹ Li, H.² Zhang, F.³ Zhang, W.⁴

22
- 33748982592
- ITU-T Recommendation G.114: One-way transmission time 2003
- (2003) Recommendation G.114: One-way Transmission Time
- ITU-T¹

23
- 84905263404
- The electromagnetic articulography Mandarin accented English (EMA-MAE) corpus of acoustic and 3D articulatory kinematic data
- A. Ji, J. Berry, and M.T. Johnson The electromagnetic articulography Mandarin accented English (EMA-MAE) corpus of acoustic and 3D articulatory kinematic data Proceedings of ICASSP 2014 7769 7773
- (2014) Proceedings of ICASSP , pp. 7769-7773
- Ji, A.¹ Berry, J.² Johnson, M.T.³

24
- 0002728725
- Determination of the vocal tract spectrum from the articulatory movements based on the search of an articulatory-acoustic database
- T. Kaburagi, and M. Honda Determination of the vocal tract spectrum from the articulatory movements based on the search of an articulatory-acoustic database Proceedings of ICSLP 1998 433 436
- (1998) Proceedings of ICSLP , pp. 433-436
- Kaburagi, T.¹ Honda, M.²

25
- 0030677481
- Speech representation and transformation using adaptive interpolation of weighted spectrum: Vocoder revisited
- H. Kawahara Speech representation and transformation using adaptive interpolation of weighted spectrum: vocoder revisited Proceedings of ICASSP 1997 1303 1306
- (1997) Proceedings of ICASSP , pp. 1303-1306
- Kawahara, H.¹

26
- 6344254321
- A neural network model of the articulatory-acoustic forward mapping trained on recordings of articulatory parameters
- C.T. Kello, and D.C. Plaut A neural network model of the articulatory-acoustic forward mapping trained on recordings of articulatory parameters J. Acoust. Soc. Am. 116 2004 2354 2364
- (2004) J. Acoust. Soc. Am. , vol.116 , pp. 2354-2364
- Kello, C.T.¹ Plaut, D.C.²

27
- 0001792343
- Compensatory articulation during speech: Evidence from the analysis and synthesis of vocal tract shapes using an articulatory model
- W. Hardcastle, A. Marchal, Kluwer Academic Publisher Amsterdam
- S. Maeda Compensatory articulation during speech: evidence from the analysis and synthesis of vocal tract shapes using an articulatory model W. Hardcastle, A. Marchal, Speech Production and Speech Modelling 1990 Kluwer Academic Publisher Amsterdam 131 149
- (1990) Speech Production and Speech Modelling , pp. 131-149
- Maeda, S.¹

28
- 0015613574
- Articulatory model for the study of speech production
- P. Mermelstein Articulatory model for the study of speech production J. Acoust. Soc. Am. 53 1973 1070 1082
- (1973) J. Acoust. Soc. Am. , vol.53 , pp. 1070-1082
- Mermelstein, P.¹

29
- 84867211725
- Low-delay voice conversion based on maximum likelihood estimation of spectral parameter trajectory
- T. Muramatsu, Y. Ohtani, T. Toda, H. Saruwatari, and K. Shikano Low-delay voice conversion based on maximum likelihood estimation of spectral parameter trajectory Proceedings of INTERSPEECH 2008 1076 1079
- (2008) Proceedings of INTERSPEECH , pp. 1076-1079
- Muramatsu, T.¹ Ohtani, Y.² Toda, T.³ Saruwatari, H.⁴ Shikano, K.⁵

30
- 0003410452
- Springer
- I.T. Nabney NETLAB: Algorithms for Pattern Recognition 2002 Springer
- (2002) NETLAB: Algorithms for Pattern Recognition
- Nabney, I.T.¹

31
- 42649146508
- On the use of phonetic information for mapping from articulatory movements to vocal tract spectrum
- I-I
- K. Nakamura, T. Toda, Y. Nankaku, and K. Tokuda On the use of phonetic information for mapping from articulatory movements to vocal tract spectrum Proceedings of ICASSP 2006 I-I
- (2006) Proceedings of ICASSP
- Nakamura, K.¹ Toda, T.² Nankaku, Y.³ Tokuda, K.⁴

32
- 84906280857
- Voice conversion in high-order eigen space using deep belief nets
- T. Nakashika, R. Takashima, T. Takiguchi, and Y. Ariki Voice conversion in high-order eigen space using deep belief nets Proceedings of INTERSPEECH 2013 369 372
- (2013) Proceedings of INTERSPEECH , pp. 369-372
- Nakashika, T.¹ Takashima, R.² Takiguchi, T.³ Ariki, Y.⁴

33
- 84865738977
- A multimodal real-time MRI articulatory corpus for speech research
- S. Narayanan, E. Bresch, P.K. Ghosh, L. Goldstein, A. Katsamanis, Y. Kim, A.C. Lammert, M.I. Proctor, V. Ramanarayanan, and Y. Zhu A multimodal real-time MRI articulatory corpus for speech research Proceedings of INTERSPEECH 2011 837 840
- (2011) Proceedings of INTERSPEECH , pp. 837-840
- Narayanan, S.¹ Bresch, E.² Ghosh, P.K.³ Goldstein, L.⁴ Katsamanis, A.⁵ Kim, Y.⁶ Lammert, A.C.⁷ Proctor, M.I.⁸ Ramanarayanan, V.⁹ Zhu, Y.¹⁰

34
- 70450216609
- Formant trajectories for acoustic-to-articulatory inversion
- I.Y. Özbek, M. Hasegawa-Johnson, and M. Demirekler Formant trajectories for acoustic-to-articulatory inversion Proceedings of INTERSPEECH 2009 2807 2810
- (2009) Proceedings of INTERSPEECH , pp. 2807-2810
- Özbek, I.Y.¹ Hasegawa-Johnson, M.² Demirekler, M.³

35
- 84858966477
- A factored conditional random field model for articulatory feature forced transcription
- R. Prabhavalkar, E. Fosler-Lussier, and K. Livescu A factored conditional random field model for articulatory feature forced transcription Proceedings of ASRU 2011 77 82
- (2011) Proceedings of ASRU , pp. 77-82
- Prabhavalkar, R.¹ Fosler-Lussier, E.² Livescu, K.³

36
- 56149109023
- An empirical investigation of the nonuniqueness in the acoustic-to-articulatory mapping
- C. Qin, and M.A. Carreira-Perpiñán An empirical investigation of the nonuniqueness in the acoustic-to-articulatory mapping Proceedings of INTERSPEECH 2007 2300 2303
- (2007) Proceedings of INTERSPEECH , pp. 2300-2303
- Qin, C.¹ Carreira-Perpiñán, M.A.²

37
- 0038359547
- Modelling the uncertainty in recovering articulation from acoustics
- K. Richmond, S. King, and P. Taylor Modelling the uncertainty in recovering articulation from acoustics Comput. Speech Lang. 17 2003 153 172
- (2003) Comput. Speech Lang. , vol.17 , pp. 153-172
- Richmond, K.¹ King, S.² Taylor, P.³

38
- 84866095936
- Adaptive kernel canonical correlation analysis for estimation of task dynamics from acoustics
- F. Rudzicz Adaptive kernel canonical correlation analysis for estimation of task dynamics from acoustics Proceedings of ICASSP 2010 4198 4201
- (2010) Proceedings of ICASSP , pp. 4198-4201
- Rudzicz, F.¹

39
- 0022471098
- Learning representations by back-propagating errors
- D. Rumelhart, G. Hinton, and R. Williams Learning representations by back-propagating errors Nature 323 1986 533 536
- (1986) Nature , vol.323 , pp. 533-536
- Rumelhart, D.¹ Hinton, G.² Williams, R.³

40
- 84862286946
- Deep Boltzmann machines
- Florida, USA
- R. Salakhutdinov, and G.E. Hinton Deep Boltzmann machines Proceedings of AISTATS Florida, USA 2009 448 455
- (2009) Proceedings of AISTATS , pp. 448-455
- Salakhutdinov, R.¹ Hinton, G.E.²

41
- 85027459007
- Mapping from articulatory movements to vocal tract spectrum with Gaussian mixture model for articulatory speech synthesis
- T. Toda, A.W. Black, and K. Tokuda Mapping from articulatory movements to vocal tract spectrum with Gaussian mixture model for articulatory speech synthesis Proceedings of ISCA 2004 SSW5
- (2004) Proceedings of ISCA , pp. SSW5
- Toda, T.¹ Black, A.W.² Tokuda, K.³

42
- 57749193836
- Voice conversion based on maximum-likelihood estimation of spectral parameter trajectory
- T. Toda, A.W. Black, and K. Tokuda Voice conversion based on maximum-likelihood estimation of spectral parameter trajectory IEEE Trans. Audio Speech Lang. Process. 15 2007 2222 2235
- (2007) IEEE Trans. Audio Speech Lang. Process. , vol.15 , pp. 2222-2235
- Toda, T.¹ Black, A.W.² Tokuda, K.³

43
- 38649140222
- Statistical mapping between articulatory movements and acoustic spectrum using a Gaussian mixture model
- T. Toda, A.W. Black, and K. Tokuda Statistical mapping between articulatory movements and acoustic spectrum using a Gaussian mixture model Speech Commun. 50 2008 215 227
- (2008) Speech Commun. , vol.50 , pp. 215-227
- Toda, T.¹ Black, A.W.² Tokuda, K.³

44
- 84878390910
- Implementation of computationally efficient real-time voice conversion
- T. Toda, T. Muramatsu, and H. Banno Implementation of computationally efficient real-time voice conversion Proceedings of INTERSPEECH 2012 94 97
- (2012) Proceedings of INTERSPEECH , pp. 94-97
- Toda, T.¹ Muramatsu, T.² Banno, H.³

45
- 33745217332
- Cross-speaker articulatory position data for phonetic feature prediction
- A.R. Toth, and A.W. Black Cross-speaker articulatory position data for phonetic feature prediction Proceedings of INTERSPEECH 2005 2973 2976
- (2005) Proceedings of INTERSPEECH , pp. 2973-2976
- Toth, A.R.¹ Black, A.W.²

46
- 84878403872
- Deep architectures for articulatory inversion
- B. Uria, I. Murray, S. Renals, and K. Richmond Deep architectures for articulatory inversion Proceedings of INTERSPEECH 2012 867 870
- (2012) Proceedings of INTERSPEECH , pp. 867-870
- Uria, B.¹ Murray, I.² Renals, S.³ Richmond, K.⁴

47
- 0003652255
- Waisman Center on Mental Retardation & Human Development, University of Wisconsin Madison, WI
- J.R. Westbury X-ray Microbeam Speech Production Database User's Handbook Version 1.0 1994 Waisman Center on Mental Retardation & Human Development, University of Wisconsin Madison, WI
- (1994) X-ray Microbeam Speech Production Database User's Handbook Version 1.0
- Westbury, J.R.¹

48
- 0037503670
- A multichannel articulatory database and its application for automatic speech recognition
- A.A. Wrench A multichannel articulatory database and its application for automatic speech recognition Proceedings of 5th Seminar of Speech Production 2000 305 308
- (2000) Proceedings of 5th Seminar of Speech Production , pp. 305-308
- Wrench, A.A.¹

49
- 84889257121
- An experimental study on speech enhancement based on deep neural networks
- Y. Xu, J. Du, L.-R. Dai, and C.-H. Lee An experimental study on speech enhancement based on deep neural networks IEEE Signal Process. Lett. 21 2014 65 68
- (2014) IEEE Signal Process. Lett. , vol.21 , pp. 65-68
- Xu, Y.¹ Du, J.² Dai, L.-R.³ Lee, C.-H.⁴

50
- 84890474484
- Investigation of deep Boltzmann machines for phone recognition
- Z. You, X. Wang, and B. Xu Investigation of deep Boltzmann machines for phone recognition Proceedings of ICASSP 2013 7600 7603
- (2013) Proceedings of ICASSP , pp. 7600-7603
- You, Z.¹ Wang, X.² Xu, B.³

51
- 84890490547
- Statistical parametric speech synthesis using deep neural networks
- H. Zen, A. Senior, and M. Schuster Statistical parametric speech synthesis using deep neural networks Proceedings of ICASSP 2013 7962 7966
- (2013) Proceedings of ICASSP , pp. 7962-7966
- Zen, H.¹ Senior, A.² Schuster, M.³

52
- 84867598220
- Resource configurable spoken query detection using deep Boltzmann machines
- Y. Zhang, R. Salakhutdinov, H.-A. Chang, and J. Glass Resource configurable spoken query detection using deep Boltzmann machines Proceedings of ICASSP 2012 5161 5164
- (2012) Proceedings of ICASSP , pp. 5161-5164
- Zhang, Y.¹ Salakhutdinov, R.² Chang, H.-A.³ Glass, J.⁴

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.