SCOPUS Information Retrieval Platform

ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings

Volume 2015-August, 2015, Pages 4450-4454

A deep recurrent approach for acoustic-to-articulatory inversion

Authors (6): Liu, Peng (a); Yu, Quanjie (a); Wu, Zhiyong (a,b); Kang, Shiyin (a); Meng, Helen (a,b); Cai, Lianhong (a)

a TSINGHUA UNIVERSITY (China)

b CHINESE UNIVERSITY OF HONG KONG (Hong Kong)

Author keywords

layer-wise pre-training; long short-term memory (LSTM); mixture density network (MDN); recurrent neural network (RNN)

Indexed keywords

AUDIO SIGNAL PROCESSING; BRAIN; DEEP NEURAL NETWORKS; MEAN SQUARE ERROR; MIXTURES; SPEECH COMMUNICATION;

ACCURATE PREDICTION; ARTICULATORY INVERSION; CONTEXT INFORMATION; INVERSION ACCURACY; MIXTURE DENSITY; NEURAL NETWORKS; PRE-TRAINING; ROOT MEAN SQUARE ERRORS;

LONG SHORT-TERM MEMORY;

EID: 84946016986 PISSN: 15206149 EISSN: None Source Type: Conference Proceeding
DOI: 10.1109/ICASSP.2015.7178812 Document Type: Conference Paper

Times cited: 83

References (25)

1
- 33846680938
- Speech production knowledge in automatic speech recognition
- S. King, J. Frankel, K. Livescu, E. McDermott, K. Richmond, and M. Wester, Speech production knowledge in automatic speech recognition, The Journal of the Acoustical Society of America, vol. 121, no. 2, pp. 723-742, 2007
- (2007) The Journal of the Acoustical Society of America , vol.121 , Issue.2 , pp. 723-742
- King, S.¹ Frankel, J.² Livescu, K.³ McDermott, E.⁴ Richmond, K.⁵ Wester, M.⁶

2
- 68149157315
- Integrating articulatory features into HMM-based parametric speech synthesis
- Z. Ling, K. Richmond, J. Yamagishi, and R. Wang, Integrating articulatory features into HMM-based parametric speech synthesis, IEEE Transactions on Audio, Speech, and Language Processing, vol. 17, no. 6, pp. 1171-1185, 2009
- (2009) IEEE Transactions on Audio, Speech, and Language Processing , vol.17 , Issue.6 , pp. 1171-1185
- Ling, Z.¹ Richmond, K.² Yamagishi, J.³ Wang, R.⁴

3
- 84890443373
- Audiovisual synthesis of exaggerated speech for corrective feedback in computer-assisted pronunciation training
- J.H. Zhao, H. Yuan, W.K. Leung, H. Meng, J. Liu, and S.H. Xia, Audiovisual synthesis of exaggerated speech for corrective feedback in computer-assisted pronunciation training, in Proc. ICASSP, 2013, pp. 8218-8222
- (2013) Proc. ICASSP , pp. 8218-8222
- Zhao, J.H.¹ Yuan, H.² Leung, W.K.³ Meng, H.⁴ Liu, J.⁵ Xia, S.H.⁶

4
- 0038359547
- Modelling the uncertainty in recovering articulation from acoustics
- K. Richmond, S. King, and P. Taylor, Modelling the uncertainty in recovering articulation from acoustics, Computer Speech & Language, vol. 17, no. 2, pp. 153-172, 2003
- (2003) Computer Speech & Language , vol.17 , Issue.2 , pp. 153-172
- Richmond, K.¹ King, S.² Taylor, P.³

5
- 67650153217
- Acoustic-articulatory modeling with the trajectory HMM
- L. Zhang and S. Renals, Acoustic-articulatory modeling with the trajectory HMM, IEEE Signal Processing Letters, vol. 15, pp. 245-248, 2008
- (2008) IEEE Signal Processing Letters , vol.15 , pp. 245-248
- Zhang, L.¹ Renals, S.²

6
- 44949185845
- A trajectory mixture density network for the acoustic-articulatory inversion mapping
- K. Richmond, A trajectory mixture density network for the acoustic-articulatory inversion mapping, in Proc. INTERSPEECH, 2006, pp. 577-580
- (2006) Proc. INTERSPEECH , pp. 577-580
- Richmond, K.¹

7
- 84878403872
- Deep architectures for articulatory inversion
- B. Uria, I. Murray, S. Renals, and K. Richmond, Deep architectures for articulatory inversion, in Proc. INTERSPEECH, 2012, pp. 867-870
- (2012) Proc. INTERSPEECH , pp. 867-870
- Uria, B.¹ Murray, I.² Renals, S.³ Richmond, K.⁴

8
- 0033708106
- Speech parameter generation algorithms for HMM-based speech synthesis
- K. Tokuda, T. Yoshimura, T. Masuko, T. Kobayashi, and T. Kitamura, Speech parameter generation algorithms for HMM-based speech synthesis, in Proc. ICASSP, 2000, pp. 1315-1318
- (2000) Proc. ICASSP , pp. 1315-1318
- Tokuda, K.¹ Yoshimura, T.² Masuko, T.³ Kobayashi, T.⁴ Kitamura, T.⁵

9
- 84890527090
- Multi-distribution deep belief network for speech synthesis
- S. Kang, X. Qian, and H. Meng, Multi-distribution deep belief network for speech synthesis, in Proc. ICASSP, 2013, pp. 8012-8016
- (2013) Proc. ICASSP , pp. 8012-8016
- Kang, S.¹ Qian, X.² Meng, H.³

10
- 84910030421
- Statistical parametric speech synthesis using weighted multi-distribution deep belief network
- S. Kang and H. Meng, Statistical parametric speech synthesis using weighted multi-distribution deep belief network, in Proc. INTERSPEECH, 2014, pp. 1959-1963
- (2014) Proc. INTERSPEECH , pp. 1959-1963
- Kang, S.¹ Meng, H.²

11
- 84890490547
- Statistical parametric speech synthesis using deep neural networks
- H. Zen, A. Senior, and M. Schuster, Statistical parametric speech synthesis using deep neural networks, in Proc. ICASSP, 2013, pp. 7962-7966
- (2013) Proc. ICASSP , pp. 7962-7966
- Zen, H.¹ Senior, A.² Schuster, M.³

12
- 84890447002
- Modeling spectral envelopes using restricted Boltzmann machines for statistical parametric speech synthesis
- Z. Ling, L. Deng, and D. Yu, Modeling spectral envelopes using restricted Boltzmann machines for statistical parametric speech synthesis, in Proc. ICASSP, 2013, pp. 7825-7829
- (2013) Proc. ICASSP , pp. 7825-7829
- Ling, Z.¹ Deng, L.² Yu, D.³

13
- 84910047819
- TTS synthesis with bidirectional LSTM based recurrent neural networks
- Y. Fan, Y. Qian, F. Xie, and F.K. Soong, TTS synthesis with bidirectional LSTM based recurrent neural networks, in Proc. INTERSPEECH, 2014, pp. 1964-1968
- (2014) Proc. INTERSPEECH , pp. 1964-1968
- Fan, Y.¹ Qian, Y.² Xie, F.³ Soong, F.K.⁴

14
- 0031268931
- Bidirectional recurrent neural networks
- M. Schuster and K.K. Paliwal, Bidirectional recurrent neural networks, IEEE Transactions on Signal Processing, pp. 2673-2681, 1997
- (1997) IEEE Transactions on Signal Processing , pp. 2673-2681
- Schuster, M.¹ Paliwal, K.K.²

15
- 0031573117
- Long short-term memory
- S. Hochreiter and J. Schmidhuber, Long short-term memory, Neural Computation, vol. 9, no. 8, pp. 1735-1780, 1997
- (1997) Neural Computation , vol.9 , Issue.8 , pp. 1735-1780
- Hochreiter, S.¹ Schmidhuber, J.²

16
- 0041965934
- Learning precise timing with LSTM recurrent networks
- F.A. Gers, N.N. Schraudolph, and J. Schmidhuber, Learning precise timing with LSTM recurrent networks, The Journal of Machine Learning Research, vol. 3, pp. 115-143, 2003
- (2003) The Journal of Machine Learning Research , vol.3 , pp. 115-143
- Gers, F.A.¹ Schraudolph, N.N.² Schmidhuber, J.³

17
- 0004113976
- Mixture density networks
- C.M. Bishop, Mixture density networks, Technical Report, Aston University, 1997
- (1997) Technical Report, Aston University
- Bishop, C.M.¹

18
- 84946086544
- T. Tieleman and G. Hinton, Lecture 6.5-rmsprop, coursera: Neural networks for machine learning, 2012
- (2012) Lecture 6.5-rmsprop, Coursera: Neural Networks for Machine Learning
- Tieleman, T.¹ Hinton, G.²

19
- 84898931970
- Training and analysing deep recurrent neural networks
- M. Hermans and B. Schrauwen, Training and analysing deep recurrent neural networks, in Advances in Neural Information Processing Systems, 2013, pp. 190-198
- (2013) Advances in Neural Information Processing Systems , pp. 190-198
- Hermans, M.¹ Schrauwen, B.²

20
- 84906979661
- Generating sequences with recurrent neural networks
- A. Graves, Generating sequences with recurrent neural networks, arXiv preprint arXiv:1308.0850, 2013
- (2013) arXiv preprint arXiv:1308.0850
- Graves, A.¹

21
- 84946058872
- A. Graves, http://sourceforge.net/projects/rnnl
- Graves, A.¹

22
- 84865778430
- Announcing the electromagnetic articulography (day 1) subset of the mngu0 articulatory corpus
- K. Richmond, P. Hoole, and S. King, Announcing the electromagnetic articulography (day 1) subset of the mngu0 articulatory corpus, in Proc. INTERSPEECH, 2009, pp. 1505-1508
- (2009) Proc. INTERSPEECH , pp. 1505-1508
- Richmond, K.¹ Hoole, P.² King, S.³

23
- 0026491198
- Electromagnetic midsagittal articulometer systems for transducing speech articulatory movements
- J.S. Perkell, M.H. Cohen, M.A. Svirsky, M.L. Matthies, I. Garabieta, and M.T. Jackson, Electromagnetic midsagittal articulometer systems for transducing speech articulatory movements, The Journal of the Acoustical Society of America, vol. 92, no. 6, pp. 3078-3096, 1992
- (1992) The Journal of the Acoustical Society of America , vol.92 , Issue.6 , pp. 3078-3096
- Perkell, J.S.¹ Cohen, M.H.² Svirsky, M.A.³ Matthies, M.L.⁴ Garabieta, I.⁵ Jackson, M.T.⁶

24
- 0019068177
- Linear prediction on a warped frequency scale
- H.W. Strube, Linear prediction on a warped frequency scale, The Journal of the Acoustical Society of America, vol. 68, no. 4, pp. 1071-1076, 1980
- (1980) The Journal of the Acoustical Society of America , vol.68 , Issue.4 , pp. 1071-1076
- Strube, H.W.¹

25
- 79959822106
- Adaptation of a tongue shape model by local feature transformations
- C. Qin, M.A. Carreira-Perpiñán, and M. Farhadloo, Adaptation of a tongue shape model by local feature transformations, in Proc. INTERSPEECH, 2010, pp. 1596-1599
- (2010) Proc. INTERSPEECH , pp. 1596-1599
- Qin, C.¹ Carreira-Perpiñán, M.A.² Farhadloo, M.³

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.