Volume 2017-August, 2017, Pages 3986-3990

Evaluation of a silent speech interface based on magnetic sensing and deep learning for a phonetically rich vocabulary

Author keywords

Articulatory to acoustic mapping; Recurrent neural network; Speech rehabilitation; Speech synthesis

Indexed keywords

DEEP LEARNING; DEEP NEURAL NETWORKS; MAPPING; METADATA; QUALITY CONTROL; RECURRENT NEURAL NETWORKS; SPEECH; SPEECH COMMUNICATION; SPEECH RECOGNITION; SPEECH SYNTHESIS

EID: 85039155335     PISSN: 2308457X     EISSN: 19909772     Source Type: Conference Proceeding    
DOI: 10.21437/Interspeech.2017-802     Document Type: Conference Paper
Times cited: 18

References (34)
  • 1
    • 42949175762
    • Development of a (silent) speech recognition system for patients following laryngectomy
    • M. J. Fagan, S. R. Ell, J. M. Gilbert, E. Sarrazin, and P. M. Chapman, "Development of a (silent) speech recognition system for patients following laryngectomy," Med. Eng. Phys., vol. 30, no. 4, pp. 419-425, 2008.
    • (2008) Med. Eng. Phys. , vol.30 , Issue.4 , pp. 419-425
    • Fagan, M.J.1    Ell, S.R.2    Gilbert, J.M.3    Sarrazin, E.4    Chapman, P.M.5
  • 2
    • 78449253410
    • Isolated word recognition of silent speech using magnetic implants and sensors
    • J. M. Gilbert, S. I. Rybchenko, R. Hofe, S. R. Ell, M. J. Fagan, R. K. Moore, and P. Green, "Isolated word recognition of silent speech using magnetic implants and sensors," Med. Eng. Phys., vol. 32, no. 10, pp. 1189-1197, 2010.
    • (2010) Med. Eng. Phys. , vol.32 , Issue.10 , pp. 1189-1197
    • Gilbert, J.M.1    Rybchenko, S.I.2    Hofe, R.3    Ell, S.R.4    Fagan, M.J.5    Moore, R.K.6    Green, P.7
  • 3
    • 84870292488
    • Small-vocabulary speech recognition using a silent speech interface based on magnetic sensing
    • R. Hofe, S. R. Ell, M. J. Fagan, J. M. Gilbert, P. D. Green, R. K. Moore, and S. I. Rybchenko, "Small-vocabulary speech recognition using a silent speech interface based on magnetic sensing," Speech Commun., vol. 55, no. 1, pp. 22-32, 2013.
    • (2013) Speech Commun. , vol.55 , Issue.1 , pp. 22-32
    • Hofe, R.1    Ell, S.R.2    Fagan, M.J.3    Gilbert, J.M.4    Green, P.D.5    Moore, R.K.6    Rybchenko, S.I.7
  • 5
    • 85016157373
    • Restoring speech following total removal of the larynx by a learned transformation from sensor data to acoustics
    • J. M. Gilbert, J. A. Gonzalez, L. A. Cheah, S. R. Ell, P. Green, R. K. Moore, and E. Holdsworth, "Restoring speech following total removal of the larynx by a learned transformation from sensor data to acoustics," J. Acoust. Soc. Am., vol. 141, no. 3, pp. EL307-EL313, 2017.
    • (2017) J. Acoust. Soc. Am. , vol.141 , Issue.3 , pp. EL307-EL313
    • Gilbert, J.M.1    Gonzalez, J.A.2    Cheah, L.A.3    Ell, S.R.4    Green, P.5    Moore, R.K.6    Holdsworth, E.7
  • 6
    • 84930630277
    • Deep learning
    • Y. LeCun, Y. Bengio, and G. Hinton, "Deep learning," Nature, vol. 521, no. 7553, pp. 436-444, May 2015.
    • (2015) Nature , vol.521 , Issue.7553 , pp. 436-444
    • LeCun, Y.1    Bengio, Y.2    Hinton, G.3
  • 7
    • 84890543083
    • Speech recognition with deep recurrent neural networks
    • A. Graves, A.-r. Mohamed, and G. Hinton, "Speech recognition with deep recurrent neural networks," in Proc. ICASSP, 2013, pp. 6645-6649.
    • (2013) Proc. ICASSP , pp. 6645-6649
    • Graves, A.1    Mohamed, A.-R.2    Hinton, G.3
  • 8
    • 84910047819
    • TTS synthesis with bidirectional LSTM based recurrent neural networks
    • Y. Fan, Y. Qian, F.-L. Xie, and F. K. Soong, "TTS synthesis with bidirectional LSTM based recurrent neural networks," in Proc. Interspeech, 2014, pp. 1964-1968.
    • (2014) Proc. Interspeech , pp. 1964-1968
    • Fan, Y.1    Qian, Y.2    Xie, F.-L.3    Soong, F.K.4
  • 9
    • 84946045510
    • Unidirectional long short-term memory recurrent neural network with recurrent output layer for low-latency speech synthesis
    • H. Zen and H. Sak, "Unidirectional long short-term memory recurrent neural network with recurrent output layer for low-latency speech synthesis," in Proc. ICASSP, 2015, pp. 4470-4474.
    • (2015) Proc. ICASSP , pp. 4470-4474
    • Zen, H.1    Sak, H.2
  • 10
    • 84946027999
    • Voice conversion using deep bidirectional long short-term memory based recurrent neural networks
    • L. Sun, S. Kang, K. Li, and H. Meng, "Voice conversion using deep bidirectional long short-term memory based recurrent neural networks," in Proc. ICASSP, 2015, pp. 4869-4873.
    • (2015) Proc. ICASSP , pp. 4869-4873
    • Sun, L.1    Kang, S.2    Li, K.3    Meng, H.4
  • 13
    • 84949568676
    • Data driven articulatory synthesis with deep neural networks
    • S. Aryal and R. Gutierrez-Osuna, "Data driven articulatory synthesis with deep neural networks," Comput. Speech Lang., vol. 36, pp. 260-273, 2016.
    • (2016) Comput. Speech Lang. , vol.36 , pp. 260-273
    • Aryal, S.1    Gutierrez-Osuna, R.2
  • 14
    • 0000877063
    • Delayed auditory feedback
    • A. J. Yates, "Delayed auditory feedback," Psychological bulletin, vol. 60, no. 3, p. 213, 1963.
    • (1963) Psychological Bulletin , vol.60 , Issue.3 , p. 213
    • Yates, A.J.1
  • 15
    • 0036096888
    • Effect of delayed auditory feedback on normal speakers at two speech rates
    • A. Stuart, J. Kalinowski, M. P. Rastatter, and K. Lynch, "Effect of delayed auditory feedback on normal speakers at two speech rates," J. Acoust. Soc. Am., vol. 111, no. 5, pp. 2237-2241, 2002.
    • (2002) J. Acoust. Soc. Am. , vol.111 , Issue.5 , pp. 2237-2241
    • Stuart, A.1    Kalinowski, J.2    Rastatter, M.P.3    Lynch, K.4
  • 16
    • 0031573117
    • Long short-term memory
    • S. Hochreiter and J. Schmidhuber, "Long short-term memory," Neural computation, vol. 9, no. 8, pp. 1735-1780, 1997.
    • (1997) Neural Computation , vol.9 , Issue.8 , pp. 1735-1780
    • Hochreiter, S.1    Schmidhuber, J.2
  • 18
    • 0025503558
    • Backpropagation through time: What it does and how to do it
    • P. J. Werbos, "Backpropagation through time: what it does and how to do it," Proceedings of the IEEE, vol. 78, no. 10, pp. 1550-1560, 1990.
    • (1990) Proceedings of the IEEE , vol.78 , Issue.10 , pp. 1550-1560
    • Werbos, P.J.1
  • 20
    • 0032673049
    • Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based f0 extraction: Possible role of a repetitive structure in sounds
    • H. Kawahara, I. Masuda-Katsuse, and A. De Cheveigne, "Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based f0 extraction: Possible role of a repetitive structure in sounds," Speech communication, vol. 27, no. 3, pp. 187-207, Apr. 1999.
    • (1999) Speech Communication , vol.27 , Issue.3 , pp. 187-207
    • Kawahara, H.1    Masuda-Katsuse, I.2    De Cheveigne, A.3
  • 21
    • 85016140477
    • An adaptive algorithm for Mel-cepstral analysis of speech
    • T. Fukada, K. Tokuda, T. Kobayashi, and S. Imai, "An adaptive algorithm for Mel-cepstral analysis of speech," in Proc. ICASSP, 1992, pp. 137-140.
    • (1992) Proc. ICASSP , pp. 137-140
    • Fukada, T.1    Tokuda, K.2    Kobayashi, T.3    Imai, S.4
  • 22
    • 0027530250
    • SIMPLS: An alternative approach to partial least squares regression
    • S. De Jong, "SIMPLS: an alternative approach to partial least squares regression," Chemometrics Intell. Lab. Syst., vol. 18, no. 3, pp. 251-263, 1993.
    • (1993) Chemometrics Intell. Lab. Syst. , vol.18 , Issue.3 , pp. 251-263
    • De Jong, S.1
  • 25
    • 57749193836
    • Voice conversion based on maximum-likelihood estimation of spectral parameter trajectory
    • T. Toda, A. W. Black, and K. Tokuda, "Voice conversion based on maximum-likelihood estimation of spectral parameter trajectory," IEEE Trans. Audio Speech Lang. Process., vol. 15, no. 8, pp. 2222-2235, Nov. 2007.
    • (2007) IEEE Trans. Audio Speech Lang. Process. , vol.15 , Issue.8 , pp. 2222-2235
    • Toda, T.1    Black, A.W.2    Tokuda, K.3
  • 26
    • 38649140222
    • Statistical mapping between articulatory movements and acoustic spectrum using a Gaussian mixture model
    • T. Toda, A. W. Black, and K. Tokuda, "Statistical mapping between articulatory movements and acoustic spectrum using a Gaussian mixture model," Speech Commun., vol. 50, no. 3, pp. 215-227, Mar. 2008.
    • (2008) Speech Commun. , vol.50 , Issue.3 , pp. 215-227
    • Toda, T.1    Black, A.W.2    Tokuda, K.3
  • 27
    • 84999828343
    • Real-time control of an articulatory-based speech synthesizer for brain computer interfaces
    • F. Bocquelet, T. Hueber, L. Girin, C. Savariaux, and B. Yvert, "Real-time control of an articulatory-based speech synthesizer for brain computer interfaces," PLOS Computational Biology, vol. 12, no. 11, p. e1005119, 2016.
    • (2016) PLOS Computational Biology , vol.12 , Issue.11 , p. e1005119
    • Bocquelet, F.1    Hueber, T.2    Girin, L.3    Savariaux, C.4    Yvert, B.5
  • 28
    • 0033708106
    • Speech parameter generation algorithms for HMM-based speech synthesis
    • K. Tokuda, T. Yoshimura, T. Masuko, T. Kobayashi, and T. Kitamura, "Speech parameter generation algorithms for HMM-based speech synthesis," in Proc. ICASSP, 2000, pp. 1315-1318.
    • (2000) Proc. ICASSP , pp. 1315-1318
    • Tokuda, K.1    Yoshimura, T.2    Masuko, T.3    Kobayashi, T.4    Kitamura, T.5
  • 29
    • 84890490547
    • Statistical parametric speech synthesis using deep neural networks
    • H. Zen, A. Senior, and M. Schuster, "Statistical parametric speech synthesis using deep neural networks," in Proc. ICASSP. IEEE, 2013, pp. 7962-7966.
    • (2013) Proc. ICASSP , pp. 7962-7966
    • Zen, H.1    Senior, A.2    Schuster, M.3
  • 30
    • 84946074523
    • The effect of neural networks in statistical parametric speech synthesis
    • K. Hashimoto, K. Oura, Y. Nankaku, and K. Tokuda, "The effect of neural networks in statistical parametric speech synthesis," in Proc. ICASSP, 2015, pp. 4455-4459.
    • (2015) Proc. ICASSP , pp. 4455-4459
    • Hashimoto, K.1    Oura, K.2    Nankaku, Y.3    Tokuda, K.4
  • 31
    • 0031268931
    • Bidirectional recurrent neural networks
    • M. Schuster and K. K. Paliwal, "Bidirectional recurrent neural networks," IEEE Trans. Signal Process., vol. 45, no. 11, pp. 2673-2681, 1997.
    • (1997) IEEE Trans. Signal Process. , vol.45 , Issue.11 , pp. 2673-2681
    • Schuster, M.1    Paliwal, K.K.2
  • 32
    • 27744588611
    • Framewise phoneme classification with bidirectional LSTM and other neural network architectures
    • A. Graves and J. Schmidhuber, "Framewise phoneme classification with bidirectional LSTM and other neural network architectures," Neural Networks, vol. 18, no. 5, pp. 602-610, 2005.
    • (2005) Neural Networks , vol.18 , Issue.5 , pp. 602-610
    • Graves, A.1    Schmidhuber, J.2
  • 34
    • 84910067727
    • Analysis of phonetic similarity in a silent speech interface based on permanent magnetic articulography
    • J. A. Gonzalez, L. A. Cheah, J. Bai, S. R. Ell, J. M. Gilbert, R. K. Moore, and P. D. Green, "Analysis of phonetic similarity in a silent speech interface based on permanent magnetic articulography," in Proc. Interspeech, 2014, pp. 1018-1022.
    • (2014) Proc. Interspeech , pp. 1018-1022
    • Gonzalez, J.A.1    Cheah, L.A.2    Bai, J.3    Ell, S.R.4    Gilbert, J.M.5    Moore, R.K.6    Green, P.D.7


* This information was extracted and analyzed by KISTI from Elsevier's SCOPUS database.