Volume 08-12-September-2016, 2016, Pages 385-389

Segmental recurrent neural networks for end-to-end speech recognition

Author keywords

End to end speech recognition; Recurrent neural networks; Segmental CRF

Indexed keywords

DECODING; FEATURE EXTRACTION; RANDOM PROCESSES; RECURRENT NEURAL NETWORKS; SPEECH; SPEECH COMMUNICATION; SPEECH PROCESSING

EID: 84994242299     PISSN: 2308457X     EISSN: 19909772     Source Type: Conference Proceeding    
DOI: 10.21437/Interspeech.2016-40     Document Type: Conference Paper
Times cited: 59

References (29)
  • 1
    • 84858952478
    • Don't multiply lightly: Quantifying problems with the acoustic model assumptions in speech recognition
    • IEEE
    • D. Gillick, L. Gillick, and S. Wegmann, "Don't multiply lightly: Quantifying problems with the acoustic model assumptions in speech recognition," in Proc. ASRU. IEEE, 2011, pp. 71-76.
    • (2011) Proc. ASRU, pp. 71-76
    • Gillick, D.; Gillick, L.; Wegmann, S.
  • 4
    • 33745185781
    • Hidden conditional random fields for phone classification
    • A. Gunawardana, M. Mahajan, A. Acero, and J. C. Platt, "Hidden conditional random fields for phone classification," in INTERSPEECH, 2005, pp. 1117-1120.
    • (2005) INTERSPEECH, pp. 1117-1120
    • Gunawardana, A.; Mahajan, M.; Acero, A.; Platt, J.C.
  • 6
    • 84936143793
    • Towards end-to-end speech recognition with recurrent neural networks
    • A. Graves and N. Jaitly, "Towards end-to-end speech recognition with recurrent neural networks," in Proc. ICML, 2014, pp. 1764-1772.
    • (2014) Proc. ICML, pp. 1764-1772
    • Graves, A.; Jaitly, N.
  • 8
    • 84959112739
    • Fast and accurate recurrent neural network acoustic models for speech recognition
    • H. Sak, A. Senior, K. Rao, and F. Beaufays, "Fast and accurate recurrent neural network acoustic models for speech recognition," in Proc. INTERSPEECH, 2015.
    • (2015) Proc. INTERSPEECH
    • Sak, H.; Senior, A.; Rao, K.; Beaufays, F.
  • 9
    • 84964489732
    • EESEN: End-to-end speech recognition using deep RNN models and WFST-based decoding
    • Y. Miao, M. Gowayyed, and F. Metze, "EESEN: End-to-end speech recognition using deep RNN models and WFST-based decoding," in Proc. ASRU, 2015.
    • (2015) Proc. ASRU
    • Miao, Y.; Gowayyed, M.; Metze, F.
  • 10
    • 85083953689
    • Neural machine translation by jointly learning to align and translate
    • D. Bahdanau, K. Cho, and Y. Bengio, "Neural machine translation by jointly learning to align and translate," in Proc. ICLR, 2015.
    • (2015) Proc. ICLR
    • Bahdanau, D.; Cho, K.; Bengio, Y.
  • 12
    • 84959173420
    • A study of the recurrent neural network encoder-decoder for large vocabulary speech recognition
    • L. Lu, X. Zhang, K. Cho, and S. Renals, "A study of the recurrent neural network encoder-decoder for large vocabulary speech recognition," in Proc. INTERSPEECH, 2015.
    • (2015) Proc. INTERSPEECH
    • Lu, L.; Zhang, X.; Cho, K.; Renals, S.
  • 16
    • 0142192295
    • Conditional random fields: Probabilistic models for segmenting and labeling sequence data
    • J. Lafferty, A. McCallum, and F. Pereira, "Conditional random fields: Probabilistic models for segmenting and labeling sequence data," in Proc. ICML, 2001, pp. 282-289.
    • (2001) Proc. ICML, pp. 282-289
    • Lafferty, J.; McCallum, A.; Pereira, F.
  • 17
    • 80051659716
    • Speech recognition with segmental conditional random fields: A summary of the JHU CLSP 2010 summer workshop
    • G. Zweig, P. Nguyen, D. Van Compernolle, K. Demuynck, L. Atlas, P. Clark et al., "Speech recognition with segmental conditional random fields: A summary of the JHU CLSP 2010 summer workshop," in Proc. ICASSP. IEEE, 2011, pp. 5044-5047.
    • (2011) Proc. ICASSP. IEEE, pp. 5044-5047
    • Zweig, G.; Nguyen, P.; Van Compernolle, D.; Demuynck, K.; Atlas, L.; Clark, P.
  • 18
    • 84876691724
    • Conditional random fields in speech, audio, and language processing
    • E. Fosler-Lussier, Y. He, P. Jyothi, and R. Prabhavalkar, "Conditional random fields in speech, audio, and language processing," Proceedings of the IEEE, vol. 101, no. 5, pp. 1054-1075, 2013.
    • (2013) Proceedings of the IEEE, vol. 101, no. 5, pp. 1054-1075
    • Fosler-Lussier, E.; He, Y.; Jyothi, P.; Prabhavalkar, R.
  • 19
    • 84906282118
    • Deep segmental neural networks for speech recognition
    • O. Abdel-Hamid, L. Deng, D. Yu, and H. Jiang, "Deep segmental neural networks for speech recognition," in Proc. INTERSPEECH, 2013, pp. 1849-1853.
    • (2013) Proc. INTERSPEECH, pp. 1849-1853
    • Abdel-Hamid, O.; Deng, L.; Yu, D.; Jiang, H.
  • 20
    • 84959175560
    • Segmental conditional random fields with deep neural networks as acoustic models for first-pass word recognition
    • Y. He and E. Fosler-Lussier, "Segmental conditional random fields with deep neural networks as acoustic models for first-pass word recognition," in Proc. INTERSPEECH, 2015.
    • (2015) Proc. INTERSPEECH
    • He, Y.; Fosler-Lussier, E.
  • 23
    • 0031573117
    • Long short-term memory
    • S. Hochreiter and J. Schmidhuber, "Long short-term memory," Neural Computation, vol. 9, no. 8, pp. 1735-1780, 1997.
    • (1997) Neural Computation, vol. 9, no. 8, pp. 1735-1780
    • Hochreiter, S.; Schmidhuber, J.
  • 26
    • 84867598637
    • Classification and recognition with direct segment models
    • IEEE
    • G. Zweig, "Classification and recognition with direct segment models," in Proc. ICASSP. IEEE, 2012, pp. 4161-4164.
    • (2012) Proc. ICASSP, pp. 4161-4164
    • Zweig, G.
  • 27
    • 84878565391
    • Efficient segmental conditional random fields for phone recognition
    • Y. He and E. Fosler-Lussier, "Efficient segmental conditional random fields for phone recognition," in Proc. INTERSPEECH, 2012, pp. 1898-1901.
    • (2012) Proc. INTERSPEECH, pp. 1898-1901
    • He, Y.; Fosler-Lussier, E.
  • 28
    • 84964454407
    • Discriminative segmental cascades for feature-rich phone recognition
    • H. Tang, W. Wang, K. Gimpel, and K. Livescu, "Discriminative segmental cascades for feature-rich phone recognition," in Proc. ASRU, 2015.
    • (2015) Proc. ASRU
    • Tang, H.; Wang, W.; Gimpel, K.; Livescu, K.
  • 29
    • 84890543083
    • Speech recognition with deep recurrent neural networks
    • IEEE
    • A. Graves, A.-R. Mohamed, and G. Hinton, "Speech recognition with deep recurrent neural networks," in Proc. ICASSP. IEEE, 2013, pp. 6645-6649.
    • (2013) Proc. ICASSP, pp. 6645-6649
    • Graves, A.; Mohamed, A.-R.; Hinton, G.


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.