SCOPUS Information Search Platform

Proceedings of the 11th Annual Conference of the International Speech Communication Association, INTERSPEECH 2010

Volume , Issue , 2010, Pages 1946-1949

Recognition of spontaneous conversational speech using Long Short-Term Memory phoneme predictions

(4) Wöllmer, Martin (a); Eyben, Florian (a); Schuller, Björn (a); Rigoll, Gerhard (a)

(a) Technical University of Munich (Germany)

Author keywords

Context modeling; Large vocabulary continuous speech recognition; Long Short Term Memory; Recurrent neural networks

Indexed keywords

BRAIN; CONTINUOUS SPEECH RECOGNITION; FORECASTING; NETWORK ARCHITECTURE; RECURRENT NEURAL NETWORKS; SPEECH; SPEECH COMMUNICATION; VOCABULARY CONTROL;

CO-ARTICULATION; CONTEXT MODELING; CONVERSATIONAL SPEECH; LARGE CORPORA; LARGE VOCABULARY CONTINUOUS SPEECH RECOGNITION; PREDICTION ERROR RATES; WORD ACCURACIES; WORD RECOGNITION; CONTEXT MODELS; HUMAN SPEECH; MEMORY MODELING; MEMORY NETWORK; SYSTEM USE; TIME-PERIODS; TRIPHONES;

LONG SHORT-TERM MEMORY;

EID: 79959821052; PISSN: None; EISSN: None; Source Type: Conference Proceeding
DOI: None; Document Type: Conference Paper

Times cited : (19)

References (23)

1
- 0031222490
- MMIE training of large vocabulary recognition systems
- PII S0167639397000290
- V. Valtchev, J. J. Odell, P. C. Woodland, and S. J. Young, "MMIE training of large vocabulary recognition systems," Speech Communication, vol. 22, no. 4, pp. 303-314, 1997. (Pubitemid 127433601)
- (1997) Speech Communication , vol.22 , Issue.4 , pp. 303-314
- Valtchev, V.¹ Odell, J.J.² Woodland, P.C.³ Young, S.J.⁴

2
- 0141591620
- Recent improvements in the CU SONIC ASR system for noisy speech: The SPINE task
- Hong Kong
- B. Pellom and K. Hacioglu, "Recent improvements in the CU SONIC ASR system for noisy speech: the SPINE task," in Proc. of ICASSP, Hong Kong, 2003.
- (2003) Proc. of ICASSP
- Pellom, B.¹ Hacioglu, K.²

3
- 48249106592
- Static and dynamic modelling for the recognition of non-verbal vocalisations in conversational speech
- Kloster Irsee, Germany
- B. Schuller, F. Eyben, and G. Rigoll, "Static and dynamic modelling for the recognition of non-verbal vocalisations in conversational speech," in Proc. of PIT, Kloster Irsee, Germany, 2008, pp. 99-110.
- (2008) Proc. of PIT , pp. 99-110
- Schuller, B.¹ Eyben, F.² Rigoll, G.³

4
- 54349106040
- Switching linear dynamic systems for noise robust speech recognition
- B. Mesot and D. Barber, "Switching linear dynamic systems for noise robust speech recognition," IEEE Transactions on Audio, Speech, and Language Processing, vol. 15, no. 6, pp. 1850-1858, 2007.
- (2007) IEEE Transactions on Audio, Speech, and Language Processing , vol.15 , Issue.6 , pp. 1850-1858
- Mesot, B.¹ Barber, D.²

5
- 70450180507
- Robust in-car spelling recognition - A tandem BLSTM-HMM approach
- Brighton, UK
- M. Wöllmer, F. Eyben, B. Schuller, Y. Sun, T. Moosmayr, and N. Nguyen-Thien, "Robust in-car spelling recognition - a tandem BLSTM-HMM approach," in Proc. of Interspeech, Brighton, UK, 2009.
- (2009) Proc. of Interspeech
- Wöllmer, M.¹ Eyben, F.² Schuller, B.³ Sun, Y.⁴ Moosmayr, T.⁵ Nguyen-Thien, N.⁶

6
- 67650135931
- Recognition of noisy speech: A comparative survey of robust model architecture and feature enhancement
- ID 942617
- B. Schuller, M. Wöllmer, T. Moosmayr, and G. Rigoll, "Recognition of noisy speech: A comparative survey of robust model architecture and feature enhancement," Journal on Audio, Speech, and Music Processing, 2009, ID 942617.
- (2009) Journal on Audio, Speech, and Music Processing
- Schuller, B.¹ Wöllmer, M.² Moosmayr, T.³ Rigoll, G.⁴

7
- 0033709098
- Tandem connectionist feature extraction for conventional HMM systems
- Istanbul, Turkey
- H. Hermansky, D. P. W. Ellis, and S. Sharma, "Tandem connectionist feature extraction for conventional HMM systems," in Proc. of ICASSP, vol. 3, Istanbul, Turkey, 2000, pp. 1635-1638.
- (2000) Proc. of ICASSP , vol.3 , pp. 1635-1638
- Hermansky, H.¹ Ellis, D.P.W.² Sharma, S.³

8
- 70450166492
- Enhanced phone posteriors for improving speech recognition systems
- H. Ketabdar and H. Bourlard, "Enhanced phone posteriors for improving speech recognition systems," in IDIAP-RR, no. 39, 2008.
- (2008) IDIAP-RR , Issue.39
- Ketabdar, H.¹ Bourlard, H.²

9
- 0034848926
- Tandem acoustic modeling in large-vocabulary recognition
- D. P. W. Ellis, R. Singh, and S. Sivadas, "Tandem acoustic modeling in large-vocabulary recognition," in Proc. of ICASSP, Salt Lake City, UT, USA, 2001, pp. 517-520. (Pubitemid 32839300)
- (2001) ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings , vol.1 , pp. 517-520
- Ellis, D.P.W.¹ Singh, R.² Sivadas, S.³

10
- 0041914606
- Gradient flow in recurrent nets: The difficulty of learning long-term dependencies
- S. C. Kremer and J. F. Kolen, Eds. IEEE Press
- S. Hochreiter, Y. Bengio, P. Frasconi, and J. Schmidhuber, "Gradient flow in recurrent nets: the difficulty of learning long-term dependencies," in A Field Guide to Dynamical Recurrent Neural Networks, S. C. Kremer and J. F. Kolen, Eds. IEEE Press, 2001.
- (2001) A Field Guide to Dynamical Recurrent Neural Networks
- Hochreiter, S.¹ Bengio, Y.² Frasconi, P.³ Schmidhuber, J.⁴

11
- 33745213373
- Multi-resolution RASTA filtering for TANDEM-based ASR
- 9th European Conference on Speech Communication and Technology, Eurospeech Interspeech
- H. Hermansky and P. Fousek, "Multi-resolution RASTA filtering for TANDEM-based ASR," in Proc. of European Conf. on Speech Communication and Technology, Lisbon, Portugal, 2005, pp. 361-364. (Pubitemid 43908074)
- (2005) 9th European Conference on Speech Communication and Technology , pp. 361-364
- Hermansky, H.¹ Fousek, P.²

12
- 56449109755
- Learning long-term dependencies with recurrent neural networks
- A. M. Schaefer, S. Udluft, and H. G. Zimmermann, "Learning long-term dependencies with recurrent neural networks," Neurocomputing, vol. 71, no. 13-15, pp. 2481-2488, 2008.
- (2008) Neurocomputing , vol.71 , Issue.13-15 , pp. 2481-2488
- Schaefer, A.M.¹ Udluft, S.² Zimmermann, H.G.³

13
- 1842436050
- The echo state approach to analyzing and training recurrent neural networks
- H. Jaeger, "The echo state approach to analyzing and training recurrent neural networks," Bremen: German National Research Center for Information Technology, Tech. Rep., 2001.
- (2001) Bremen: German National Research Center for Information Technology, Tech. Rep.
- Jaeger, H.¹

14
- 0031573117
- Long Short-Term Memory
- S. Hochreiter and J. Schmidhuber, "Long short-term memory," Neural Computation, vol. 9, no. 8, pp. 1735-1780, 1997. (Pubitemid 127462305)
- (1997) Neural Computation , vol.9 , Issue.8 , pp. 1735-1780
- Hochreiter, S.¹ Schmidhuber, J.²

15
- 27744588611
- Framewise phoneme classification with bidirectional LSTM and other neural network architectures
- DOI 10.1016/j.neunet.2005.06.042, PII S0893608005001206
- A. Graves and J. Schmidhuber, "Framewise phoneme classification with bidirectional LSTM and other neural network architectures," Neural Networks, vol. 18, no. 5-6, pp. 602-610, 2005. (Pubitemid 43186580)
- (2005) Neural Networks , vol.18 , Issue.5-6 , pp. 602-610
- Graves, A.¹ Schmidhuber, J.²

16
- Bidirectional LSTM networks for improved phoneme classification and recognition
- Warsaw, Poland
- A. Graves, S. Fernandez, and J. Schmidhuber, "Bidirectional LSTM networks for improved phoneme classification and recognition," in Proc. of ICANN, Warsaw, Poland, 2005, pp. 602-610.
- (2005) Proc. of ICANN , pp. 602-610
- Graves, A.¹ Fernandez, S.² Schmidhuber, J.³

17
- 70349203870
- Robust discriminative keyword spotting for emotionally colored spontaneous speech using bidirectional LSTM networks
- Taipei, Taiwan
- M. Wöllmer, F. Eyben, J. Keshet, A. Graves, B. Schuller, and G. Rigoll, "Robust discriminative keyword spotting for emotionally colored spontaneous speech using bidirectional LSTM networks," in Proc. of ICASSP, Taipei, Taiwan, 2009.
- (2009) Proc. of ICASSP
- Wöllmer, M.¹ Eyben, F.² Keshet, J.³ Graves, A.⁴ Schuller, B.⁵ Rigoll, G.⁶

18
- 77949372271
- A tandem BLSTM-DBN architecture for keyword spotting with enhanced context modeling
- Vic, Spain
- M. Wöllmer, F. Eyben, A. Graves, B. Schuller, and G. Rigoll, "A Tandem BLSTM-DBN architecture for keyword spotting with enhanced context modeling," in Proc. of NOLISP 2009, Vic, Spain, 2009.
- (2009) Proc. of NOLISP 2009
- Wöllmer, M.¹ Eyben, F.² Graves, A.³ Schuller, B.⁴ Rigoll, G.⁵

19
- 38149014113
- An application of recurrent neural networks to discriminative keyword spotting
- Porto, Portugal
- S. Fernandez, A. Graves, and J. Schmidhuber, "An application of recurrent neural networks to discriminative keyword spotting," in Proc. of ICANN, Porto, Portugal, 2007, pp. 220-229.
- (2007) Proc. of ICANN , pp. 220-229
- Fernandez, S.¹ Graves, A.² Schmidhuber, J.³

20
- 78651563436
- Bidirectional LSTM networks for context-sensitive keyword detection in a cognitive virtual agent framework
- M. Wöllmer, F. Eyben, A. Graves, B. Schuller, and G. Rigoll, "Bidirectional LSTM networks for context-sensitive keyword detection in a cognitive virtual agent framework," Cognitive Computation, Special Issue on Non-Linear and Non-Conventional Speech Processing, 2010.
- (2010) Cognitive Computation, Special Issue on Non-linear and Non-conventional Speech Processing
- Wöllmer, M.¹ Eyben, F.² Graves, A.³ Schuller, B.⁴ Rigoll, G.⁵

21
- 70349199112
- COSINE - A corpus of multi-party conversational speech in noisy environments
- Taipei, Taiwan
- A. Stupakov, E. Hanusa, J. Bilmes, and D. Fox, "COSINE - a corpus of multi-party conversational speech in noisy environments," in Proc. of ICASSP, Taipei, Taiwan, 2009.
- (2009) Proc. of ICASSP
- Stupakov, A.¹ Hanusa, E.² Bilmes, J.³ Fox, D.⁴

22
- 0031268931
- Bidirectional recurrent neural networks
- PII S1053587X97080550
- M. Schuster and K. K. Paliwal, "Bidirectional recurrent neural networks," IEEE Transactions on Signal Processing, vol. 45, pp. 2673-2681, November 1997. (Pubitemid 127766336)
- (1997) IEEE Transactions on Signal Processing , vol.45 , Issue.11 , pp. 2673-2681
- Schuster, M.¹ Paliwal, K.K.²

23
- 77956721304
- Combining long short-term memory and dynamic Bayesian networks for incremental emotion-sensitive artificial listening
- M. Wöllmer, B. Schuller, F. Eyben, and G. Rigoll, "Combining long short-term memory and dynamic bayesian networks for incremental emotion-sensitive artificial listening," IEEE Journal of Selected Topics in Signal Processing, Special Issue on Speech Processing for Natural Interaction with Intelligent Environments, 2010.
- (2010) IEEE Journal of Selected Topics in Signal Processing, Special Issue on Speech Processing for Natural Interaction with Intelligent Environments
- Wöllmer, M.¹ Schuller, B.² Eyben, F.³ Rigoll, G.⁴

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.