SCOPUS Information Search Platform

Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH

Volume , Issue , 2011, Pages 1233-1236

Feature frame stacking in RNN-based Tandem ASR systems - Learned vs. predefined context

(3) Wöllmer, Martin (a); Schuller, Björn (a); Rigoll, Gerhard (a)

(a) TECHNICAL UNIVERSITY OF MUNICH (Germany)

Author keywords

Automatic speech recognition; Context modeling; Long short term memory; Recurrent neural networks

Indexed keywords

AUTOMATIC SPEECH RECOGNITION; AUTOMATIC SPEECH RECOGNITION SYSTEM; CONTEXT MODELING; CONTEXTUAL INFORMATION; EMPIRICAL EVIDENCE; FEATURE LEVEL; FEATURE VECTORS; MULTI-LAYER PERCEPTRONS; MULTI-STREAM; PHONEME RECOGNITION; SHORT TERM MEMORY; TANDEM SYSTEM;

BRAIN; PATTERN RECOGNITION SYSTEMS; PROFITABILITY; RECURRENT NEURAL NETWORKS;

SPEECH RECOGNITION;

EID: 84865748400
PISSN: None
EISSN: 19909772
Source Type: Conference Proceeding
DOI: None
Document Type: Conference Paper

Times cited : (11)

References (15)

1
- 24144487688
- Tandem connectionist feature extraction for conversational speech recognition
- Springer
- Q. Zhu, B. Chen, N. Morgan, and A. Stolcke, "Tandem connectionist feature extraction for conversational speech recognition," in Machine Learning for Multimodal Interaction. Springer, 2005, pp. 223-231.
- (2005) Machine Learning for Multimodal Interaction , pp. 223-231
- Zhu, Q.¹ Chen, B.² Morgan, N.³ Stolcke, A.⁴

2
- 51449103447
- Optimizing bottle-neck features for LVCSR
- Las Vegas, NV
- F. Grezl and P. Fousek, "Optimizing bottle-neck features for LVCSR," in Proc. of ICASSP, Las Vegas, NV, 2008, pp. 4729-4732.
- (2008) Proc. of ICASSP , pp. 4729-4732
- Grezl, F.¹ Fousek, P.²

3
- 78651563436
- Bidirectional LSTM networks for context-sensitive keyword detection in a cognitive virtual agent framework
- M. Wöllmer, F. Eyben, A. Graves, B. Schuller, and G. Rigoll, "Bidirectional LSTM networks for context-sensitive keyword detection in a cognitive virtual agent framework," Cognitive Computation, vol. 2, no. 3, pp. 180-190, 2010.
- (2010) Cognitive Computation , vol.2 , Issue.3 , pp. 180-190
- Wöllmer, M.¹ Eyben, F.² Graves, A.³ Schuller, B.⁴ Rigoll, G.⁵

4
- 70349212558
- Phoneme recognition using spectral envelope and modulation frequency features
- Taipei, Taiwan
- S. Thomas, S. Ganapathy, and H. Hermansky, "Phoneme recognition using spectral envelope and modulation frequency features," in Proc. of ICASSP, Taipei, Taiwan, 2009, pp. 4453-4456.
- (2009) Proc. of ICASSP , pp. 4453-4456
- Thomas, S.¹ Ganapathy, S.² Hermansky, H.³

5
- 0031573117
- Long short-term memory
- S. Hochreiter and J. Schmidhuber, "Long short-term memory," Neural Computation, vol. 9, no. 8, pp. 1735-1780, 1997.
- (1997) Neural Computation , vol.9 , Issue.8 , pp. 1735-1780
- Hochreiter, S.¹ Schmidhuber, J.²

6
- 27744588611
- Framewise phoneme classification with bidirectional LSTM and other neural network architectures
- A. Graves and J. Schmidhuber, "Framewise phoneme classification with bidirectional LSTM and other neural network architectures," Neural Networks, vol. 18, no. 5-6, pp. 602-610, 2005.
- (2005) Neural Networks , vol.18 , Issue.5-6 , pp. 602-610
- Graves, A.¹ Schmidhuber, J.²

7
- 38149014113
- An application of recurrent neural networks to discriminative keyword spotting
- Porto, Portugal
- S. Fernandez, A. Graves, and J. Schmidhuber, "An application of recurrent neural networks to discriminative keyword spotting," in Proc. of ICANN, Porto, Portugal, 2007, pp. 220-229.
- (2007) Proc. of ICANN , pp. 220-229
- Fernandez, S.¹ Graves, A.² Schmidhuber, J.³

8
- 80051637579
- A multi-stream ASR framework for BLSTM modeling of conversational speech
- Prague, Czech Republic
- M. Wöllmer, F. Eyben, B. Schuller, and G. Rigoll, "A multi-stream ASR framework for BLSTM modeling of conversational speech," in Proc. of ICASSP, Prague, Czech Republic, 2011.
- (2011) Proc. of ICASSP
- Wöllmer, M.¹ Eyben, F.² Schuller, B.³ Rigoll, G.⁴

9
- 84865753113
- The design and collection of COSINE, a multi-microphone in situ speech corpus recorded in noisy environments
- to appear
- A. Stupakov, E. Hanusa, D. Vijaywargi, D. Fox, and J. Bilmes, "The design and collection of COSINE, a multi-microphone in situ speech corpus recorded in noisy environments," Computer Speech and Language, 2011, to appear.
- (2011) Computer Speech and Language
- Stupakov, A.¹ Hanusa, E.² Vijaywargi, D.³ Fox, D.⁴ Bilmes, J.⁵

10
- 0041914606
- Gradient flow in recurrent nets: The difficulty of learning long-term dependencies
- S. C. Kremer and J. F. Kolen, Eds. IEEE Press
- S. Hochreiter, Y. Bengio, P. Frasconi, and J. Schmidhuber, "Gradient flow in recurrent nets: the difficulty of learning long-term dependencies," in A Field Guide to Dynamical Recurrent Neural Networks, S. C. Kremer and J. F. Kolen, Eds. IEEE Press, 2001, pp. 1-15.
- (2001) A Field Guide to Dynamical Recurrent Neural Networks , pp. 1-15
- Hochreiter, S.¹ Bengio, Y.² Frasconi, P.³ Schmidhuber, J.⁴

11
- 77956721304
- Combining long short-term memory and dynamic Bayesian networks for incremental emotion-sensitive artificial listening
- M. Wöllmer, B. Schuller, F. Eyben, and G. Rigoll, "Combining long short-term memory and dynamic Bayesian networks for incremental emotion-sensitive artificial listening," IEEE Journal of Selected Topics in Signal Processing, vol. 4, no. 5, pp. 867-881, 2010.
- (2010) IEEE Journal of Selected Topics in Signal Processing , vol.4 , Issue.5 , pp. 867-881
- Wöllmer, M.¹ Schuller, B.² Eyben, F.³ Rigoll, G.⁴

12
- 85161980569
- Unconstrained online handwriting recognition with recurrent neural networks
- A. Graves, S. Fernandez, M. Liwicki, H. Bunke, and J. Schmidhuber, "Unconstrained online handwriting recognition with recurrent neural networks," Advances in Neural Information Processing Systems, vol. 20, pp. 1-8, 2008.
- (2008) Advances in Neural Information Processing Systems , vol.20 , pp. 1-8
- Graves, A.¹ Fernandez, S.² Liwicki, M.³ Bunke, H.⁴ Schmidhuber, J.⁵

13
- 0031268931
- Bidirectional recurrent neural networks
- M. Schuster and K. K. Paliwal, "Bidirectional recurrent neural networks," IEEE Transactions on Signal Processing, vol. 45, pp. 2673-2681, 1997.
- (1997) IEEE Transactions on Signal Processing , vol.45 , pp. 2673-2681
- Schuster, M.¹ Paliwal, K.K.²

14
- 78650977476
- openSMILE - The Munich versatile and fast open-source audio feature extractor
- Firenze, Italy
- F. Eyben, M. Wöllmer, and B. Schuller, "openSMILE - the Munich versatile and fast open-source audio feature extractor," in Proc. of ACM Multimedia, Firenze, Italy, 2010, pp. 1459-1462.
- (2010) Proc. of ACM Multimedia , pp. 1459-1462
- Eyben, F.¹ Wöllmer, M.² Schuller, B.³

15
- 79959821052
- Recognition of spontaneous conversational speech using long short-term memory phoneme predictions
- Makuhari, Japan
- M. Wöllmer, F. Eyben, B. Schuller, and G. Rigoll, "Recognition of spontaneous conversational speech using long short-term memory phoneme predictions," in Proc. of Interspeech, Makuhari, Japan, 2010, pp. 1946-1949.
- (2010) Proc. of Interspeech , pp. 1946-1949
- Wöllmer, M.¹ Eyben, F.² Schuller, B.³ Rigoll, G.⁴

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.