SCOPUS Information Retrieval Platform

Volume , Issue , 2011, Pages 4860-4863

A multi-stream ASR framework for BLSTM modeling of conversational speech

Author keywords

Context Modeling; Conversational Speech Recognition; Long Short Term Memory; Recurrent Neural Networks

Indexed keywords

CONTEXT MODELING; CONVERSATIONAL SPEECH RECOGNITION; DATA STREAM; MULTI-STREAM; SHORT TERM MEMORY; TANDEM SYSTEM; TRIPHONES;

BRAIN; RECURRENT NEURAL NETWORKS; SIGNAL PROCESSING; SPEECH COMMUNICATION;

SPEECH RECOGNITION;

EID: 80051637579 PISSN: 15206149 EISSN: None Source Type: Conference Proceeding
DOI: 10.1109/ICASSP.2011.5947444 Document Type: Conference Paper

Times cited : (36)

References (17)

1
- 78651563436
- Bidirectional LSTM networks for context-sensitive keyword detection in a cognitive virtual agent framework
- M. Wöllmer, F. Eyben, A. Graves, B. Schuller, and G. Rigoll, "Bidirectional LSTM networks for context-sensitive keyword detection in a cognitive virtual agent framework," Cognitive Computation, vol. 2, no. 3, pp. 180-190, 2010.
- (2010) Cognitive Computation , vol.2 , Issue.3 , pp. 180-190
- Wöllmer, M.¹ Eyben, F.² Graves, A.³ Schuller, B.⁴ Rigoll, G.⁵

2
- 54349106040
- Switching linear dynamic systems for noise robust speech recognition
- B. Mesot and D. Barber, "Switching linear dynamic systems for noise robust speech recognition," IEEE Transactions on Audio, Speech, and Language Processing, vol. 15, no. 6, pp. 1850-1858, 2007.
- (2007) IEEE Transactions on Audio, Speech, and Language Processing , vol.15 , Issue.6 , pp. 1850-1858
- Mesot, B.¹ Barber, D.²

4
- 34547522358
- An acoustic model based on Kullback-Leibler divergence for posterior features
- G. Aradilla, J. Vepa, and H. Bourlard, "An acoustic model based on Kullback-Leibler divergence for posterior features," in Proc. of ICASSP, Honolulu, HI, 2007, pp. 657-660.
- Proc. of ICASSP, Honolulu, HI, 2007 , pp. 657-660
- Aradilla, G.¹ Vepa, J.² Bourlard, H.³

5
- 51449103447
- Optimizing bottle-neck features for LVCSR
- F. Grezl and P. Fousek, "Optimizing bottle-neck features for LVCSR," in Proc. of ICASSP, Las Vegas, NV, 2008, pp. 4729-4732.
- Proc. of ICASSP, Las Vegas, NV, 2008 , pp. 4729-4732
- Grezl, F.¹ Fousek, P.²

6
- 78049359820
- Spoken term detection with connectionist temporal classification - A novel hybrid CTC-DBN approach
- M. Wöllmer, F. Eyben, B. Schuller, and G. Rigoll, "Spoken term detection with connectionist temporal classification - a novel hybrid CTC-DBN approach," in Proc. of ICASSP, Dallas, Texas, 2010, pp. 5274-5277.
- Proc. of ICASSP, Dallas, Texas, 2010 , pp. 5274-5277
- Wöllmer, M.¹ Eyben, F.² Schuller, B.³ Rigoll, G.⁴

8
- 33745213373
- Multi-resolution RASTA filtering for TANDEM-based ASR
- H. Hermansky and P. Fousek, "Multi-resolution RASTA filtering for TANDEM-based ASR," in Proc. of European Conf. on Speech Communication and Technology, Lisbon, Portugal, 2005, pp. 361-364.
- Proc. of European Conf. on Speech Communication and Technology, Lisbon, Portugal, 2005 , pp. 361-364
- Hermansky, H.¹ Fousek, P.²

9
- 0031573117
- Long short-term memory
- S. Hochreiter and J. Schmidhuber, "Long short-term memory," Neural Computation, vol. 9, no. 8, pp. 1735-1780, 1997.
- (1997) Neural Computation , vol.9 , Issue.8 , pp. 1735-1780
- Hochreiter, S.¹ Schmidhuber, J.²

10
- 27744588611
- Framewise phoneme classification with bidirectional LSTM and other neural network architectures
- A. Graves and J. Schmidhuber, "Framewise phoneme classification with bidirectional LSTM and other neural network architectures," Neural Networks, vol. 18, no. 5-6, pp. 602-610, 2005.
- (2005) Neural Networks , vol.18 , Issue.5-6 , pp. 602-610
- Graves, A.¹ Schmidhuber, J.²

11
- 33749251046
- Bidirectional LSTM networks for improved phoneme classification and recognition
- A. Graves, S. Fernandez, and J. Schmidhuber, "Bidirectional LSTM networks for improved phoneme classification and recognition," in Proc. of ICANN, Warsaw, Poland, 2005, pp. 799-804.
- Proc. of ICANN, Warsaw, Poland, 2005 , pp. 799-804
- Graves, A.¹ Fernandez, S.² Schmidhuber, J.³

12
- 70349203870
- Robust discriminative keyword spotting for emotionally colored spontaneous speech using bidirectional LSTM networks
- M. Wöllmer, F. Eyben, J. Keshet, A. Graves, B. Schuller, and G. Rigoll, "Robust discriminative keyword spotting for emotionally colored spontaneous speech using bidirectional LSTM networks," in Proc. of ICASSP, Taipei, Taiwan, 2009.
- Proc. of ICASSP, Taipei, Taiwan, 2009
- Wöllmer, M.¹ Eyben, F.² Keshet, J.³ Graves, A.⁴ Schuller, B.⁵ Rigoll, G.⁶

13
- 38149014113
- An application of recurrent neural networks to discriminative keyword spotting
- S. Fernandez, A. Graves, and J. Schmidhuber, "An application of recurrent neural networks to discriminative keyword spotting," in Proc. of ICANN, Porto, Portugal, 2007, pp. 220-229.
- Proc. of ICANN, Porto, Portugal, 2007 , pp. 220-229
- Fernandez, S.¹ Graves, A.² Schmidhuber, J.³

14
- 79959821052
- Recognition of spontaneous conversational speech using long short-term memory phoneme predictions
- M. Wöllmer, F. Eyben, B. Schuller, and G. Rigoll, "Recognition of spontaneous conversational speech using long short-term memory phoneme predictions," in Proc. of Interspeech, Makuhari, Japan, 2010, pp. 1946-1949.
- Proc. of Interspeech, Makuhari, Japan, 2010 , pp. 1946-1949
- Wöllmer, M.¹ Eyben, F.² Schuller, B.³ Rigoll, G.⁴

15
- 70349199112
- COSINE - A corpus of multi-party conversational speech in noisy environments
- A. Stupakov, E. Hanusa, J. Bilmes, and D. Fox, "COSINE - a corpus of multi-party conversational speech in noisy environments," in Proc. of ICASSP, Taipei, Taiwan, 2009.
- Proc. of ICASSP, Taipei, Taiwan, 2009
- Stupakov, A.¹ Hanusa, E.² Bilmes, J.³ Fox, D.⁴

16
- 78650977476
- OpenSMILE - The Munich versatile and fast open-source audio feature extractor
- F. Eyben, M. Wöllmer, and B. Schuller, "openSMILE - the Munich versatile and fast open-source audio feature extractor," in Proc. of ACM Multimedia, Firenze, Italy, 2010.
- Proc. of ACM Multimedia, Firenze, Italy, 2010
- Eyben, F.¹ Wöllmer, M.² Schuller, B.³

17
- 77956721304
- Combining long short-term memory and dynamic Bayesian networks for incremental emotion-sensitive artificial listening
- M. Wöllmer, B. Schuller, F. Eyben, and G. Rigoll, "Combining long short-term memory and dynamic Bayesian networks for incremental emotion-sensitive artificial listening," IEEE Journal of Selected Topics in Signal Processing, vol. 4, no. 5, pp. 867-881, 2010.
- (2010) IEEE Journal of Selected Topics in Signal Processing , vol.4 , Issue.5 , pp. 867-881
- Wöllmer, M.¹ Schuller, B.² Eyben, F.³ Rigoll, G.⁴

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.