SCOPUS Information Retrieval Platform

2011 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2011, Proceedings

Volume , Issue , 2011, Pages 36-41

A novel bottleneck-BLSTM front-end for feature-level context modeling in conversational speech recognition

(3) Wöllmer, Martin (a); Schuller, Björn (a); Rigoll, Gerhard (a)

a TECHNICAL UNIVERSITY OF MUNICH (Germany)

Author keywords

[No Author keywords available]

Indexed keywords

AUTOMATIC SPEECH RECOGNITION; CONTEXT MODELING; CONVERSATIONAL SPEECH RECOGNITION; FEATURE GENERATION; FEATURE LEVEL; FEATURE VECTORS; MULTI-LAYER PERCEPTRONS; MULTI-STREAM; NETWORK TRAINING; PHONEME RECOGNITION; RECURRENT NETWORKS; SHORT TERM MEMORY; SPEECH FEATURES;

BRAIN; FEATURE EXTRACTION; PATTERN RECOGNITION SYSTEMS; SPEECH PROCESSING;

SPEECH RECOGNITION;

EID: 84858961864 PISSN: None EISSN: None Source Type: Conference Proceeding
DOI: 10.1109/ASRU.2011.6163902 Document Type: Conference Paper

Times cited : (11)

References (17)

1
- 0033709098
- Tandem connectionist feature extraction for conventional HMM systems
- Istanbul, Turkey
- H. Hermansky, D. P. W. Ellis, and S. Sharma, "Tandem connectionist feature extraction for conventional HMM systems," in Proc. of ICASSP, Istanbul, Turkey, 2000, pp. 1635-1638.
- (2000) Proc. of ICASSP , pp. 1635-1638
- Hermansky, H.¹ Ellis, D.P.W.² Sharma, S.³

2
- 34547548235
- Probabilistic and bottle-neck features for LVCSR of meetings
- F. Grezl, M. Karafiat, K. Stanislav, and J. Cernocky, "Probabilistic and bottle-neck features for LVCSR of meetings," in Proc. of ICASSP, 2007.
- (2007) Proc. of ICASSP
- Grezl, F.¹ Karafiat, M.² Stanislav, K.³ Cernocky, J.⁴

3
- 77949350062
- Robust vocabulary independent keyword spotting with graphical models
- Merano, Italy
- M. Wöllmer, F. Eyben, B. Schuller, and G. Rigoll, "Robust vocabulary independent keyword spotting with graphical models," in Proc. of ASRU, Merano, Italy, 2009, pp. 349-353.
- (2009) Proc. of ASRU , pp. 349-353
- Wöllmer, M.¹ Eyben, F.² Schuller, B.³ Rigoll, G.⁴

4
- 70349212558
- Phoneme recognition using spectral envelope and modulation frequency features
- Taipei, Taiwan
- S. Thomas, S. Ganapathy, and H. Hermansky, "Phoneme recognition using spectral envelope and modulation frequency features," in Proc. of ICASSP, Taipei, Taiwan, 2009, pp. 4453-4456.
- (2009) Proc. of ICASSP , pp. 4453-4456
- Thomas, S.¹ Ganapathy, S.² Hermansky, H.³

5
- 27744588611
- Framewise phoneme classification with bidirectional LSTM and other neural network architectures
- DOI 10.1016/j.neunet.2005.06.042, PII S0893608005001206
- A. Graves and J. Schmidhuber, "Framewise phoneme classification with bidirectional LSTM and other neural network architectures," Neural Networks, vol. 18, no. 5-6, pp. 602-610, 2005. (Pubitemid 43186580)
- (2005) Neural Networks , vol.18 , Issue.5-6 , pp. 602-610
- Graves, A.¹ Schmidhuber, J.²

6
- 0031573117
- Long short-term memory
- S. Hochreiter and J. Schmidhuber, "Long short-term memory," Neural Computation, vol. 9, no. 8, pp. 1735-1780, 1997. (Pubitemid 127462305)
- (1997) Neural Computation , vol.9 , Issue.8 , pp. 1735-1780
- Hochreiter, S.¹ Schmidhuber, J.²

7
- 78651563436
- Bidirectional LSTM networks for context-sensitive keyword detection in a cognitive virtual agent framework
- M. Wöllmer, F. Eyben, A. Graves, B. Schuller, and G. Rigoll, "Bidirectional LSTM networks for context-sensitive keyword detection in a cognitive virtual agent framework," Cognitive Computation, vol. 2, no. 3, pp. 180-190, 2010.
- (2010) Cognitive Computation , vol.2 , Issue.3 , pp. 180-190
- Wöllmer, M.¹ Eyben, F.² Graves, A.³ Schuller, B.⁴ Rigoll, G.⁵

8
- 79959821052
- Recognition of spontaneous conversational speech using long short-term memory phoneme predictions
- Makuhari, Japan
- M. Wöllmer, F. Eyben, B. Schuller, and G. Rigoll, "Recognition of spontaneous conversational speech using long short-term memory phoneme predictions," in Proc. of Interspeech, Makuhari, Japan, 2010, pp. 1946-1949.
- (2010) Proc. of Interspeech , pp. 1946-1949
- Wöllmer, M.¹ Eyben, F.² Schuller, B.³ Rigoll, G.⁴

9
- 80051637579
- A multi-stream ASR framework for BLSTM modeling of conversational speech
- Prague, Czech Republic
- M. Wöllmer, F. Eyben, B. Schuller, and G. Rigoll, "A multi-stream ASR framework for BLSTM modeling of conversational speech," in Proc. of ICASSP, Prague, Czech Republic, 2011, pp. 4860-4863.
- (2011) Proc. of ICASSP , pp. 4860-4863
- Wöllmer, M.¹ Eyben, F.² Schuller, B.³ Rigoll, G.⁴

10
- 84865748400
- Feature frame stacking in RNN-based tandem ASR systems - Learned vs. predefined context
- Florence, Italy
- M. Wöllmer, B. Schuller, and G. Rigoll, "Feature frame stacking in RNN-based Tandem ASR systems - learned vs. predefined context," in Proc. of Interspeech, Florence, Italy, 2011.
- (2011) Proc. of Interspeech
- Wöllmer, M.¹ Schuller, B.² Rigoll, G.³

11
- 0041914606
- Gradient flow in recurrent nets: The difficulty of learning long-term dependencies
- S. C. Kremer and J. F. Kolen, Eds. IEEE Press
- S. Hochreiter, Y. Bengio, P. Frasconi, and J. Schmidhuber, "Gradient flow in recurrent nets: the difficulty of learning long-term dependencies," in A Field Guide to Dynamical Recurrent Neural Networks, S. C. Kremer and J. F. Kolen, Eds. IEEE Press, 2001, pp. 1-15.
- (2001) A Field Guide to Dynamical Recurrent Neural Networks , pp. 1-15
- Hochreiter, S.¹ Bengio, Y.² Frasconi, P.³ Schmidhuber, J.⁴

12
- 0031268931
- Bidirectional recurrent neural networks
- PII S1053587X97080550
- M. Schuster and K. K. Paliwal, "Bidirectional recurrent neural networks," IEEE Transactions on Signal Processing, vol. 45, pp. 2673-2681, 1997. (Pubitemid 127766336)
- (1997) IEEE Transactions on Signal Processing , vol.45 , Issue.11 , pp. 2673-2681
- Schuster, M.¹ Paliwal, K.K.²

13
- 79959404069
- The design and collection of COSINE, a multi-microphone in situ speech corpus recorded in noisy environments
- A. Stupakov, E. Hanusa, D. Vijaywargi, D. Fox, and J. Bilmes, "The design and collection of COSINE, a multi-microphone in situ speech corpus recorded in noisy environments," Computer Speech and Language, vol. 26, no. 1, pp. 52-66, 2011.
- (2011) Computer Speech and Language , vol.26 , Issue.1 , pp. 52-66
- Stupakov, A.¹ Hanusa, E.² Vijaywargi, D.³ Fox, D.⁴ Bilmes, J.⁵

14
- 51449106187
- Columbus, OH, USA: Department of Psychology, Ohio State University (Distributor)
- M. A. Pitt, L. Dilley, K. Johnson, S. Kiesling, W. Raymond, E. Hume, and E. Fosler-Lussier, Buckeye Corpus of Conversational Speech (2nd release). Columbus, OH, USA: Department of Psychology, Ohio State University (Distributor), 2007,
- (2007) Buckeye Corpus of Conversational Speech (2nd Release)
- Pitt, M.A.¹ Dilley, L.² Johnson, K.³ Kiesling, S.⁴ Raymond, W.⁵ Hume, E.⁶ Fosler-Lussier, E.⁷

15
- 84858960416
- [www.buckeyecorpus.osu.edu].

16
- 80051621128
- Localization of non-linguistic events in spontaneous speech by non-negative matrix factorization and long short-term memory
- Prague, Czech Republic
- F. Weninger, B. Schuller, M. Wöllmer, and G. Rigoll, "Localization of non-linguistic events in spontaneous speech by non-negative matrix factorization and Long Short-Term Memory," in Proc. of ICASSP, Prague, Czech Republic, 2011, pp. 5840-5843.
- (2011) Proc. of ICASSP , pp. 5840-5843
- Weninger, F.¹ Schuller, B.² Wöllmer, M.³ Rigoll, G.⁴

17
- 78650977476
- openSMILE - The Munich versatile and fast open-source audio feature extractor
- Firenze, Italy
- F. Eyben, M. Wöllmer, and B. Schuller, "openSMILE - the Munich versatile and fast open-source audio feature extractor," in Proc. of ACM Multimedia, Firenze, Italy, 2010, pp. 1459-1462.
- (2010) Proc. of ACM Multimedia , pp. 1459-1462
- Eyben, F.¹ Wöllmer, M.² Schuller, B.³

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.