SCOPUS Information Retrieval Platform

ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings

Volume , Issue , 2014, Pages 3709-3713

Single-channel speech separation with memory-enhanced recurrent neural networks

(3) Weninger, Felix (a); Eyben, Florian (a); Schuller, Bjorn (a,b)

a TECHNICAL UNIVERSITY OF MUNICH (Germany)

b IMPERIAL COLLEGE LONDON (United Kingdom)

Author keywords

Long Short Term Memory; recurrent neural networks; Speech enhancement; speech separation

Indexed keywords

AUDIO ACOUSTICS; BRAIN; IMPULSE RESPONSE; NOISE ABATEMENT; RECURRENT NEURAL NETWORKS; SIGNAL PROCESSING; SIGNAL TO NOISE RATIO; SPEECH ANALYSIS; SPEECH ENHANCEMENT; SPEECH RECOGNITION;

LONG SHORT-TERM MEMORY; LOW SIGNAL-TO-NOISE RATIO; NOISE FEATURES; NON-STATIONARY SOURCES; ROOM IMPULSE RESPONSE; SINGLE-CHANNEL SPEECH SEPARATIONS; SPECTRAL SUBTRACTIONS; SPEECH SEPARATION;

ACOUSTIC NOISE;

EID: 84905284062
PISSN: 15206149
EISSN: None
Source Type: Conference Proceeding
DOI: 10.1109/ICASSP.2014.6854294
Document Type: Conference Paper

Times cited: (136)

References (26)

1
- 84890543083
- Speech recognition with deep recurrent neural networks
- Vancouver, Canada, May, IEEE
- A. Graves, A. Mohamed, and G. Hinton, "Speech recognition with deep recurrent neural networks," in Proc. of ICASSP, Vancouver, Canada, May 2013, pp. 6645-6649, IEEE.
- (2013) Proc. of ICASSP , pp. 6645-6649
- Graves, A.¹ Mohamed, A.² Hinton, G.³

2
- 84867593805
- Polyphonic piano note transcription with recurrent neural networks
- Kyoto, Japan
- S. Böck and M. Schedl, "Polyphonic piano note transcription with recurrent neural networks," in Proc. of ICASSP, Kyoto, Japan, 2012, pp. 121-124.
- (2012) Proc. of ICASSP , pp. 121-124
- Böck, S.¹ Schedl, M.²

3
- 84893675434
- The TUM+TUT+KUL approach to the CHiME Challenge 2013: Multi-stream ASR exploiting BLSTM networks and sparse NMF
- Vancouver, Canada, IEEE
- J.T. Geiger, F. Weninger, A. Hurmalainen, J.F. Gemmeke, M. Wöllmer, B. Schuller, G. Rigoll, and T. Virtanen, "The TUM+TUT+KUL approach to the CHiME Challenge 2013: Multi-stream ASR exploiting BLSTM networks and sparse NMF," in Proc. of 2nd CHiME Workshop held in conjunction with ICASSP 2013, Vancouver, Canada, 2013, pp. 25-30, IEEE.
- (2013) Proc. of 2nd CHiME Workshop Held in Conjunction with ICASSP 2013 , pp. 25-30
- Geiger, J.T.¹ Weninger, F.² Hurmalainen, A.³ Gemmeke, J.F.⁴ Wöllmer, M.⁵ Schuller, B.⁶ Rigoll, G.⁷ Virtanen, T.⁸

4
- 84900542109
- Recurrent neural network feature enhancement: The 2nd CHiME challenge
- Vancouver, Canada, June, IEEE
- A.L. Maas, T.M. O'Neil, A.Y. Hannun, and A.Y. Ng, "Recurrent neural network feature enhancement: The 2nd CHiME challenge," in Proceedings The 2nd CHiME Workshop on Machine Listening in Multisource Environments held in conjunction with ICASSP 2013, Vancouver, Canada, June 2013, pp. 79-80, IEEE.
- (2013) Proceedings the 2nd CHiME Workshop on Machine Listening in Multisource Environments Held in Conjunction with ICASSP 2013 , pp. 79-80
- Maas, A.L.¹ O'Neil, T.M.² Hannun, A.Y.³ Ng, A.Y.⁴

5
- 84890489927
- Feature enhancement by bidirectional LSTM networks for conversational speech recognition in highly non-stationary noise
- Vancouver, Canada
- M. Wöllmer, Z. Zhang, F. Weninger, B. Schuller, and G. Rigoll, "Feature enhancement by bidirectional LSTM networks for conversational speech recognition in highly non-stationary noise," in Proc. of ICASSP, Vancouver, Canada, 2013, pp. 6822-6826.
- (2013) Proc. of ICASSP , pp. 6822-6826
- Wöllmer, M.¹ Zhang, Z.² Weninger, F.³ Schuller, B.⁴ Rigoll, G.⁵

6
- 56449089103
- Extracting and composing robust features with denoising autoencoders
- P. Vincent, H. Larochelle, Y. Bengio, and P. Manzagol, "Extracting and composing robust features with denoising autoencoders," in Proc. of ICML, 2008, pp. 1096-1103.
- (2008) Proc. of ICML , pp. 1096-1103
- Vincent, P.¹ Larochelle, H.² Bengio, Y.³ Manzagol, P.⁴

7
- 84906262433
- Speech enhancement based on deep denoising autoencoder
- Lyon, France
- X. Lu, Y. Tsao, S. Matsuda, and C. Hori, "Speech enhancement based on deep denoising autoencoder," in Proc. of INTERSPEECH, Lyon, France, 2013, pp. 3444-3448.
- (2013) Proc. of INTERSPEECH , pp. 3444-3448
- Lu, X.¹ Tsao, Y.² Matsuda, S.³ Hori, C.⁴

8
- 84906279378
- Speech enhancement with weighted denoising auto-encoder
- Lyon, France
- B.Y. Xia and C.C. Bao, "Speech enhancement with weighted denoising auto-encoder," in Proc. of INTERSPEECH, Lyon, France, 2013, pp. 436-440.
- (2013) Proc. of INTERSPEECH , pp. 436-440
- Xia, B.Y.¹ Bao, C.C.²

9
- 0035396555
- Noise power spectral density estimation based on optimal smoothing and minimum statistics
- July
- R. Martin, "Noise Power Spectral Density Estimation Based on Optimal Smoothing and Minimum Statistics," IEEE Transactions on Audio, Speech, and Language Processing, vol. 9, no. 5, pp. 504-512, July 2001.
- (2001) IEEE Transactions on Audio, Speech, and Language Processing , vol.9 , Issue.5 , pp. 504-512
- Martin, R.¹

10
- 0031139249
- A class of neural networks for independent component analysis
- J. Karhunen, E. Oja, L. Wang, R. Vigario, and J. Joutsensalo, "A class of neural networks for independent component analysis," IEEE Transactions on Neural Networks, vol. 8, no. 3, pp. 486-504, 1997.
- (1997) IEEE Transactions on Neural Networks , vol.8 , Issue.3 , pp. 486-504
- Karhunen, J.¹ Oja, E.² Wang, L.³ Vigario, R.⁴ Joutsensalo, J.⁵

11
- 0035112152
- Nonlinear blind source separation using a radial basis function network
- Y. Tan, J. Wang, and J.M. Zurada, "Nonlinear blind source separation using a radial basis function network," IEEE Transactions on Neural Networks, vol. 12, no. 1, pp. 124-134, 2001.
- (2001) IEEE Transactions on Neural Networks , vol.12 , Issue.1 , pp. 124-134
- Tan, Y.¹ Wang, J.² Zurada, J.M.³

12
- 84900537286
- The Munich feature enhancement approach to the 2013 CHiME challenge using BLSTM recurrent neural networks
- Vancouver, Canada, IEEE
- F. Weninger, J. Geiger, M. Wöllmer, B. Schuller, and G. Rigoll, "The Munich Feature Enhancement Approach to the 2013 CHiME Challenge Using BLSTM Recurrent Neural Networks," in Proceedings The 2nd CHiME Workshop on Machine Listening in Multisource Environments held in conjunction with ICASSP 2013, Vancouver, Canada, 2013, pp. 86-90, IEEE.
- (2013) Proceedings the 2nd CHiME Workshop on Machine Listening in Multisource Environments Held in Conjunction with ICASSP 2013 , pp. 86-90
- Weninger, F.¹ Geiger, J.² Wöllmer, M.³ Schuller, B.⁴ Rigoll, G.⁵

13
- 84890493989
- Ideal ratio mask estimation using deep neural networks for robust speech recognition
- Vancouver, Canada
- A. Narayanan and D. Wang, "Ideal ratio mask estimation using deep neural networks for robust speech recognition," in Proc. of ICASSP, Vancouver, Canada, 2013, pp. 7092-7096.
- (2013) Proc. of ICASSP , pp. 7092-7096
- Narayanan, A.¹ Wang, D.²

14
- 84890489552
- Integrating noise estimation and factorization-based speech separation: A novel hybrid approach
- Vancouver, Canada, IEEE
- C. Joder, F. Weninger, D. Virette, and B. Schuller, "Integrating Noise Estimation and Factorization-based Speech Separation: A Novel Hybrid Approach," in Proc. of ICASSP, Vancouver, Canada, 2013, pp. 131-135, IEEE.
- (2013) Proc. of ICASSP , pp. 131-135
- Joder, C.¹ Weninger, F.² Virette, D.³ Schuller, B.⁴

15
- 77950116181
- Factorial scaled hidden Markov model for polyphonic audio representation and source separation
- Mohonk, NY, USA
- A. Ozerov, C. Févotte, and M. Charbit, "Factorial scaled hidden Markov model for polyphonic audio representation and source separation," in Proc. of WASPAA, Mohonk, NY, USA, 2009, pp. 121-124.
- (2009) Proc. of WASPAA , pp. 121-124
- Ozerov, A.¹ Févotte, C.² Charbit, M.³

16
- 80051625972
- A non-negative approach to semi-supervised separation of speech from noise with the use of temporal dynamics
- Prague, Czech Republic
- G.J. Mysore and P. Smaragdis, "A non-negative approach to semi-supervised separation of speech from noise with the use of temporal dynamics," in Proc. of ICASSP, Prague, Czech Republic, 2011, pp. 17-20.
- (2011) Proc. of ICASSP , pp. 17-20
- Mysore, G.J.¹ Smaragdis, P.²

17
- 0034293152
- Learning to forget: Continual prediction with LSTM
- F. Gers, J. Schmidhuber, and F. Cummins, "Learning to forget: Continual prediction with LSTM," Neural Computation, vol. 12, no. 10, pp. 2451-2471, 2000.
- (2000) Neural Computation , vol.12 , Issue.10 , pp. 2451-2471
- Gers, F.¹ Schmidhuber, J.² Cummins, F.³

18
- 70349292240
- Being bored? Recognising natural interest by extensive audiovisual integration for real-life application
- B. Schuller, R. Müller, F. Eyben, J. Gast, B. Hörnler, M. Wöllmer, G. Rigoll, A. Höthker, and H. Konosu, "Being bored? Recognising natural interest by extensive audiovisual integration for real-life application," Image and Vision Computing, Special Issue on Visual and Multimodal Analysis of Human Spontaneous Behavior, vol. 27, no. 12, pp. 1760-1774, 2009.
- (2009) Image and Vision Computing, Special Issue on Visual and Multimodal Analysis of Human Spontaneous Behavior , vol.27 , Issue.12 , pp. 1760-1774
- Schuller, B.¹ Müller, R.² Eyben, F.³ Gast, J.⁴ Hörnler, B.⁵ Wöllmer, M.⁶ Rigoll, G.⁷ Höthker, A.⁸ Konosu, H.⁹

19
- 84903770147
- Affect recognition in real-life acoustic conditions-A new perspective on feature selection
- Lyon, France, ISCA
- F. Eyben, F. Weninger, and B. Schuller, "Affect recognition in real-life acoustic conditions-A new perspective on feature selection," in Proceedings INTERSPEECH 2013, 14th Annual Conference of the International Speech Communication Association, Lyon, France, 2013, pp. 2044-2048, ISCA.
- (2013) Proceedings INTERSPEECH 2013, 14th Annual Conference of the International Speech Communication Association , pp. 2044-2048
- Eyben, F.¹ Weninger, F.² Schuller, B.³

20
- 79954999224
- The INTERSPEECH 2010 paralinguistic challenge
- Makuhari, Japan, ISCA
- B. Schuller, S. Steidl, A. Batliner, F. Burkhardt, L. Devillers, C. Müller, and S. Narayanan, "The INTERSPEECH 2010 Paralinguistic Challenge," in Proc. of INTERSPEECH, Makuhari, Japan, 2010, pp. 2794-2797, ISCA.
- (2010) Proc. of INTERSPEECH , pp. 2794-2797
- Schuller, B.¹ Steidl, S.² Batliner, A.³ Burkhardt, F.⁴ Devillers, L.⁵ Müller, C.⁶ Narayanan, S.⁷

21
- 84890443834
- Real-life voice activity detection with LSTM recurrent neural networks and an application to Hollywood movies
- Vancouver, Canada, IEEE
- F. Eyben, F. Weninger, S. Squartini, and B. Schuller, "Real-life voice activity detection with LSTM recurrent neural networks and an application to Hollywood movies," in Proc. of ICASSP, Vancouver, Canada, 2013, pp. 483-487, IEEE.
- (2013) Proc. of ICASSP , pp. 483-487
- Eyben, F.¹ Weninger, F.² Squartini, S.³ Schuller, B.⁴

22
- 84890454069
- Acoustic Geo-Sensing: Recognising cyclists' route, route direction, and route progress from cell-phone audio
- Vancouver, Canada, IEEE
- B. Schuller, F. Pokorny, S. Ladstätter, M. Fellner, F. Graf, and L. Paletta, "Acoustic Geo-Sensing: Recognising cyclists' route, route direction, and route progress from cell-phone audio," in Proc. of ICASSP, Vancouver, Canada, 2013, pp. 453-457, IEEE.
- (2013) Proc. of ICASSP , pp. 453-457
- Schuller, B.¹ Pokorny, F.² Ladstätter, S.³ Fellner, M.⁴ Graf, F.⁵ Paletta, L.⁶

23
- 70449564650
- A binaural room impulse response database for the evaluation of dereverberation algorithms
- Santorini, Greece, July, IEEE
- M. Jeub, M. Schäfer, and P. Vary, "A binaural room impulse response database for the evaluation of dereverberation algorithms," in Proceedings of International Conference on Digital Signal Processing (DSP), Santorini, Greece, July 2009, pp. 1-4, IEEE.
- (2009) Proceedings of International Conference on Digital Signal Processing (DSP) , pp. 1-4
- Jeub, M.¹ Schäfer, M.² Vary, P.³

24
- 70349284484
- Ph.D. thesis, Technische Universität München
- A. Graves, Supervised sequence labelling with recurrent neural networks, Ph.D. thesis, Technische Universität München, 2008.
- (2008) Supervised Sequence Labelling with Recurrent Neural Networks
- Graves, A.¹

25
- 33744975847
- Performance measurement in blind audio source separation
- July
- E. Vincent, R. Gribonval, and C. Févotte, "Performance measurement in blind audio source separation," Audio, Speech, and Language Processing, IEEE Transactions on, vol. 14, no. 4, pp. 1462-1469, July 2006.
- (2006) Audio, Speech, and Language Processing, IEEE Transactions on , vol.14 , Issue.4 , pp. 1462-1469
- Vincent, E.¹ Gribonval, R.² Févotte, C.³

26
- 70349227623
- Efficient musical noise suppression for speech enhancement system
- Taipei, Taiwan
- T. Esch and P. Vary, "Efficient musical noise suppression for speech enhancement system," in Proc. of ICASSP, Taipei, Taiwan, 2009, pp. 4409-4412.
- (2009) Proc. of ICASSP , pp. 4409-4412
- Esch, T.¹ Vary, P.²

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.