2014, Pages 3709-3713

Single-channel speech separation with memory-enhanced recurrent neural networks

Author keywords

Long Short Term Memory; recurrent neural networks; Speech enhancement; speech separation

Indexed keywords

AUDIO ACOUSTICS; BRAIN; IMPULSE RESPONSE; NOISE ABATEMENT; RECURRENT NEURAL NETWORKS; SIGNAL PROCESSING; SIGNAL TO NOISE RATIO; SPEECH ANALYSIS; SPEECH ENHANCEMENT; SPEECH RECOGNITION;

EID: 84905284062     PISSN: 15206149     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1109/ICASSP.2014.6854294     Document Type: Conference Paper
Times cited: 136

References (26)
  • 1
    • 84890543083
    • Speech recognition with deep recurrent neural networks
    • Vancouver, Canada, May,IEEE
    • A. Graves, A. Mohamed, and G. Hinton, "Speech recognition with deep recurrent neural networks," in Proc. of ICASSP, Vancouver, Canada, May 2013, pp. 6645-6649, IEEE.
    • (2013) Proc. of ICASSP , pp. 6645-6649
    • Graves, A.1    Mohamed, A.2    Hinton, G.3
  • 2
    • 84867593805
    • Polyphonic piano note transcription with recurrent neural networks
    • Kyoto, Japan
    • S. Böck and M. Schedl, "Polyphonic piano note transcription with recurrent neural networks," in Proc. of ICASSP, Kyoto, Japan, 2012, pp. 121-124.
    • (2012) Proc. of ICASSP , pp. 121-124
    • Böck, S.1    Schedl, M.2
  • 5
    • 84890489927
    • Feature enhancement by bidirectional LSTM networks for conversational speech recognition in highly non-stationary noise
    • Vancouver, Canada
    • M. Wöllmer, Z. Zhang, F. Weninger, B. Schuller, and G. Rigoll, "Feature enhancement by bidirectional LSTM networks for conversational speech recognition in highly non-stationary noise," in Proc. of ICASSP, Vancouver, Canada, 2013, pp. 6822-6826.
    • (2013) Proc. of ICASSP , pp. 6822-6826
    • Wöllmer, M.1    Zhang, Z.2    Weninger, F.3    Schuller, B.4    Rigoll, G.5
  • 6
    • 56449089103
    • Extracting and composing robust features with denoising autoencoders
    • P. Vincent, H. Larochelle, Y. Bengio, and P. Manzagol, "Extracting and composing robust features with denoising autoencoders," in Proc. of ICML, 2008, pp. 1096-1103.
    • (2008) Proc. of ICML , pp. 1096-1103
    • Vincent, P.1    Larochelle, H.2    Bengio, Y.3    Manzagol, P.4
  • 7
    • 84906262433
    • Speech enhancement based on deep denoising autoencoder
    • Lyon, France
    • X. Lu, Y. Tsao, S. Matsuda, and C. Hori, "Speech enhancement based on deep denoising autoencoder," in Proc. of INTERSPEECH, Lyon, France, 2013, pp. 3444-3448.
    • (2013) Proc. of INTERSPEECH , pp. 3444-3448
    • Lu, X.1    Tsao, Y.2    Matsuda, S.3    Hori, C.4
  • 8
    • 84906279378
    • Speech enhancement with weighted denoising auto-encoder
    • Lyon, France
    • B.Y. Xia and C.C. Bao, "Speech enhancement with weighted denoising auto-encoder," in Proc. of INTERSPEECH, Lyon, France, 2013, pp. 436-440.
    • (2013) Proc. of INTERSPEECH , pp. 436-440
    • Xia, B.Y.1    Bao, C.C.2
  • 9
    • 0035396555
    • Noise power spectral density estimation based on optimal smoothing and minimum statistics
    • July
    • R. Martin, "Noise Power Spectral Density Estimation Based on Optimal Smoothing and Minimum Statistics," IEEE Transactions on Speech and Audio Processing, vol. 9, no. 5, pp. 504-512, July 2001.
    • (2001) IEEE Transactions on Speech and Audio Processing , vol.9 , Issue.5 , pp. 504-512
    • Martin, R.1
  • 11
    • 0035112152
    • Nonlinear blind source separation using a radial basis function network
    • Y. Tan, J. Wang, and J.M. Zurada, "Nonlinear blind source separation using a radial basis function network," IEEE Transactions on Neural Networks, vol. 12, no. 1, pp. 124-134, 2001.
    • (2001) IEEE Transactions on Neural Networks , vol.12 , Issue.1 , pp. 124-134
    • Tan, Y.1    Wang, J.2    Zurada, J.M.3
  • 13
    • 84890493989
    • Ideal ratio mask estimation using deep neural networks for robust speech recognition
    • Vancouver, Canada
    • A. Narayanan and D. Wang, "Ideal ratio mask estimation using deep neural networks for robust speech recognition," in Proc. of ICASSP, Vancouver, Canada, 2013, pp. 7092-7096.
    • (2013) Proc. of ICASSP , pp. 7092-7096
    • Narayanan, A.1    Wang, D.2
  • 14
    • 84890489552
    • Integrating noise estimation and factorization-based speech separation: A novel hybrid approach
    • Vancouver, Canada, IEEE
    • C. Joder, F. Weninger, D. Virette, and B. Schuller, "Integrating Noise Estimation and Factorization-based Speech Separation: A Novel Hybrid Approach," in Proc. of ICASSP, Vancouver, Canada, 2013, pp. 131-135, IEEE.
    • (2013) Proc. of ICASSP , pp. 131-135
    • Joder, C.1    Weninger, F.2    Virette, D.3    Schuller, B.4
  • 15
    • 77950116181
    • Factorial scaled hidden Markov model for polyphonic audio representation and source separation
    • Mohonk, NY, USA
    • A. Ozerov, C. Févotte, and M. Charbit, "Factorial scaled hidden Markov model for polyphonic audio representation and source separation," in Proc. of WASPAA, Mohonk, NY, USA, 2009, pp. 121-124.
    • (2009) Proc. of WASPAA , pp. 121-124
    • Ozerov, A.1    Févotte, C.2    Charbit, M.3
  • 16
    • 80051625972
    • A non-negative approach to semi-supervised separation of speech from noise with the use of temporal dynamics
    • Prague, Czech Republic
    • G.J. Mysore and P. Smaragdis, "A non-negative approach to semi-supervised separation of speech from noise with the use of temporal dynamics," in Proc. of ICASSP, Prague, Czech Republic, 2011, pp. 17-20.
    • (2011) Proc. of ICASSP , pp. 17-20
    • Mysore, G.J.1    Smaragdis, P.2
  • 17
    • 0034293152
    • Learning to forget: Continual prediction with LSTM
    • F. Gers, J. Schmidhuber, and F. Cummins, "Learning to forget: Continual prediction with LSTM," Neural Computation, vol. 12, no. 10, pp. 2451-2471, 2000.
    • (2000) Neural Computation , vol.12 , Issue.10 , pp. 2451-2471
    • Gers, F.1    Schmidhuber, J.2    Cummins, F.3
  • 21
    • 84890443834
    • Real-life voice activity detection with LSTM recurrent neural networks and an application to Hollywood movies
    • Vancouver, Canada, IEEE
    • F. Eyben, F. Weninger, S. Squartini, and B. Schuller, "Real-life voice activity detection with LSTM recurrent neural networks and an application to Hollywood movies," in Proc. of ICASSP, Vancouver, Canada, 2013, pp. 483-487, IEEE.
    • (2013) Proc. of ICASSP , pp. 483-487
    • Eyben, F.1    Weninger, F.2    Squartini, S.3    Schuller, B.4
  • 22
    • 84890454069
    • Acoustic Geo-Sensing: Recognising cyclists' route, route direction, and route progress from cell-phone audio
    • Vancouver, Canada, IEEE
    • B. Schuller, F. Pokorny, S. Ladstätter, M. Fellner, F. Graf, and L. Paletta, "Acoustic Geo-Sensing: Recognising cyclists' route, route direction, and route progress from cell-phone audio," in Proc. of ICASSP, Vancouver, Canada, 2013, pp. 453-457, IEEE.
    • (2013) Proc. of ICASSP , pp. 453-457
    • Schuller, B.1    Pokorny, F.2    Ladstätter, S.3    Fellner, M.4    Graf, F.5    Paletta, L.6
  • 26
    • 70349227623
    • Efficient musical noise suppression for speech enhancement system
    • Taipei, Taiwan
    • T. Esch and P. Vary, "Efficient musical noise suppression for speech enhancement system," in Proc. of ICASSP, Taipei, Taiwan, 2009, pp. 4409-4412.
    • (2009) Proc. of ICASSP , pp. 4409-4412
    • Esch, T.1    Vary, P.2


* This information was extracted and analyzed by KISTI from Elsevier's SCOPUS database.