Volume 23, Issue 12, 2015, Pages 2136-2147

Joint Optimization of Masks and Deep Recurrent Neural Networks for Monaural Source Separation

Author keywords

Deep recurrent neural network (DRNN); discriminative training; monaural source separation; time-frequency masking

Indexed keywords

RECURRENT NEURAL NETWORKS; SEPARATION; SPEECH; SPEECH ANALYSIS;

EID: 84941334839     PISSN: 23299290     EISSN: None     Source Type: Journal    
DOI: 10.1109/TASLP.2015.2468583     Document Type: Article
Times cited: 510

References (40)
  • 4
    • 84946061883
    • Low-rank representation of both singing voice and music accompaniment via learned dictionaries
    • Y.-H. Yang, "Low-rank representation of both singing voice and music accompaniment via learned dictionaries," in Proc. 14th Int. Soc. Music Inf. Retrieval Conf. (ISMIR), 2013.
    • (2013) Proc. 14th Int. Soc. Music Inf. Retrieval Conf. (ISMIR)
    • Yang, Y.-H.
  • 5
    • 84871362349
    • On sparse and low-rank matrix decomposition for singing voice separation
    • Y.-H. Yang, "On sparse and low-rank matrix decomposition for singing voice separation," in Proc. 20th ACM Int. Conf. Multimedia, 2012, pp. 757-760.
    • (2012) Proc. 20th ACM Int. Conf. Multimedia, pp. 757-760
    • Yang, Y.-H.
  • 6
    • 0018455310
    • Suppression of acoustic noise in speech using spectral subtraction
    • S. Boll, "Suppression of acoustic noise in speech using spectral subtraction," IEEE Trans. Acoust., Speech, Signal Process., vol. ASSP-27, no. 2, pp. 113-120, Apr. 1979.
    • (1979) IEEE Trans. Acoust., Speech, Signal Process., vol. ASSP-27, no. 2, pp. 113-120
    • Boll, S.
  • 7
    • 0021645331
    • Speech enhancement using a minimum-mean square error short-time spectral amplitude estimator
    • Y. Ephraim and D. Malah, "Speech enhancement using a minimum-mean square error short-time spectral amplitude estimator," IEEE Trans. Acoust., Speech, Signal Process., vol. ASSP-32, no. 6, pp. 1109-1121, Dec. 1984.
    • (1984) IEEE Trans. Acoust., Speech, Signal Process., vol. ASSP-32, no. 6, pp. 1109-1121
    • Ephraim, Y.; Malah, D.
  • 8
    • 0033592606
    • Learning the parts of objects by non-negative matrix factorization
    • D. D. Lee and H. S. Seung, "Learning the parts of objects by non-negative matrix factorization," Nature, vol. 401, no. 6755, pp. 788-791, Oct. 1999.
    • (1999) Nature, vol. 401, no. 6755, pp. 788-791
    • Lee, D.D.; Seung, H.S.
  • 14
    • 85008542938
    • On the improvement of singing voice separation for monaural recordings using the MIR-1K dataset
    • C.-L. Hsu and J.-S. Jang, "On the improvement of singing voice separation for monaural recordings using the MIR-1K dataset," IEEE Trans. Audio, Speech, Lang. Process., vol. 18, no. 2, pp. 310-319, Feb. 2010.
    • (2010) IEEE Trans. Audio, Speech, Lang. Process., vol. 18, no. 2, pp. 310-319
    • Hsu, C.-L.; Jang, J.-S.
  • 16
    • 84923289508
    • A regression approach to speech enhancement based on deep neural networks
    • Y. Xu, J. Du, L.-R. Dai, and C.-H. Lee, "A regression approach to speech enhancement based on deep neural networks," IEEE Trans. Audio, Speech, Lang. Process., vol. 23, no. 1, pp. 7-19, Jan. 2015.
    • (2015) IEEE Trans. Audio, Speech, Lang. Process., vol. 23, no. 1, pp. 7-19
    • Xu, Y.; Du, J.; Dai, L.-R.; Lee, C.-H.
  • 18
    • 56249144201
    • Time-frequency masking for speech separation and its potential for hearing aid design
    • D. Wang, "Time-frequency masking for speech separation and its potential for hearing aid design," Trends in Amplificat., vol. 12, pp. 332-353, 2008.
    • (2008) Trends in Amplificat., vol. 12, pp. 332-353
    • Wang, D.
  • 22
    • 84875678689
    • Towards scaling up classification-based speech separation
    • Y. Wang and D. Wang, "Towards scaling up classification-based speech separation," IEEE Trans. Audio, Speech, Lang. Process., vol. 21, no. 7, pp. 1381-1390, Jul. 2013.
    • (2013) IEEE Trans. Audio, Speech, Lang. Process., vol. 21, no. 7, pp. 1381-1390
    • Wang, Y.; Wang, D.
  • 23
  • 29
    • 0008554931
    • A focused back-propagation algorithm for temporal pattern recognition
    • M. C. Mozer, "A focused back-propagation algorithm for temporal pattern recognition," Complex Syst., vol. 3, no. 4, pp. 349-381, 1989.
    • (1989) Complex Syst., vol. 3, no. 4, pp. 349-381
    • Mozer, M.C.
  • 30
    • 0025503558
    • Backpropagation through time: What it does and how to do it
    • P. J. Werbos, "Backpropagation through time: What it does and how to do it," Proc. IEEE, vol. 78, no. 10, pp. 1550-1560, Oct. 1990.
    • (1990) Proc. IEEE, vol. 78, no. 10, pp. 1550-1560
    • Werbos, P.J.
  • 31
    • 0001765578
    • Gradient-based learning algorithms for recurrent networks and their computational complexity
    • R. J. Williams and D. Zipser, "Gradient-based learning algorithms for recurrent networks and their computational complexity," in Back-propagation: Theory, Architectures, and Applications. Mahwah, NJ, USA: Lawrence Erlbaum Associates, 1995, pp. 433-486.
    • (1995) Back-propagation: Theory, Architectures, and Applications, pp. 433-486
    • Williams, R.J.; Zipser, D.
  • 32
    • 3142694930
    • Blind separation of speech mixtures via time-frequency masking
    • O. Yilmaz and S. Rickard, "Blind separation of speech mixtures via time-frequency masking," IEEE Trans. Signal Process., vol. 52, no. 7, pp. 1830-1847, Jul. 2004.
    • (2004) IEEE Trans. Signal Process., vol. 52, no. 7, pp. 1830-1847
    • Yilmaz, O.; Rickard, S.
  • 33
  • 35
    • 79960916745
    • An algorithm for intelligibility prediction of time-frequency weighted noisy speech
    • C. Taal, R. Hendriks, R. Heusdens, and J. Jensen, "An algorithm for intelligibility prediction of time-frequency weighted noisy speech," IEEE Trans. Audio, Speech, Lang. Process., vol. 19, no. 7, pp. 2125-2136, Sep. 2011.
    • (2011) IEEE Trans. Audio, Speech, Lang. Process., vol. 19, no. 7, pp. 2125-2136
    • Taal, C.; Hendriks, R.; Heusdens, R.; Jensen, J.
  • 36
    • 0000732463
    • A limited memory algorithm for bound constrained optimization
    • R. H. Byrd, P. Lu, J. Nocedal, and C. Zhu, "A limited memory algorithm for bound constrained optimization," SIAM J. Sci. Comput., vol. 16, no. 5, pp. 1190-1208, Sep. 1995.
    • (1995) SIAM J. Sci. Comput., vol. 16, no. 5, pp. 1190-1208
    • Byrd, R.H.; Lu, P.; Nocedal, J.; Zhu, C.
  • 37
    • 84874282188
    • Improving wideband speech recognition using mixed-bandwidth training data in CD-DNN-HMM
    • J. Li, D. Yu, J.-T. Huang, and Y. Gong, "Improving wideband speech recognition using mixed-bandwidth training data in CD-DNN-HMM," in Proc. IEEE Spoken Lang. Technol. Workshop (SLT), 2012, pp. 131-136.
    • (2012) Proc. IEEE Spoken Lang. Technol. Workshop (SLT), pp. 131-136
    • Li, J.; Yu, D.; Huang, J.-T.; Gong, Y.
  • 38
    • 51449094735
    • Adaptation of Bayesian models for single-channel source separation and its application to voice/music separation in popular songs
    • A. Ozerov, P. Philippe, F. Bimbot, and R. Gribonval, "Adaptation of Bayesian models for single-channel source separation and its application to voice/music separation in popular songs," IEEE Trans. Audio, Speech, Lang. Process., vol. 15, no. 5, pp. 1564-1578, Jul. 2007.
    • (2007) IEEE Trans. Audio, Speech, Lang. Process., vol. 15, no. 5, pp. 1564-1578
    • Ozerov, A.; Philippe, P.; Bimbot, F.; Gribonval, R.
  • 40
    • 0031573117
    • Long short-term memory
    • S. Hochreiter and J. Schmidhuber, "Long short-term memory," Neural Comput., vol. 9, no. 8, pp. 1735-1780, Nov. 1997.
    • (1997) Neural Comput., vol. 9, no. 8, pp. 1735-1780
    • Hochreiter, S.; Schmidhuber, J.


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.