2017, Pages 261-265

Improving music source separation based on deep neural networks through data augmentation and network blending

Author keywords

Blending; Deep neural network (DNN); Long short-term memory (LSTM); Music source separation (MSS)

EID: 85023774072     PISSN: 15206149     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1109/ICASSP.2017.7952158     Document Type: Conference Paper
Times cited: 251

References (43)
  • 1
    • EID: 84867975114
    • Repeating pattern extraction technique (REPET): A simple method for music/voice separation
    • Z. Rafii and B. Pardo, "Repeating pattern extraction technique (REPET): A simple method for music/voice separation," IEEE Trans. on Audio, Speech, and Language Processing, Vol. 21, no. 1, pp. 73-84, 2013.
    • (2013) IEEE Trans. on Audio, Speech, and Language Processing, vol. 21, no. 1, pp. 73-84
    • Rafii, Z.; Pardo, B.
  • 2
    • EID: 80052984197
    • A musically motivated mid-level representation for pitch estimation and musical audio source separation
    • J.-L. Durrieu, B. David, and G. Richard, "A musically motivated mid-level representation for pitch estimation and musical audio source separation," IEEE Journal on Selected Topics on Signal Processing, Vol. 5, pp. 1180-1191, 2011.
    • (2011) IEEE Journal on Selected Topics on Signal Processing, vol. 5, pp. 1180-1191
    • Durrieu, J.-L.; David, B.; Richard, G.
  • 6
    • EID: 85023758493
    • "SiSEC MUS Homepage," https://sisec.inria.fr/home/2016-professionally-produced-music-recordings/.
    • SiSEC MUS Homepage
  • 9
    • EID: 85013463259
    • Multichannel music separation with deep neural networks
    • A. A. Nugraha, A. Liutkus, and E. Vincent, "Multichannel music separation with deep neural networks," in Proc. EUSIPCO, 2016.
    • (2016) Proc. EUSIPCO
    • Nugraha, A.A.; Liutkus, A.; Vincent, E.
  • 10
    • EID: 84944676216
    • Deep neural network based instrument extraction from music
    • S. Uhlich, F. Giron, and Y. Mitsufuji, "Deep neural network based instrument extraction from music," in Proc. ICASSP, 2015, pp. 2135-2139.
    • (2015) Proc. ICASSP, pp. 2135-2139
    • Uhlich, S.; Giron, F.; Mitsufuji, Y.
  • 11
    • EID: 84930630277
    • Deep learning
    • Y. LeCun, Y. Bengio, and G. Hinton, "Deep learning," Nature, Vol. 521, no. 7553, pp. 436-444, 2015.
    • (2015) Nature, vol. 521, no. 7553, pp. 436-444
    • LeCun, Y.; Bengio, Y.; Hinton, G.
  • 13
    • EID: 85046988721
    • Singing-voice separation from monaural recordings using deep recurrent neural networks
    • P.-S. Huang, M. Kim, M. Hasegawa-Johnson, and P. Smaragdis, "Singing-voice separation from monaural recordings using deep recurrent neural networks," in Proc. ISMIR, 2014, pp. 477-482.
    • (2014) Proc. ISMIR, pp. 477-482
    • Huang, P.-S.; Kim, M.; Hasegawa-Johnson, M.; Smaragdis, P.
  • 15
    • EID: 84944681228
    • Deep karaoke: Extracting vocals from musical mixtures using a convolutional deep neural network
    • A. J. Simpson, G. Roma, and M. D. Plumbley, "Deep karaoke: Extracting vocals from musical mixtures using a convolutional deep neural network," in Proc. LVA/ICA, 2015, pp. 429-436.
    • (2015) Proc. LVA/ICA, pp. 429-436
    • Simpson, A.J.; Roma, G.; Plumbley, M.D.
  • 16
  • 19
    • EID: 84994242533
    • Combining mask estimates for single channel audio source separation using deep neural networks
    • E. M. Grais, G. Roma, A. J. R. Simpson, and M. D. Plumbley, "Combining mask estimates for single channel audio source separation using deep neural networks," in Proc. Interspeech, 2016.
    • (2016) Proc. Interspeech
    • Grais, E.M.; Roma, G.; Simpson, A.J.R.; Plumbley, M.D.
  • 21
    • EID: 84959157364
    • Multi-resolution stacking for speech separation based on boosted DNN
    • X.-L. Zhang and D. Wang, "Multi-resolution stacking for speech separation based on boosted DNN," in Proc. Interspeech, 2015, pp. 1745-1749.
    • (2015) Proc. Interspeech, pp. 1745-1749
    • Zhang, X.-L.; Wang, D.
  • 23
    • EID: 84946079770
    • Cross-domain cooperative deep stacking network for speech separation
    • W. Jiang, S. Liang, L. Dong, H. Yang, W. Liu, and Y. Wang, "Cross-domain cooperative deep stacking network for speech separation," in Proc. ICASSP. IEEE, 2015, pp. 5083-5087.
    • (2015) Proc. ICASSP, pp. 5083-5087
    • Jiang, W.; Liang, S.; Dong, L.; Yang, H.; Liu, W.; Wang, Y.
  • 24
    • EID: 10944227316
    • Sparse coding and NMF
    • J. Eggert and E. Körner, "Sparse coding and NMF," in Proc. Neural Networks, 2004, vol. 4, pp. 2529-2533.
    • (2004) Proc. Neural Networks, vol. 4, pp. 2529-2533
    • Eggert, J.; Körner, E.
  • 25
    • EID: 63249085556
    • Nonnegative matrix factorization with the Itakura-Saito divergence: With application to music analysis
    • C. Févotte, N. Bertin, and J.-L. Durrieu, "Nonnegative matrix factorization with the Itakura-Saito divergence: With application to music analysis," Neural Computation, Vol. 21, no. 3, pp. 793-830, 2009.
    • (2009) Neural Computation, vol. 21, no. 3, pp. 793-830
    • Févotte, C.; Bertin, N.; Durrieu, J.-L.
  • 26
    • EID: 84910065215
    • Discriminative NMF and its application to single-channel source separation
    • F. Weninger, J. Le Roux, J. R. Hershey, and S. Watanabe, "Discriminative NMF and its application to single-channel source separation," in Proc. Interspeech, 2014, pp. 865-869.
    • (2014) Proc. Interspeech, pp. 865-869
    • Weninger, F.; Le Roux, J.; Hershey, J.R.; Watanabe, S.
  • 29
    • EID: 77955675017
    • Under-determined reverberant audio source separation using a full-rank spatial covariance model
    • N. Q. Duong, E. Vincent, and R. Gribonval, "Under-determined reverberant audio source separation using a full-rank spatial covariance model," IEEE Trans. on Audio, Speech, and Language Processing, Vol. 18, no. 7, pp. 1830-1840, 2010.
    • (2010) IEEE Trans. on Audio, Speech, and Language Processing, vol. 18, no. 7, pp. 1830-1840
    • Duong, N.Q.; Vincent, E.; Gribonval, R.
  • 30
    • EID: 84897584695
    • A general flexible framework for the handling of prior information in audio source separation
    • A. Ozerov, E. Vincent, and F. Bimbot, "A general flexible framework for the handling of prior information in audio source separation," IEEE Trans. on Audio, Speech, and Language Processing, Vol. 20, no. 4, pp. 1118-1133, 2012.
    • (2012) IEEE Trans. on Audio, Speech, and Language Processing, vol. 20, no. 4, pp. 1118-1133
    • Ozerov, A.; Vincent, E.; Bimbot, F.
  • 33
    • EID: 84862294866
    • Deep sparse rectifier networks
    • X. Glorot, A. Bordes, and Y. Bengio, "Deep sparse rectifier networks," Proc. AISTATS, Vol. 15, pp. 315-323, 2011.
    • (2011) Proc. AISTATS, vol. 15, pp. 315-323
    • Glorot, X.; Bordes, A.; Bengio, Y.
  • 36
    • EID: 27744588611
    • Framewise phoneme classification with bidirectional LSTM and other neural network architectures
    • A. Graves and J. Schmidhuber, "Framewise phoneme classification with bidirectional LSTM and other neural network architectures," Neural Networks, Vol. 18, no. 5, pp. 602-610, 2005.
    • (2005) Neural Networks, vol. 18, no. 5, pp. 602-610
    • Graves, A.; Schmidhuber, J.
  • 37
    • EID: 85023742699
    • "Lasagne GitHub," https://github.com/Lasagne/Lasagne.
    • Lasagne GitHub
  • 38
    • EID: 85023748945
    • "Theano GitHub," https://github.com/Theano/Theano.
    • Theano GitHub
  • 40
    • EID: 84973376414
    • Exploring data augmentation for improved singing voice detection with neural networks
    • J. Schlüter and T. Grill, "Exploring data augmentation for improved singing voice detection with neural networks," in Proc. ISMIR, 2015.
    • (2015) Proc. ISMIR
    • Schlüter, J.; Grill, T.
  • 41
    • EID: 84996516893
    • A software framework for musical data augmentation
    • B. McFee, E. J. Humphrey, and J. P. Bello, "A software framework for musical data augmentation," in Proc. ISMIR, 2015.
    • (2015) Proc. ISMIR
    • McFee, B.; Humphrey, E.J.; Bello, J.P.
  • 43
    • EID: 57349146373
    • Lessons from the Netflix prize challenge
    • R. M. Bell and Y. Koren, "Lessons from the Netflix prize challenge," ACM SIGKDD Explorations Newsletter, Vol. 9, no. 2, pp. 75-79, 2007.
    • (2007) ACM SIGKDD Explorations Newsletter, vol. 9, no. 2, pp. 75-79
    • Bell, R.M.; Koren, Y.


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.