



Volume 18, Issue 2, 2010, Pages 310-319

On the Improvement of Singing Voice Separation for Monaural Recordings Using the MIR-1K Dataset

Author keywords

Computational auditory scene analysis (CASA); singing voice separation; unvoiced sound separation



EID: 85008542938     PISSN: 15587916     EISSN: 15587924     Source Type: Journal    
DOI: 10.1109/TASL.2009.2026503     Document Type: Article
Times cited : (256)

References (32)
  • 1
    • 85009187525
    • An automatic singing transcription system with multilingual singing lyric recognizer and robust melody tracker
    • Geneva, Switzerland
    • C. K. Wang, R. Y. Lyu, and Y. C. Chiang, “An automatic singing transcription system with multilingual singing lyric recognizer and robust melody tracker,” in Proc. 8th Eur. Conf. Speech Commun. Technol., Geneva, Switzerland, 2003, pp. 1197–1200.
    • (2003) Proc. 8th Eur. Conf. Speech Commun. Technol. , pp. 1197-1200
    • Wang, C.K.1    Lyu, R.Y.2    Chiang, Y.C.3
  • 3
    • 77957274549
    • Disambiguating music emotion using software agents
    • Barcelona, Spain
    • D. Yang and W. Lee, “Disambiguating music emotion using software agents,” in Proc. Symp. Music Inf. Retrieval (ISMIR'04), Barcelona, Spain, 2004, pp. 52–57.
    • (2004) Proc. Symp. Music Inf. Retrieval (ISMIR'04) , pp. 52-57
    • Yang, D.1    Lee, W.2
  • 5
    • 50549089895
    • Separation of singing voice from music accompaniment for monaural recordings
    • Y. Li and D. L. Wang, “Separation of singing voice from music accompaniment for monaural recordings,” IEEE Trans. Audio, Speech, Lang. Process., vol. 15, pp. 1475–1487, 2007.
    • (2007) IEEE Trans. Audio, Speech, Lang. Process. , vol.15 , pp. 1475-1487
    • Li, Y.1    Wang, D.L.2
  • 9
    • 84870889437
    • Timing is of the essence: Neural oscillator models of auditory grouping
    • Mahwah, NJ: Lawrence Erlbaum
    • G. J. Brown and D. L. Wang, “Timing is of the essence: Neural oscillator models of auditory grouping,” in Listening to Speech: An Auditory Perspective, S. Greenberg and W. Ainsworth, Eds. Mahwah, NJ: Lawrence Erlbaum, 2006, pp. 375–392.
    • (2006) Listening to Speech: An Auditory Perspective , pp. 375-392
    • Brown, G.J.1    Wang, D.L.2    Greenberg, S.3    Ainsworth, W.4
  • 10
    • 46049084696
    • An auditory scene analysis approach to monaural speech segregation
    • Heidelberg, Germany: Springer
    • G. Hu and D. L. Wang, “An auditory scene analysis approach to monaural speech segregation,” in Acoustic Echo and Noise Control, E. Hansler and G. Schmidt, Eds. Heidelberg, Germany: Springer, 2006, pp. 485–515.
    • (2006) Acoustic Echo and Noise Control , pp. 485-515
    • Hu, G.1    Wang, D.L.2    Hansler, E.3    Schmidt, G.4
  • 11
    • 84892233308
    • On ideal binary mask as the computational goal of auditory scene analysis
    • Norwell, MA: Kluwer
    • D. L. Wang, “On ideal binary mask as the computational goal of auditory scene analysis,” in Speech Separation by Humans and Machines, P. Divenyi, Ed. Norwell, MA: Kluwer, 2005, pp. 181–197.
    • (2005) Speech Separation by Humans and Machines , pp. 181-197
    • Wang, D.L.1    Divenyi, P.2
  • 12
    • 49249107353
    • Segregation of unvoiced speech from non-speech interference
    • G. Hu and D. L. Wang, “Segregation of unvoiced speech from non-speech interference,” J. Acoust. Soc. Amer., vol. 124, pp. 1306–1319, 2008.
    • (2008) J. Acoust. Soc. Amer. , vol.124 , pp. 1306-1319
    • Hu, G.1    Wang, D.L.2
  • 13
    • 4644265990
    • Monaural speech segregation based on pitch tracking and amplitude modulation
    • Sep.
    • G. Hu and D. L. Wang, “Monaural speech segregation based on pitch tracking and amplitude modulation,” IEEE Trans. Neural Netw., vol. 15, no. 5, pp. 1135–1150, Sep. 2004.
    • (2004) IEEE Trans. Neural Netw. , vol.15 , Issue.5 , pp. 1135-1150
    • Hu, G.1    Wang, D.L.2
  • 14
    • 33745190137
    • Ph.D. dissertation, Media Lab., Mass. Inst. Technol., Cambridge, MA
    • Y. E. Kim, “Singing voice analysis/synthesis,” Ph.D. dissertation, Media Lab., Mass. Inst. Technol., Cambridge, MA, 2003.
    • (2003) Singing voice analysis/synthesis
    • Kim, Y.E.1
  • 15
    • 34547508425
    • Automatic synchronization between lyrics and music CD recordings based on viterbi alignment of segregated vocal signals
    • H. Fujihara, M. Goto, O. Jun, K. Komatani, T. Ogata, and H. G. Okuno, “Automatic synchronization between lyrics and music CD recordings based on viterbi alignment of segregated vocal signals,” in Proc. IEEE Int. Symp. Multimedia (ISM 2006), 2006, pp. 257–264.
    • (2006) Proc. IEEE Int. Symp. Multimedia (ISM 2006) , pp. 257-264
    • Fujihara, H.1    Goto, M.2    Jun, O.3    Komatani, K.4    Ogata, T.5    Okuno, H.G.6
  • 16
    • 51449099173
    • Three techniques for improving automatic synchronization between music and lyrics: Fricative sound detection, filler model, and novel feature vectors for vocal activity detection
    • Las Vegas, NV, Mar.–Apr.
    • H. Fujihara and M. Goto, “Three techniques for improving automatic synchronization between music and lyrics: Fricative sound detection, filler model, and novel feature vectors for vocal activity detection,” in Proc. 2008 IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP'08), Las Vegas, NV, Mar.–Apr. 2008, pp. 69–72.
    • (2008) Proc. 2008 IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP'08) , pp. 69-72
    • Fujihara, H.1    Goto, M.2
  • 17
    • 54049086684
    • Accompaniment separation and karaoke application based on automatic melody transcription
    • Hannover, Germany, Jun.
    • M. Ryynanen, T. Virtanen, J. Paulus, and A. Klapuri, “Accompaniment separation and karaoke application based on automatic melody transcription,” in Proc. 2008 IEEE Int. Conf. Multimedia Expo (ICME'08), Hannover, Germany, Jun. 2008, pp. 1417–1420.
    • (2008) Proc. 2008 IEEE Int. Conf. Multimedia Expo (ICME'08) , pp. 1417-1420
    • Ryynanen, M.1    Virtanen, T.2    Paulus, J.3    Klapuri, A.4
  • 18
    • 33745220723
    • On designing and evaluating speech event detectors
    • Lisbon, Portugal, Sep.
    • J. Li and C.-H. Lee, “On designing and evaluating speech event detectors,” in Proc. Interspeech, Lisbon, Portugal, Sep. 2005, pp. 3365–3368.
    • (2005) Proc. Interspeech , pp. 3365-3368
    • Li, J.1    Lee, C.-H.2
  • 19
    • 84866491801
    • An auditory streaming approach on melody extraction
    • Victoria, BC, Canada, Sept.
    • K. Dressler, “An auditory streaming approach on melody extraction,” in Extended Abstract for ISMIR 2006, Victoria, BC, Canada, Sept. 8–12, 2006.
    • (2006) Extended Abstract for ISMIR 2006
    • Dressler, K.1
  • 20
    • 84872702570
    • Sinusoidal extraction using an efficient implementation of a multi-resolution FFT
    • Montreal, Quebec, Canada, Sep. 18–20
    • K. Dressler, “Sinusoidal extraction using an efficient implementation of a multi-resolution FFT,” in Proc. Int. Conf. Digital Audio Effects (DAFx-06), Montreal, Quebec, Canada, Sep. 18–20, 2006, pp. 247–252.
    • (2006) Proc. Int. Conf. Digital Audio Effects (DAFx-06) , pp. 247-252
    • Dressler, K.1
  • 22
    • 85008519458
    • [Online]. Available: http://dea.brunel.ac.uk/cmsp/Home_Esfandiar/Sample wave Files.htm 2005
    • E. Zavarehei, Sample Speech Enhancement Methods, [Online]. Available: http://dea.brunel.ac.uk/cmsp/Home_Esfandiar/Sample wave Files.htm 2005
    • Sample Speech Enhancement Methods
    • Zavarehei, E.1
  • 23
    • 0003982501
    • A theory and computational model of auditory monaural sound separation
    • Ph.D. dissertation, Dept. Elect. Eng., Stanford Univ., Stanford, CA
    • M. Weintraub, “A theory and computational model of auditory monaural sound separation,” Ph.D. dissertation, Dept. Elect. Eng., Stanford Univ., Stanford, CA, 1985.
    • (1985)
    • Weintraub, M.1
  • 26
    • 0020102027
    • Least squares quantization in PCM
    • Mar.
    • S. P. Lloyd, “Least squares quantization in PCM,” IEEE Trans. Inf. Theory, vol. IT-28, no. 2, pp. 129–137, Mar. 1982.
    • (1982) IEEE Trans. Inf. Theory , vol.IT-28 , Issue.2 , pp. 129-137
    • Lloyd, S.P.1
  • 27
    • 0002629270
    • Maximum likelihood from incomplete data via the EM algorithm
    • A. P. Dempster, N. M. Laird, and D. B. Rubin, “Maximum likelihood from incomplete data via the EM algorithm,” J. R. Statist. Soc., vol. 39, pp. 1–38, 1977.
    • (1977) J. R. Statist. Soc. , vol.39 , pp. 1-38
    • Dempster, A.P.1    Laird, N.M.2    Rubin, D.B.3
  • 28
    • 0024610919
    • A tutorial on hidden Markov models and selected applications in speech recognition
    • Feb.
    • L. R. Rabiner, “A tutorial on hidden Markov models and selected applications in speech recognition,” Proc. IEEE, vol. 77, no. 2, pp. 257–286, Feb. 1989.
    • (1989) Proc. IEEE , vol.77 , Issue.2 , pp. 257-286
    • Rabiner, L.R.1
  • 29
    • 0348196088
    • Proposals for performance measurement in source separation
    • Nara, Apr.
    • R. Gribonval, L. Benaroya, E. Vincent, and C. Fevotte, “Proposals for performance measurement in source separation,” in Proc. Int. Symp. ICA BSS, Nara, Apr. 2003, pp. 763–768.
    • (2003) Proc. Int. Symp. ICA BSS , pp. 763-768
    • Gribonval, R.1    Benaroya, L.2    Vincent, E.3    Fevotte, C.4


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.