



Volume 16, Issue 5, 2014, Pages 1188-1200

A systematic evaluation of the bag-of-frames representation for music information retrieval

Author keywords

Bag of frames model; music information retrieval; sparse coding; unsupervised feature learning

Indexed keywords

COMPUTER MUSIC; INFORMATION RETRIEVAL; STATISTICS

EID: 84904754003     PISSN: 15209210     EISSN: None     Source Type: Journal    
DOI: 10.1109/TMM.2014.2311016     Document Type: Article
Times cited: 39

References (64)
  • 1
    • 33644513420
    • Efficient auditory coding
    • DOI 10.1038/nature04485, PII N04485
    • E. C. Smith and M. S. Lewicki, "Efficient auditory coding," Nature, vol. 439, no. 7079, pp. 978-982, 2006. (Pubitemid 43292416)
    • (2006) Nature , vol.439 , Issue.7079 , pp. 978-982
    • Smith, E.C.1    Lewicki, M.S.2
  • 2
    • 78650891032
    • On the use of sparse time-relative auditory codes for music
    • P. Manzagol, T. Bertin-Mahieux, and D. Eck, "On the use of sparse time-relative auditory codes for music," in Proc. ISMIR, 2008, pp. 14-18.
    • (2008) Proc. ISMIR , pp. 14-18
    • Manzagol, P.1    Bertin-Mahieux, T.2    Eck, D.3
  • 3
    • 78149304826
    • Sound retrieval and ranking using sparse auditory representations
    • R. F. Lyon, M. Rehn, S. Bengio, T. C. Walters, and G. Chechik, "Sound retrieval and ranking using sparse auditory representations," Neural Computat., vol. 22, pp. 2390-2416, 2010.
    • (2010) Neural Computat. , vol.22 , pp. 2390-2416
    • Lyon, R.F.1    Rehn, M.2    Bengio, S.3    Walters, T.C.4    Chechik, G.5
  • 4
    • 77952744810
    • Sparse representations in audio and music: From coding to source separation
    • M. D. Plumbley, T. Blumensath, L. Daudet, R. Gribonval, and M. E. Davies, "Sparse representations in audio and music: From coding to source separation," Proc. IEEE, vol. 98, no. 6, pp. 995-1005, 2010.
    • (2010) Proc. IEEE , vol.98 , Issue.6 , pp. 995-1005
    • Plumbley, M.D.1    Blumensath, T.2    Daudet, L.3    Gribonval, R.4    Davies, M.E.5
  • 5
    • 84863380535
    • Unsupervised feature learning for audio classification using convolutional deep belief networks
    • H. Lee, Y. Largman, P. Pham, and A. Y. Ng, "Unsupervised feature learning for audio classification using convolutional deep belief networks," in Proc. NIPS, 2009, pp. 1096-1104.
    • (2009) Proc. NIPS , pp. 1096-1104
    • Lee, H.1    Largman, Y.2    Pham, P.3    Ng, A.Y.4
  • 6
    • 84873584268
    • Learning features from music audio with deep belief networks
    • P. Hamel and D. Eck, "Learning features from music audio with deep belief networks," in Proc. ISMIR, 2010, pp. 339-344.
    • (2010) Proc. ISMIR , pp. 339-344
    • Hamel, P.1    Eck, D.2
  • 7
    • 84873578548
    • A classification-based polyphonic piano transcription approach using learned feature representations
    • J. Nam, J. Ngiam, H. Lee, and M. Slaney, "A classification-based polyphonic piano transcription approach using learned feature representations," in Proc. ISMIR, 2011, pp. 175-180.
    • (2011) Proc. ISMIR , pp. 175-180
    • Nam, J.1    Ngiam, J.2    Lee, H.3    Slaney, M.4
  • 8
    • 80051744215
    • Sparse approximations for drum sound classification
    • S. Scholler and H. Purwins, "Sparse approximations for drum sound classification," IEEE J. Select. Topics Signal Process., vol. 5, no. 5, pp. 933-940, 2011.
    • (2011) IEEE J. Select. Topics Signal Process. , vol.5 , Issue.5 , pp. 933-940
    • Scholler, S.1    Purwins, H.2
  • 9
    • 84873602768
    • Audio based music classification with a pre trained convolutional network
    • S. Dieleman, P. Brakel, and B. Schrauwen, "Audio based music classification with a pre trained convolutional network," in Proc. ISMIR, 2011, pp. 669-674.
    • (2011) Proc. ISMIR , pp. 669-674
    • Dieleman, S.1    Brakel, P.2    Schrauwen, B.3
  • 10
    • 84864122549
    • Unsupervised learning of sparse features for scalable audio classification
    • M. Henaff, K. Jarrett, K. Kavukcuoglu, and Y. LeCun, "Unsupervised learning of sparse features for scalable audio classification," in Proc. ISMIR, 2011, pp. 681-686.
    • (2011) Proc. ISMIR , pp. 681-686
    • Henaff, M.1    Jarrett, K.2    Kavukcuoglu, K.3    LeCun, Y.4
  • 11
    • 84873420121
    • Feature learning in dynamic environments: Modeling the acoustic structure of musical emotion
    • E. M. Schmidt, J. Scott, and Y. E. Kim, "Feature learning in dynamic environments: Modeling the acoustic structure of musical emotion," in Proc. ISMIR, 2012, pp. 325-330.
    • (2012) Proc. ISMIR , pp. 325-330
    • Schmidt, E.M.1    Scott, J.2    Kim, Y.E.3
  • 13
    • 84864120028
    • Supervised dictionary learning for music genre classification
    • C.-C. M. Yeh and Y.-H. Yang, "Supervised dictionary learning for music genre classification," in Proc. ACM ICMR, 2012.
    • (2012) Proc. ACM ICMR
    • Yeh, C.-C.M.1    Yang, Y.-H.2
  • 14
    • 84873444848
    • Learning sparse feature representations for music annotation and retrieval
    • J. Nam, J. Herrera, M. Slaney, and J. Smith, "Learning sparse feature representations for music annotation and retrieval," in Proc. ISMIR, 2012, pp. 565-570.
    • (2012) Proc. ISMIR , pp. 565-570
    • Nam, J.1    Herrera, J.2    Slaney, M.3    Smith, J.4
  • 15
    • 84873426072
    • Analyzing drum patterns using conditional deep belief networks
    • E. Battenberg and D. Wessel, "Analyzing drum patterns using conditional deep belief networks," in Proc. ISMIR, 2012, pp. 37-42.
    • (2012) Proc. ISMIR , pp. 37-42
    • Battenberg, E.1    Wessel, D.2
  • 16
    • 84904727378
    • Multipitch estimation of piano music by exemplar-based sparse representation
    • to be published
    • C.-T. Lee, Y.-H. Yang, and H. H. Chen, "Multipitch estimation of piano music by exemplar-based sparse representation," IEEE Trans. Multimedia, to be published.
    • IEEE Trans. Multimedia
    • Lee, C.-T.1    Yang, Y.-H.2    Chen, H.H.3
  • 17
    • 84873453413
    • Deep architectures and automatic feature learning in music informatics
    • E. J. Humphrey, J. P. Bello, and Y. LeCun, "Deep architectures and automatic feature learning in music informatics," in Proc. ISMIR, 2012, pp. 403-408.
    • (2012) Proc. ISMIR , pp. 403-408
    • Humphrey, E.J.1    Bello, J.P.2    LeCun, Y.3
  • 18
    • 71149119964
    • Online dictionary learning for sparse coding
    • J. Mairal, F. Bach, J. Ponce, and G. Sapiro, "Online dictionary learning for sparse coding," in Proc. ICML, 2009, pp. 689-696.
    • (2009) Proc. ICML , pp. 689-696
    • Mairal, J.1    Bach, F.2    Ponce, J.3    Sapiro, G.4
  • 19
    • 69349090197
    • Learning deep architectures for AI
    • Y. Bengio, "Learning deep architectures for AI," Found. Trends Mach. Learn., vol. 2, no. 1, pp. 1-127, 2009.
    • (2009) Found. Trends Mach. Learn. , vol.2 , Issue.1 , pp. 1-127
    • Bengio, Y.1
  • 20
    • 34547645414
    • The bag-of-frames approach to audio pattern recognition: A sufficient model for urban soundscapes but not for polyphonic music
    • DOI 10.1121/1.2750160
    • J.-J. Aucouturier, B. Defreville, and F. Pachet, "The bag-of-frames approach to audio pattern recognition: A sufficient model for urban soundscapes but not for polyphonic music," J. Acoust. Soc. Amer., vol. 122, no. 2, pp. 881-891, 2007. (Pubitemid 47205542)
    • (2007) Journal of the Acoustical Society of America , vol.122 , Issue.2 , pp. 881-891
    • Aucouturier, J.-J.1    Defreville, B.2    Pachet, F.3
  • 21
    • 84864146684
    • Temporal pooling and multiscale learning for automatic annotation and ranking of music audio
    • P. Hamel, S. Lemieux, Y. Bengio, and D. Eck, "Temporal pooling and multiscale learning for automatic annotation and ranking of music audio," in Proc. ISMIR, 2011, pp. 729-734.
    • (2011) Proc. ISMIR , pp. 729-734
    • Hamel, P.1    Lemieux, S.2    Bengio, Y.3    Eck, D.4
  • 24
    • 2942735564
    • A large-scale evaluation of acoustic and subjective music similarity measures
    • A. Berenzweig, B. Logan, D. P. W. Ellis, and B. Whitman, "A large-scale evaluation of acoustic and subjective music similarity measures," Comput. Music J., vol. 28, no. 2, pp. 63-76, 2004.
    • (2004) Comput. Music J. , vol.28 , Issue.2 , pp. 63-76
    • Berenzweig, A.1    Logan, B.2    Ellis, D.P.W.3    Whitman, B.4
  • 25
    • 0036648502
    • Musical genre classification of audio signals
    • DOI 10.1109/TSA.2002.800560, PII 1011092002800560
    • G. Tzanetakis and P. Cook, "Musical genre classification of audio signals," IEEE Trans. Speech Audio Process., vol. 10, no. 5, pp. 293-302, 2002. (Pubitemid 34950067)
    • (2002) IEEE Transactions on Speech and Audio Processing , vol.10 , Issue.5 , pp. 293-302
    • Tzanetakis, G.1    Cook, P.2
  • 28
    • 77949578539
    • A text retrieval approach to content-based audio retrieval
    • M. Riley, E. Heinen, and J. Ghosh, "A text retrieval approach to content-based audio retrieval," in Proc. ISMIR, 2008, pp. 295-300.
    • (2008) Proc. ISMIR , pp. 295-300
    • Riley, M.1    Heinen, E.2    Ghosh, J.3
  • 30
    • 80052532611
    • Music classification via the bag-of-features approach
    • Z. Fu, G. Lu, K.-M. Ting, and D. Zhang, "Music classification via the bag-of-features approach," Pattern Recognit. Lett., vol. 32, pp. 1768-1777, 2011.
    • (2011) Pattern Recognit. Lett. , vol.32 , pp. 1768-1777
    • Fu, Z.1    Lu, G.2    Ting, K.-M.3    Zhang, D.4
  • 31
    • 84890516096
    • Dual-layer bag-of-frames model for music genre classification
    • C.-C. M. Yeh, L. Su, and Y.-H. Yang, "Dual-layer bag-of-frames model for music genre classification," in Proc. IEEE ICASSP, 2013.
    • (2013) Proc. IEEE ICASSP
    • Yeh, C.-C.M.1    Su, L.2    Yang, Y.-H.3
  • 34
    • 84873465585
    • Unsupervised learning of local features for music classification
    • J. Wülfing and M. Riedmiller, "Unsupervised learning of local features for music classification," in Proc. ISMIR, 2012, pp. 139-144.
    • (2012) Proc. ISMIR , pp. 139-144
    • Wülfing, J.1    Riedmiller, M.2
  • 36
    • 84873573953
    • Identifying repeated patterns in music using sparse convolutive non-negative matrix factorization
    • R. J. Weiss and J. P. Bello, "Identifying repeated patterns in music using sparse convolutive non-negative matrix factorization," in Proc. ISMIR, 2011, pp. 123-128.
    • (2011) Proc. ISMIR , pp. 123-128
    • Weiss, R.J.1    Bello, J.P.2
  • 37
    • 84873594763
    • Learning the similarity of audio music in bag-of-frames representation from tagged music data
    • J.-C. Wang, H.-S. Lee, H.-M. Wang, and S.-K. Jeng, "Learning the similarity of audio music in bag-of-frames representation from tagged music data," in Proc. ISMIR, 2011.
    • (2011) Proc. ISMIR
    • Wang, J.-C.1    Lee, H.-S.2    Wang, H.-M.3    Jeng, S.-K.4
  • 40
    • 33846217029
    • A supervised classification algorithm for note onset detection
    • A. Lacoste and D. Eck, "A supervised classification algorithm for note onset detection," EURASIP J. Adv. Signal Process., pp. 1-14, 2007.
    • (2007) EURASIP J. Adv. Signal Process. , pp. 1-14
    • Lacoste, A.1    Eck, D.2
  • 41
    • 84867602860
    • Learning a robust tonnetz-space transform for automatic chord recognition
    • E. J. Humphrey, T. Cho, and J. P. Bello, "Learning a robust tonnetz-space transform for automatic chord recognition," in Proc. IEEE ICASSP, 2012, pp. 453-456.
    • (2012) Proc. IEEE ICASSP , pp. 453-456
    • Humphrey, E.J.1    Cho, T.2    Bello, J.P.3
  • 42
    • 80053442434
    • The importance of encoding versus training with sparse coding and vector quantization
    • A. Coates and A. Ng, "The importance of encoding versus training with sparse coding and vector quantization," in Proc. ICML, 2011, pp. 921-928.
    • (2011) Proc. ICML , pp. 921-928
    • Coates, A.1    Ng, A.2
  • 44
    • 51949098112
    • Classification using intersection kernel support vector machines is efficient
    • S. Maji, A. Berg, and J. Malik, "Classification using intersection kernel support vector machines is efficient," in Proc. IEEE CVPR, 2008, pp. 1-8.
    • (2008) Proc. IEEE CVPR , pp. 1-8
    • Maji, S.1    Berg, A.2    Malik, J.3
  • 45
    • 77952717202
    • Sparse representation for computer vision and pattern recognition
    • J. Wright, Y. Ma, J. Mairal, G. Sapiro, T. Huang, and S. Yan, "Sparse representation for computer vision and pattern recognition," Proc. IEEE, vol. 98, no. 6, pp. 1031-1044, 2010.
    • (2010) Proc. IEEE , vol.98 , Issue.6 , pp. 1031-1044
    • Wright, J.1    Ma, Y.2    Mairal, J.3    Sapiro, G.4    Huang, T.5    Yan, S.6
  • 46
    • 79960657803
    • Exemplar-based sparse representations for noise robust automatic speech recognition
    • J. F. Gemmeke, T. Virtanen, and A. Hurmalainen, "Exemplar-based sparse representations for noise robust automatic speech recognition," IEEE Trans. Audio, Speech, Lang. Process., vol. 19, no. 7, pp. 2067-2080, Sep. 2011.
    • (2011) IEEE Trans. Audio, Speech, Lang. Process. , vol.19 , Issue.7 , pp. 2067-2080
    • Gemmeke, J.F.1    Virtanen, T.2    Hurmalainen, A.3
  • 47
    • 77954583359
    • Web-scale k-means clustering
    • D. Sculley, "Web-scale k-means clustering," in Proc. ACM WWW, 2010, pp. 1177-1178.
    • (2010) Proc. ACM WWW , pp. 1177-1178
    • Sculley, D.1
  • 48
    • 33750383209
    • K-SVD: An algorithm for designing overcomplete dictionaries for sparse representation
    • DOI 10.1109/TSP.2006.881199
    • M. Aharon, M. Elad, and A. Bruckstein, "K-SVD: An algorithm for designing overcomplete dictionaries for sparse representation," IEEE Trans. Signal Process., vol. 54, no. 11, pp. 4311-4322, 2006. (Pubitemid 44637761)
    • (2006) IEEE Transactions on Signal Processing , vol.54 , Issue.11 , pp. 4311-4322
    • Aharon, M.1    Elad, M.2    Bruckstein, A.3
  • 50
    • 0001287271
    • Regression shrinkage and selection via the lasso
    • R. Tibshirani, "Regression shrinkage and selection via the lasso," J. Royal Statist. Soc., vol. 58, pp. 267-288, 1996.
    • (1996) J. Royal Statist. Soc. , vol.58 , pp. 267-288
    • Tibshirani, R.1
  • 51
    • 84864250650
    • An analysis of single-layer networks in unsupervised feature learning
    • A. Coates, H. Lee, and A. Ng, "An analysis of single-layer networks in unsupervised feature learning," in Proc. AISTATS, 2011.
    • (2011) Proc. AISTATS
    • Coates, A.1    Lee, H.2    Ng, A.3
  • 53
    • 70450209196
    • Linear spatial pyramid matching using sparse coding for image classification
    • J. Yang, K. Yu, Y. Gong, and T. Huang, "Linear spatial pyramid matching using sparse coding for image classification," in Proc. IEEE CVPR, 2009, pp. 1794-1801.
    • (2009) Proc. IEEE CVPR , pp. 1794-1801
    • Yang, J.1    Yu, K.2    Gong, Y.3    Huang, T.4
  • 54
    • 84864146684
    • Temporal pooling and multiscale learning for automatic annotation and ranking of music audio
    • P. Hamel, S. Lemieux, Y. Bengio, and D. Eck, "Temporal pooling and multiscale learning for automatic annotation and ranking of music audio," in Proc. ISMIR, 2011, pp. 729-734.
    • (2011) Proc. ISMIR , pp. 729-734
    • Hamel, P.1    Lemieux, S.2    Bengio, Y.3    Eck, D.4
  • 55
    • 77956502203
    • A theoretical analysis of feature pooling in visual recognition
    • Y.-L. Boureau, J. Ponce, and Y. LeCun, "A theoretical analysis of feature pooling in visual recognition," in Proc. ICML, 2010.
    • (2010) Proc. ICML
    • Boureau, Y.-L.1    Ponce, J.2    LeCun, Y.3
  • 56
    • 8844253324
    • Understanding inverse document frequency: On theoretical arguments for IDF
    • DOI 10.1108/00220410410560582
    • S. Robertson, "Understanding inverse document frequency: On theoretical arguments for IDF," J. Document., vol. 60, no. 5, pp. 503-520, 2004. (Pubitemid 39538229)
    • (2004) Journal of Documentation , vol.60 , Issue.5 , pp. 503-520
    • Robertson, S.1
  • 57
    • 80051515057
    • Exploring the music similarity space on the web
    • M. Schedl, T. Pohle, P. Knees, and G. Widmer, "Exploring the music similarity space on the web," ACM Trans. Inf. Syst., vol. 29, no. 3, pp. 14:1-14:24, 2011.
    • (2011) ACM Trans. Inf. Syst. , vol.29 , Issue.3 , pp. 14:1-14:24
    • Schedl, M.1    Pohle, T.2    Knees, P.3    Widmer, G.4
  • 59
    • 62249199807
    • Supervised and traditional term weighting methods for automatic text categorization
    • M. Lan, C. L. Tan, J. Su, and Y. Lu, "Supervised and traditional term weighting methods for automatic text categorization," IEEE Trans. Pattern Anal. Mach. Intell., vol. 31, pp. 721-735, 2009.
    • (2009) IEEE Trans. Pattern Anal. Mach. Intell. , vol.31 , pp. 721-735
    • Lan, M.1    Tan, C.L.2    Su, J.3    Lu, Y.4
  • 63
    • 84873447086
    • Multivariate autoregressive mixture models for music auto-tagging
    • E. Coviello, Y. Vaizman, A. B. Chan, and G. R. Lanckriet, "Multivariate autoregressive mixture models for music auto-tagging," in Proc. ISMIR, 2012, pp. 547-552.
    • (2012) Proc. ISMIR , pp. 547-552
    • Coviello, E.1    Vaizman, Y.2    Chan, A.B.3    Lanckriet, G.R.4
  • 64
    • 84859036701
    • Zipf's law in short-time timbral codings of speech, music, and environmental sound signals
    • M. Haro, J. Serrà, P. Herrera, and Á. Corral, "Zipf's law in short-time timbral codings of speech, music, and environmental sound signals," PLoS ONE, vol. 7, no. 3, 2012, e33993.
    • (2012) PLoS ONE , vol.7 , Issue.3
    • Haro, M.1    Serrà, J.2    Herrera, P.3    Corral, Á.4


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.