Volume , Issue , 2012, Pages

Supervised dictionary learning for music genre classification

Author keywords

Dictionary learning; Genre classification; Sparse coding

Indexed keywords

BENCHMARK DATASETS; CLASSIFICATION ACCURACY; CODEBOOK GENERATION; CODEBOOKS; CODEWORD; DICTIONARY LEARNING; GENRE CLASSIFICATION; LABELED DATA; LOCAL FEATURE; LOCAL FEATURE DESCRIPTOR; LOCAL FEATURE EXTRACTION; MUSIC GENRE CLASSIFICATION; OPTIMAL SETTING; PERFORMANCE STUDY; SPARSE CODING; SPECTROGRAMS; TIME-VARYING INFORMATION;

EID: 84864120028     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1145/2324796.2324859     Document Type: Conference Paper
Times cited : (25)

References (56)
  • 2
    • 33751531805
    • Aggregate features and ADABOOST for music classification
    • DOI 10.1007/s10994-006-9019-7, Special Issue on Machine Learning in and for Music
    • J. Bergstra, N. Casagrande, D. Erhan, D. Eck, and B. Kegl. Aggregate features and AdaBoost for music classification. Machine Learning, 65(2-3):473-484, 2006. (Pubitemid 44836054)
    • (2006) Machine Learning , vol.65 , Issue.2-3 , pp. 473-484
    • Bergstra, J.1    Casagrande, N.2    Erhan, D.3    Eck, D.4    Kegl, B.5
  • 3
    • 84873599467
    • Scalable genre and tag prediction with spectral covariance
    • J. Bergstra, M. I. Mandel, and D. Eck. Scalable genre and tag prediction with spectral covariance. In ISMIR, pages 507-512, 2010.
    • (2010) ISMIR , pp. 507-512
    • Bergstra, J.1    Mandel, M.I.2    Eck, D.3
  • 4
    • 76949083547
    • Enforcing harmonicity and smoothness in bayesian non-negative matrix factorization applied to polyphonic music transcription
    • N. Bertin, R. Badeau, and E. Vincent. Enforcing harmonicity and smoothness in bayesian non-negative matrix factorization applied to polyphonic music transcription. IEEE Trans. Audio, Speech and Lang. Processing, 18:538-549, 2010.
    • (2010) IEEE Trans. Audio, Speech and Lang. Processing , vol.18 , pp. 538-549
    • Bertin, N.1    Badeau, R.2    Vincent, E.3
  • 5
    • 84864131717
    • Large-scale cover song recognition using hashed chroma landmarks
    • T. Bertin-Mahieux and D. Ellis. Large-scale cover song recognition using hashed chroma landmarks. In IEEE WASPAA, 2011.
    • (2011) IEEE WASPAA
    • Bertin-Mahieux, T.1    Ellis, D.2
  • 6
    • 64649105397
    • Content-based music information retrieval: Current directions and future challenges
    • M. A. Casey, R. Veltkamp, M. Goto, M. Leman, C. Rhodes, and M. Slaney. Content-based music information retrieval: Current directions and future challenges. Proceedings of the IEEE, 96(4):668-696, 2008.
    • (2008) Proceedings of the IEEE , vol.96 , Issue.4 , pp. 668-696
    • Casey, M.A.1    Veltkamp, R.2    Goto, M.3    Leman, M.4    Rhodes, C.5    Slaney, M.6
  • 7
    • 84873607146
    • Music genre classification via compressive sampling
    • K. K. Chang, J.-S. R. Jang, and C. S. Iliopoulos. Music genre classification via compressive sampling. In ISMIR, pages 387-392, 2010.
    • (2010) ISMIR , pp. 387-392
    • Chang, K.K.1    Jang, J.-S.R.2    Iliopoulos, C.S.3
  • 10
    • 79952972450
    • A survey of audio-based music classification and annotation
    • Z. Fu, G. Lu, K. M. Ting, and D. Zhang. A survey of audio-based music classification and annotation. IEEE Trans. Multimedia, 13(2):303-319, 2011.
    • (2011) IEEE Trans. Multimedia , vol.13 , Issue.2 , pp. 303-319
    • Fu, Z.1    Lu, G.2    Ting, K.M.3    Zhang, D.4
  • 11
  • 12
    • 34249013658
    • Relationships between musical structure and psychophysiological measures of emotion
    • DOI 10.1037/1528-3542.7.2.377
    • P. Gomez and B. Danuser. Relationships between musical structure and psychophysiological measures of emotion. Emotion, 7(2):377-387, 2007. (Pubitemid 46800378)
    • (2007) Emotion , vol.7 , Issue.2 , pp. 377-387
    • Gomez, P.1    Danuser, B.2
  • 13
    • 84873584268
    • Learning features from music audio with deep belief networks
    • P. Hamel and D. Eck. Learning features from music audio with deep belief networks. In ISMIR, pages 339-344, 2010.
    • (2010) ISMIR , pp. 339-344
    • Hamel, P.1    Eck, D.2
  • 14
    • 84864146684
    • Temporal pooling and multiscale learning for automatic annotation and ranking of music audio
    • P. Hamel, S. Lemieux, Y. Bengio, and D. Eck. Temporal pooling and multiscale learning for automatic annotation and ranking of music audio. In ISMIR, pages 729-734, 2011.
    • (2011) ISMIR , pp. 729-734
    • Hamel, P.1    Lemieux, S.2    Bengio, Y.3    Eck, D.4
  • 15
    • 84860189111
    • Simple and practical algorithm for sparse Fourier transform
    • H. Hassanieh, P. Indyk, D. Katabi, and E. Price. Simple and practical algorithm for sparse Fourier transform. In SODA, 2012.
    • (2012) SODA
    • Hassanieh, H.1    Indyk, P.2    Katabi, D.3    Price, E.4
  • 16
    • 84864122549
    • Unsupervised learning of sparse features for scalable audio classification
    • M. Henaff, K. Jarrett, K. Kavukcuoglu, and Y. LeCun. Unsupervised learning of sparse features for scalable audio classification. In ISMIR, pages 681-686, 2011.
    • (2011) ISMIR , pp. 681-686
    • Henaff, M.1    Jarrett, K.2    Kavukcuoglu, K.3    Lecun, Y.4
  • 17
    • 33745805403
    • A fast learning algorithm for deep belief nets
    • DOI 10.1162/neco.2006.18.7.1527
    • G. E. Hinton, S. Osindero, and Y.-W. Teh. A fast learning algorithm for deep belief nets. Neural Comp., 18(7):1527-1554, 2006. (Pubitemid 44024729)
    • (2006) Neural Computation , vol.18 , Issue.7 , pp. 1527-1554
    • Hinton, G.E.1    Osindero, S.2    Teh, Y.-W.3
  • 18
    • 39649092019
    • Musical genre classification using nonnegative matrix factorization-based features
    • A. Holzapfel and Y. Stylianou. Musical genre classification using nonnegative matrix factorization-based features. IEEE Trans. Audio, Speech and Lang. Processing, 16(2):424-434, 2008.
    • (2008) IEEE Trans. Audio, Speech and Lang. Processing , vol.16 , Issue.2 , pp. 424-434
    • Holzapfel, A.1    Stylianou, Y.2
  • 20
    • 77950110253
    • Acoustic topic model for audio information retrieval
    • S. Kim, S. Narayanan, and S. Sundaram. Acoustic topic model for audio information retrieval. In IEEE WASPAA, pages 37-40, 2009.
    • (2009) IEEE WASPAA , pp. 37-40
    • Kim, S.1    Narayanan, S.2    Sundaram, S.3
  • 21
    • 84861135715
    • Multipitch estimation of piano music by exemplar-based sparse representation
    • to appear
    • C.-T. Lee, Y.-H. Yang, and H. H. Chen. Multipitch estimation of piano music by exemplar-based sparse representation. IEEE Trans. Multimedia, 2012. to appear.
    • (2012) IEEE Trans. Multimedia
    • Lee, C.-T.1    Yang, Y.-H.2    Chen, H.H.3
  • 22
    • 63049114780
    • Music information retrieval using social tags and audio
    • M. Levy and M. Sandler. Music information retrieval using social tags and audio. IEEE Trans. Multimedia, 11:383-395, 2009.
    • (2009) IEEE Trans. Multimedia , vol.11 , pp. 383-395
    • Levy, M.1    Sandler, M.2
  • 25
    • 84855897345
    • Combining audio and symbolic descriptors for music classification from audio
    • T. Lidy, A. Rauber, A. Pertusa, and J. M. Iñesta. Combining audio and symbolic descriptors for music classification from audio. In ISMIR, 2007.
    • (2007) ISMIR
    • Lidy, T.1    Rauber, A.2    Pertusa, A.3    Iñesta, J.M.4
  • 26
    • 3042535216
    • Distinctive image features from scale-invariant keypoints
    • D. G. Lowe. Distinctive image features from scale-invariant keypoints. Int. J. Comput. Vision, 60:91-110, 2004.
    • (2004) Int. J. Comput. Vision , vol.60 , pp. 91-110
    • Lowe, D.G.1
  • 27
    • 85008010045
    • Audio keywords discovery for text-like audio content analysis and retrieval
    • L. Lu and A. Hanjalic. Audio keywords discovery for text-like audio content analysis and retrieval. IEEE Trans. Multimedia, 10(1):74-85, 2008.
    • (2008) IEEE Trans. Multimedia , vol.10 , Issue.1 , pp. 74-85
    • Lu, L.1    Hanjalic, A.2
  • 29
    • 70350618702
    • Classification using intersection kernel support vector machines is efficient
    • S. Maji, A. Berg, and J. Malik. Classification using intersection kernel support vector machines is efficient. In IEEE CVPR, pages 1-8, 2008.
    • (2008) IEEE CVPR , pp. 1-8
    • Maji, S.1    Berg, A.2    Malik, J.3
  • 30
    • 80053156439
    • Learning tags that vary within a song
    • M. I. Mandel, D. Eck, and Y. Bengio. Learning tags that vary within a song. In ISMIR, pages 399-404, 2010.
    • (2010) ISMIR , pp. 399-404
    • Mandel, M.I.1    Eck, D.2    Bengio, Y.3
  • 31
    • 84864115627
    • Learning content similarity for music recommendation
    • abs/1105.2344
    • B. McFee, L. Barrington, and G. R. G. Lanckriet. Learning content similarity for music recommendation. CoRR, abs/1105.2344, 2011.
    • (2011) CoRR
    • McFee, B.1    Barrington, L.2    Lanckriet, G.R.G.3
  • 34
    • 23944456976
    • Exploring music collections by browsing different views
    • E. Pampalk, S. Dixon, and G. Widmer. Exploring music collections by browsing different views. In ISMIR, 2003.
    • (2003) ISMIR
    • Pampalk, E.1    Dixon, S.2    Widmer, G.3
  • 35
    • 84873545114
    • Improvements of audio-based music similarity and genre classification
    • E. Pampalk, A. Flexer, and G. Widmer. Improvements of audio-based music similarity and genre classification. In ISMIR, pages 628-633, 2005.
    • (2005) ISMIR , pp. 628-633
    • Pampalk, E.1    Flexer, A.2    Widmer, G.3
  • 37
    • 84873668627
    • Music genre classification using locality preserving non-negative tensor factorization and sparse representations
    • Y. Panagakis, C. Kotropoulos, and G. R. Arce. Music genre classification using locality preserving non-negative tensor factorization and sparse representations. In ISMIR, pages 249-254, 2009.
    • (2009) ISMIR , pp. 249-254
    • Panagakis, Y.1    Kotropoulos, C.2    Arce, G.R.3
  • 38
    • 76949097407
    • Non-negative multilinear principal component analysis of auditory temporal modulations for music genre classification
    • Y. Panagakis, C. Kotropoulos, and G. R. Arce. Non-negative multilinear principal component analysis of auditory temporal modulations for music genre classification. IEEE Trans. Audio, Speech, and Lang. Processing, 18(3):576-588, 2010.
    • (2010) IEEE Trans. Audio, Speech, and Lang. Processing , vol.18 , Issue.3 , pp. 576-588
    • Panagakis, Y.1    Kotropoulos, C.2    Arce, G.R.3
  • 39
    • 84863552197
    • State of the art report: Audio-based music structure analysis
    • J. Paulus, M. Müller, and A. Klapuri. State of the art report: Audio-based music structure analysis. In ISMIR, pages 625-636, 2010.
    • (2010) ISMIR , pp. 625-636
    • Paulus, J.1    Müller, M.2    Klapuri, A.3
  • 41
    • 77949578539
    • A text retrieval approach to content-based audio retrieval
    • M. Riley, E. Heinen, and J. Ghosh. A text retrieval approach to content-based audio retrieval. In ISMIR, 2008.
    • (2008) ISMIR
    • Riley, M.1    Heinen, E.2    Ghosh, J.3
  • 42
    • 80052120713
    • Enhancing multi-label music genre classification through ensemble techniques
    • C. Sanden and J. Z. Zhang. Enhancing multi-label music genre classification through ensemble techniques. In SIGIR, pages 705-714, 2011.
    • (2011) SIGIR , pp. 705-714
    • Sanden, C.1    Zhang, J.Z.2
  • 45
    • 77954583359
    • Web-scale k-means clustering
    • D. Sculley. Web-scale k-means clustering. In WWW, pages 1177-1178, 2010.
    • (2010) WWW , pp. 1177-1178
    • Sculley, D.1
  • 46
    • 84856929813
    • An integrated approach to music boundary detection
    • M.-Y. Su, Y.-H. Yang, Y.-C. Lin, and H.-H. Chen. An integrated approach to music boundary detection. In ISMIR, pages 705-710, 2009.
    • (2009) ISMIR , pp. 705-710
    • Su, M.-Y.1    Yang, Y.-H.2    Lin, Y.-C.3    Chen, H.-H.4
  • 47
    • 0001287271
    • Regression shrinkage and selection via the lasso
    • R. Tibshirani. Regression shrinkage and selection via the lasso. J. Royal Statistical Soc., 58:267-288, 1996.
    • (1996) J. Royal Statistical Soc. , vol.58 , pp. 267-288
    • Tibshirani, R.1
  • 49
    • 0036648502
    • Musical genre classification of audio signals
    • DOI 10.1109/TSA.2002.800560, PII 1011092002800560
    • G. Tzanetakis and P. Cook. Musical genre classification of audio signals. IEEE Trans. Speech and Audio Processing, 10(5):293-302, 2002. (Pubitemid 34950067)
    • (2002) IEEE Transactions on Speech and Audio Processing , vol.10 , Issue.5 , pp. 293-302
    • Tzanetakis, G.1    Cook, P.2
  • 50
    • 84873594763
    • Learning the similarity of audio music in bag-of-frames representation from tagged music data
    • J.-C. Wang, H.-S. Lee, H.-M. Wang, and S.-K. Jeng. Learning the similarity of audio music in bag-of-frames representation from tagged music data. In ISMIR, 2011.
    • (2011) ISMIR
    • Wang, J.-C.1    Lee, H.-S.2    Wang, H.-M.3    Jeng, S.-K.4
  • 51
    • 84864146689
    • Multi-tasking with joint semantic spaces for large-scale music annotation and retrieval
    • to appear
    • J. Weston, S. Bengio, and P. Hamel. Multi-tasking with joint semantic spaces for large-scale music annotation and retrieval. J. New Music Res., 2012. to appear.
    • (2012) J. New Music Res.
    • Weston, J.1    Bengio, S.2    Hamel, P.3
  • 52
    • 77952717202
    • Sparse representation for computer vision and pattern recognition
    • J. Wright, Y. Ma, J. Mairal, G. Sapiro, T. Huang, and S. Yan. Sparse representation for computer vision and pattern recognition. Proceedings of the IEEE, 98(6):1031-1044, 2010.
    • (2010) Proceedings of the IEEE , vol.98 , Issue.6 , pp. 1031-1044
    • Wright, J.1    Ma, Y.2    Mairal, J.3    Sapiro, G.4    Huang, T.5    Yan, S.6
  • 53
    • 37849036011
    • Evaluating bag-of-visual-words representations in scene classification
    • J. Yang, Y.-G. Jiang, A. G. Hauptmann, and C.-W. Ngo. Evaluating bag-of-visual-words representations in scene classification. In MIR, pages 197-206, 2007.
    • (2007) MIR , pp. 197-206
    • Yang, J.1    Jiang, Y.-G.2    Hauptmann, A.G.3    Ngo, C.-W.4
  • 54
    • 70450209196
    • Linear spatial pyramid matching using sparse coding for image classification
    • J. Yang, K. Yu, Y. Gong, and T. Huang. Linear spatial pyramid matching using sparse coding for image classification. In IEEE CVPR, pages 1794-1801, 2009.
    • (2009) IEEE CVPR , pp. 1794-1801
    • Yang, J.1    Yu, K.2    Gong, Y.3    Huang, T.4
  • 56
    • 84864115629
    • Online detection of unusual events in videos via dynamic sparse coding
    • B. Zhao, L. Fei-Fei, and E. P. Xing. Online detection of unusual events in videos via dynamic sparse coding. In IEEE CVPR, 2011.
    • (2011) IEEE CVPR
    • Zhao, B.1    Fei-Fei, L.2    Xing, E.P.3


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.