SCOPUS Information Retrieval Platform

Proceedings of the 2nd ACM International Conference on Multimedia Retrieval, ICMR 2012

Volume , Issue , 2012, Pages

Supervised dictionary learning for music genre classification

(2) Yeh, Chin Chia Michael (a); Yang, Yi Hsuan (a)

(a) RESEARCH CENTER FOR INFORMATION TECHNOLOGY INNOVATION (Taiwan)

Author keywords

Dictionary learning; Genre classification; Sparse coding

Indexed keywords

BENCHMARK DATASETS; CLASSIFICATION ACCURACY; CODEBOOK GENERATION; CODEBOOKS; CODEWORD; DICTIONARY LEARNING; GENRE CLASSIFICATION; LABELED DATA; LOCAL FEATURE; LOCAL FEATURE DESCRIPTOR; LOCAL FEATURE EXTRACTION; MUSIC GENRE CLASSIFICATION; OPTIMAL SETTING; PERFORMANCE STUDY; SPARSE CODING; SPECTROGRAMS; TIME-VARYING INFORMATION;

ENCODING (SYMBOLS); FEATURE EXTRACTION; LEARNING ALGORITHMS;

CLASSIFICATION (OF INFORMATION);

EID: 84864120028 PISSN: None EISSN: None Source Type: Conference Proceeding
DOI: 10.1145/2324796.2324859 Document Type: Conference Paper

Times cited : (25)

References (56)

1
- 84863752375
- A tensor-based approach for automatic music genre classification
- E. Benetos and C. Kotropoulos. A tensor-based approach for automatic music genre classification. In European Conf. Signal Processing, 2008.
- (2008) European Conf. Signal Processing
- Benetos, E.¹ Kotropoulos, C.²

2
- 33751531805
- Aggregate features and ADABOOST for music classification
- DOI 10.1007/s10994-006-9019-7, Special Issue on Machine Learning in and for Music
- J. Bergstra and B. Kegl. Aggregate features and adaboost for music classification. In Machine Learning, volume 65, pages 473-484, 2006. (Pubitemid 44836054)
- (2006) Machine Learning , vol.65 , Issue.2-3 , pp. 473-484
- Bergstra, J.¹ Casagrande, N.² Erhan, D.³ Eck, D.⁴ Kegl, B.⁵

3
- 84873599467
- Scalable genre and tag prediction with spectral covariance
- J. Bergstra, M. I. Mandel, and D. Eck. Scalable genre and tag prediction with spectral covariance. In ISMIR, pages 507-512, 2010.
- (2010) ISMIR , pp. 507-512
- Bergstra, J.¹ Mandel, M.I.² Eck, D.³

4
- 76949083547
- Enforcing harmonicity and smoothness in bayesian non-negative matrix factorization applied to polyphonic music transcription
- N. Bertin, R. Badeau, and E. Vincent. Enforcing harmonicity and smoothness in bayesian non-negative matrix factorization applied to polyphonic music transcription. IEEE Trans. Audio, Speech and Lang. Processing, 18:538-549, 2010.
- (2010) IEEE Trans. Audio, Speech and Lang. Processing , vol.18 , pp. 538-549
- Bertin, N.¹ Badeau, R.² Vincent, E.³

5
- 84864131717
- Large-scale cover song recognition using hashed chroma landmarks
- T. Bertin-Mahieux and D. Ellis. Large-scale cover song recognition using hashed chroma landmarks. In IEEE WASPAA, 2011.
- (2011) IEEE WASPAA
- Bertin-Mahieux, T.¹ Ellis, D.²

6
- 64649105397
- Content-based music information retrieval: Current directions and future challenges
- M. A. Casey, R. Veltkamp, M. Goto, M. Leman, C. Rhodes, and M. Slaney. Content-based music information retrieval: Current directions and future challenges. Proceedings of the IEEE, 96(4):668-696, 2008.
- (2008) Proceedings of the IEEE , vol.96 , Issue.4 , pp. 668-696
- Casey, M.A.¹ Veltkamp, R.² Goto, M.³ Leman, M.⁴ Rhodes, C.⁵ Slaney, M.⁶

7
- 84873607146
- Music genre classification via compressive sampling
- K. K. Chang, J.-S. R. Jang, and C. S. Iliopoulos. Music genre classification via compressive sampling. In ISMIR, pages 387-392, 2010.
- (2010) ISMIR , pp. 387-392
- Chang, K.K.¹ Jang, J.-S.R.² Iliopoulos, C.S.³

8
- 0032131292
- Atomic decomposition by basis pursuit
- PII S1064827596304010
- S. S. Chen, D. L. Donoho, and M. A. Saunders. Atomic decomposition by basis pursuit. SIAM J. Scientific Computing, 20:33-61, 1998. (Pubitemid 128689501)
- (1998) SIAM Journal of Scientific Computing , vol.20 , Issue.1 , pp. 33-61
- Chen, S.S.¹ Donoho, D.L.² Saunders, M.A.³

9
- 3242708140
- Least angle regression
- DOI 10.1214/009053604000000067
- B. Efron, T. Hastie, I. Johnstone, and R. Tibshirani. Least angle regression. Annals of Statistics, 32:407-499, 2004. (Pubitemid 41250302)
- (2004) Annals of Statistics , vol.32 , Issue.2 , pp. 407-499
- Efron, B.¹ Hastie, T.² Johnstone, I.³ Tibshirani, R.⁴ Ishwaran, H.⁵ Knight, K.⁶ Loubes, J.-M.⁷ Massart, P.⁸ Madigan, D.⁹ Ridgeway, G.¹⁰ Rosset, S.¹¹ Zhu, J.I.¹² Stine, R.A.¹³ Turlach, B.A.¹⁴ Weisberg, S.¹⁵ Hastie, T.¹⁶ Johnstone, I.¹⁷ Tibshirani, R.¹⁸

10
- 79952972450
- A survey of audio-based music classification and annotation
- Z. Fu, G. Lu, K. M. Ting, and D. Zhang. A survey of audio-based music classification and annotation. IEEE Trans. Multimedia, 13(99):303-319, 2011.
- (2011) IEEE Trans. Multimedia , vol.13 , Issue.99 , pp. 303-319
- Fu, Z.¹ Lu, G.² Ting, K.M.³ Zhang, D.⁴

11
- 79960657803
- Exemplar-based sparse representations for noise robust automatic speech recognition
- J. F. Gemmeke, T. Virtanen, and A. Hurmalainen. Exemplar-based sparse representations for noise robust automatic speech recognition. IEEE Trans. Audio, Speech and Lang. Processing, 19(7):2067-2080, 2011.
- (2011) IEEE Trans. Audio, Speech and Lang. Processing , vol.19 , Issue.7 , pp. 2067-2080
- Gemmeke, J.F.¹ Virtanen, T.² Hurmalainen, A.³

12
- 34249013658
- Relationships between musical structure and psychophysiological measures of emotion
- DOI 10.1037/1528-3542.7.2.377
- P. Gomez and B. Danuser. Relationships between musical structure and psychophysiological measures of emotion. Emotion, 7(2):377-87, 2007. (Pubitemid 46800378)
- (2007) Emotion , vol.7 , Issue.2 , pp. 377-387
- Gomez, P.¹ Danuser, B.²

13
- 84873584268
- Learning features from music audio with deep belief networks
- P. Hamel and D. Eck. Learning features from music audio with deep belief networks. In ISMIR, pages 339-344, 2010.
- (2010) ISMIR , pp. 339-344
- Hamel, P.¹ Eck, D.²

14
- 84864146684
- Temporal pooling and multiscale learning for automatic annotation and ranking of music audio
- P. Hamel, S. Lemieux, Y. Bengio, and D. Eck. Temporal pooling and multiscale learning for automatic annotation and ranking of music audio. In ISMIR, pages 729-734, 2011.
- (2011) ISMIR , pp. 729-734
- Hamel, P.¹ Lemieux, S.² Bengio, Y.³ Eck, D.⁴

15
- 84860189111
- Simple and practical algorithm for sparse Fourier transform
- H. Hassanieh, P. Indyk, D. Katabi, and E. Price. Simple and practical algorithm for sparse Fourier transform. In SODA, 2012.
- (2012) SODA
- Hassanieh, H.¹ Indyk, P.² Katabi, D.³ Price, E.⁴

16
- 84864122549
- Unsupervised learning of sparse features for scalable audio classification
- M. Henaff, K. Jarrett, K. Kavukcuoglu, and Y. LeCun. Unsupervised learning of sparse features for scalable audio classification. In ISMIR, pages 681-686, 2011.
- (2011) ISMIR , pp. 681-686
- Henaff, M.¹ Jarrett, K.² Kavukcuoglu, K.³ Lecun, Y.⁴

17
- 33745805403
- A fast learning algorithm for deep belief nets
- DOI 10.1162/neco.2006.18.7.1527
- G. E. Hinton, S. Osindero, and Y.-W. Teh. A fast learning algorithm for deep belief nets. Neural Comp., 18(7):1527-1554, 2006. (Pubitemid 44024729)
- (2006) Neural Computation , vol.18 , Issue.7 , pp. 1527-1554
- Hinton, G.E.¹ Osindero, S.² Teh, Y.-W.³

18
- 39649092019
- Musical genre classification using nonnegative matrix factorization-based features
- A. Holzapfel and Y. Stylianou. Musical genre classification using nonnegative matrix factorization-based features. IEEE Trans. Audio, Speech and Lang. Processing, 16(2):424-434, 2008.
- (2008) IEEE Trans. Audio, Speech and Lang. Processing , vol.16 , Issue.2 , pp. 424-434
- Holzapfel, A.¹ Stylianou, Y.²

19
- 80052234083
- Proximal methods for hierarchical sparse coding
- R. Jenatton, J. Mairal, G. Obozinski, and F. Bach. Proximal methods for hierarchical sparse coding. J. Machine Learning Research, 2011.
- (2011) J. Machine Learning Research
- Jenatton, R.¹ Mairal, J.² Obozinski, G.³ Bach, F.⁴

20
- 77950110253
- Acoustic topic model for audio information retrieval
- S. Kim, S. Narayanan, and S. Sundaram. Acoustic topic model for audio information retrieval. In IEEE WASPAA, pages 37-40, 2009.
- (2009) IEEE WASPAA , pp. 37-40
- Kim, S.¹ Narayanan, S.² Sundaram, S.³

21
- 84861135715
- Multipitch estimation of piano music by exemplar-based sparse representation
- to appear
- C.-T. Lee, Y.-H. Yang, and H. H. Chen. Multipitch estimation of piano music by exemplar-based sparse representation. IEEE Trans. Multimedia, 2012. to appear.
- (2012) IEEE Trans. Multimedia
- Lee, C.-T.¹ Yang, Y.-H.² Chen, H.H.³

22
- 63049114780
- Music information retrieval using social tags and audio
- M. Levy and M. Sandler. Music information retrieval using social tags and audio. IEEE Trans. Multimedia, 11:383-395, 2009.
- (2009) IEEE Trans. Multimedia , vol.11 , pp. 383-395
- Levy, M.¹ Sandler, M.²

23
- 1542439119
- A comparative study on content-based music genre classification
- New York, NY, USA. ACM
- T. Li, M. Ogihara, and Q. Li. A comparative study on content-based music genre classification. In Proceedings of the 26th annual international ACM SIGIR conference on Research and development in information retrieval, SIGIR '03, pages 282-289, New York, NY, USA, 2003. ACM.
- (2003) Proceedings of the 26th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR '03 , pp. 282-289
- Li, T.¹ Ogihara, M.² Li, Q.³

24
- 84864115628
- T. Lidy and A. Rauber. Evaluation of feature extractors and psycho-acoustic transformations for music genre classification, 2005.
- (2005) Evaluation of Feature Extractors and Psycho-acoustic Transformations for Music Genre Classification
- Lidy, T.¹ Rauber, A.²

25
- 84855897345
- Combining audio and symbolic descriptors for music classification from audio
- T. Lidy, A. Rauber, A. Pertusa, and J. M. Iñesta. Combining audio and symbolic descriptors for music classification from audio. In ISMIR, 2007.
- (2007) ISMIR
- Lidy, T.¹ Rauber, A.² Pertusa, A.³ Iñesta, J.M.⁴

26
- 3042535216
- Distinctive image features from scale-invariant keypoints
- D. G. Lowe. Distinctive image features from scale-invariant keypoints. Int. J. Comput. Vision, 60:91-110, 2004.
- (2004) Int. J. Comput. Vision , vol.60 , pp. 91-110
- Lowe, D.G.¹

27
- 85008010045
- Audio keywords discovery for text-like audio content analysis and retrieval
- L. Lu and A. Hanjalic. Audio keywords discovery for text-like audio content analysis and retrieval. IEEE Trans. Multimedia, 10(1):74-85, 2008.
- (2008) IEEE Trans. Multimedia , vol.10 , Issue.1 , pp. 74-85
- Lu, L.¹ Hanjalic, A.²

28
- 71149119964
- Online dictionary learning for sparse coding
- J. Mairal, F. Bach, J. Ponce, and G. Sapiro. Online dictionary learning for sparse coding. In Int. Conf. Machine Learning, pages 689-696, 2009.
- (2009) Int. Conf. Machine Learning , pp. 689-696
- Mairal, J.¹ Bach, F.² Ponce, J.³ Sapiro, G.⁴

29
- 70350618702
- Classification using intersection kernel support vector machines is efficient
- S. Maji, A. Berg, and J. Malik. Classification using intersection kernel support vector machines is efficient. In IEEE CVPR, pages 1-8, 2008.
- (2008) IEEE CVPR , pp. 1-8
- Maji, S.¹ Berg, A.² Malik, J.³

30
- 80053156439
- Learning tags that vary within a song
- M. I. Mandel, D. Eck, and Y. Bengio. Learning tags that vary within a song. In ISMIR, pages 399-404, 2010.
- (2010) ISMIR , pp. 399-404
- Mandel, M.I.¹ Eck, D.² Bengio, Y.³

31
- 84864115627
- Learning content similarity for music recommendation
- abs/1105.2344
- B. McFee, L. Barrington, and G. R. G. Lanckriet. Learning content similarity for music recommendation. CoRR, abs/1105.2344, 2011.
- (2011) CoRR
- McFee, B.¹ Barrington, L.² Lanckriet, G.R.G.³

32
- 49549085544
- Temporal feature integration for music genre classification
- A. Meng, P. Ahrendt, J. Larsen, and L. K. Hansen. Temporal feature integration for music genre classification. IEEE Trans. Audio, Speech and Lang. Processing, 15(5):1654-1664, 2007.
- (2007) IEEE Trans. Audio, Speech and Lang. Processing , vol.15 , Issue.5 , pp. 1654-1664
- Meng, A.¹ Ahrendt, P.² Larsen, J.³ Hansen, L.K.⁴

33
- 80052985466
- Signal processing for music analysis
- M. Müller, D. P. W. Ellis, A. Klapuri, and G. Richard. Signal processing for music analysis. J. Sel. Topics Signal Processing, 5(6):1088-1110, 2011.
- (2011) J. Sel. Topics Signal Processing , vol.5 , Issue.6 , pp. 1088-1110
- Müller, M.¹ Ellis, D.P.W.² Klapuri, A.³ Richard, G.⁴

34
- 23944456976
- Exploring music collections by browsing different views
- E. Pampalk, S. Dixon, and G. Widmer. Exploring music collections by browsing different views. In ISMIR, 2003.
- (2003) ISMIR
- Pampalk, E.¹ Dixon, S.² Widmer, G.³

35
- 84873545114
- Improvements of audio-based music similarity and genre classification
- E. Pampalk, A. Flexer, and G. Widmer. Improvements of audio-based music similarity and genre classification. In ISMIR, pages 628-633, 2005.
- (2005) ISMIR , pp. 628-633
- Pampalk, E.¹ Flexer, A.² Widmer, G.³

36
- 67349110496
- Citeseer
- I. Panagakis, E. Benetos, and C. Kotropoulos. Music genre classification: A multilinear approach, pages 583-588. Citeseer, 2008.
- (2008) Music Genre Classification: A Multilinear Approach , pp. 583-588
- Panagakis, I.¹ Benetos, E.² Kotropoulos, C.³

37
- 84873668627
- Music genre classification using locality preserving non-negative tensor factorization and sparse representations
- Y. Panagakis, C. Kotropoulos, and G. R. Arce. Music genre classification using locality preserving non-negative tensor factorization and sparse representations. In ISMIR, pages 249-254, 2009.
- (2009) ISMIR , pp. 249-254
- Panagakis, Y.¹ Kotropoulos, C.² Arce, G.R.³

38
- 76949097407
- Non-negative multilinear principal component analysis of auditory temporal modulations for music genre classification
- Y. Panagakis, C. Kotropoulos, and G. R. Arce. Non-negative multilinear principal component analysis of auditory temporal modulations for music genre classification. IEEE Trans. Audio, Speech, and Lang. Processing, 18(3):576-588, 2010.
- (2010) IEEE Trans. Audio, Speech, and Lang. Processing , vol.18 , Issue.3 , pp. 576-588
- Panagakis, Y.¹ Kotropoulos, C.² Arce, G.R.³

39
- 84863552197
- State of the art report: Audio-based music structure analysis
- J. Paulus, M. Müller, and A. Klapuri. State of the art report: Audio-based music structure analysis. In ISMIR, pages 625-636, 2010.
- (2010) ISMIR , pp. 625-636
- Paulus, J.¹ Müller, M.² Klapuri, A.³

40
- 77952744810
- Sparse representations in audio and music: From coding to source separation
- M. D. Plumbley, T. Blumensath, L. Daudet, R. Gribonval, and M. E. Davies. Sparse representations in audio and music: from coding to source separation. Proceedings of the IEEE, 98(6):995-1005, 2009.
- (2009) Proceedings of the IEEE , vol.98 , Issue.6 , pp. 995-1005
- Plumbley, M.D.¹ Blumensath, T.² Daudet, L.³ Gribonval, R.⁴ Davies, M.E.⁵

41
- 77949578539
- A text retrieval approach to content-based audio retrieval
- M. Riley, E. Heinen, and J. Ghosh. A text retrieval approach to content-based audio retrieval. In ISMIR, 2008.
- (2008) ISMIR
- Riley, M.¹ Heinen, E.² Ghosh, J.³

42
- 80052120713
- Enhancing multi-label music genre classification through ensemble techniques
- C. Sanden and J. Z. Zhang. Enhancing multi-label music genre classification through ensemble techniques. In SIGIR, pages 705-714, 2011.
- (2011) SIGIR , pp. 705-714
- Sanden, C.¹ Zhang, J.Z.²

43
- 0004094721
- MIT Press, Cambridge
- B. Schölkopf and A. J. Smola. Learning with Kernels. MIT Press, Cambridge, 2002.
- (2002) Learning with Kernels
- Schölkopf, B.¹ Smola, A.J.²

44
- 84905163906
- Constant-Q transform toolbox for music processing
- C. Schörkhuber and A. Klapuri. Constant-Q transform toolbox for music processing. In Sound and Music Computing Conf., 2010.
- (2010) Sound and Music Computing Conf.
- Schörkhuber, C.¹ Klapuri, A.²

45
- 77954583359
- Web-scale k-means clustering
- D. Sculley. Web-scale k-means clustering. In WWW, pages 1177-1178, 2010.
- (2010) WWW , pp. 1177-1178
- Sculley, D.¹

46
- 84856929813
- An integrated approach to music boundary detection
- M.-Y. Su, Y.-H. Yang, Y.-C. Lin, and H.-H. Chen. An integrated approach to music boundary detection. In ISMIR, pages 705-710, 2009.
- (2009) ISMIR , pp. 705-710
- Su, M.-Y.¹ Yang, Y.-H.² Lin, Y.-C.³ Chen, H.-H.⁴

47
- 0001287271
- Regression shrinkage and selection via the lasso
- R. Tibshirani. Regression shrinkage and selection via the lasso. J. Royal Statistical Soc., 58:267-288, 1996.
- (1996) J. Royal Statistical Soc. , vol.58 , pp. 267-288
- Tibshirani, R.¹

48
- 36448988223
- Towards musical query-by-semantic-description using the CAL500 data set
- DOI 10.1145/1277741.1277817, Proceedings of the 30th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR'07
- D. Turnbull, L. Barrington, D. Torres, and G. Lanckriet. Towards musical query-by-semantic-description using the CAL500 data set. In ACM SIGIR, pages 439-446, 2007. (Pubitemid 350164991)
- (2007) Proceedings of the 30th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR'07 , pp. 439-446
- Turnbull, D.¹ Barrington, L.² Torres, D.³ Lanckriet, G.⁴

49
- 0036648502
- Musical genre classification of audio signals
- DOI 10.1109/TSA.2002.800560, PII 1011092002800560
- G. Tzanetakis and P. Cook. Musical genre classification of audio signals. IEEE Trans. Speech and Audio Processing, 10(5):293-302, 2002. (Pubitemid 34950067)
- (2002) IEEE Transactions on Speech and Audio Processing , vol.10 , Issue.5 , pp. 293-302
- Tzanetakis, G.¹ Cook, P.²

50
- 84873594763
- Learning the similarity of audio music in bag-of-frames representation from tagged music data
- J.-C. Wang, H.-S. Lee, H.-M. Wang, and S.-K. Jeng. Learning the similarity of audio music in bag-of-frames representation from tagged music data. In ISMIR, 2011.
- (2011) ISMIR
- Wang, J.-C.¹ Lee, H.-S.² Wang, H.-M.³ Jeng, S.-K.⁴

51
- 84864146689
- Multi-tasking with joint semantic spaces for large-scale music annotation and retrieval
- to appear
- J. Weston, S. Bengio, and P. Hamel. Multi-tasking with joint semantic spaces for large-scale music annotation and retrieval. J. New Music Res., 2012. to appear.
- (2012) J. New Music Res.
- Weston, J.¹ Bengio, S.² Hamel, P.³

52
- 77952717202
- Sparse representation for computer vision and pattern recognition
- J. Wright, Y. Ma, J. Mairal, G. Sapiro, T. Huang, and S. Yan. Sparse representation for computer vision and pattern recognition. Proceedings of the IEEE, 98(6):1031-1044, 2010.
- (2010) Proceedings of the IEEE , vol.98 , Issue.6 , pp. 1031-1044
- Wright, J.¹ Ma, Y.² Mairal, J.³ Sapiro, G.⁴ Huang, T.⁵ Yan, S.⁶

53
- 37849036011
- Evaluating bag-of-visual-words representations in scene classification
- J. Yang, Y.-G. Jiang, A. G. Hauptmann, and C.-W. Ngo. Evaluating bag-of-visual-words representations in scene classification. In MIR, pages 197-206, 2007.
- (2007) MIR , pp. 197-206
- Yang, J.¹ Jiang, Y.-G.² Hauptmann, A.G.³ Ngo, C.-W.⁴

54
- 70450209196
- Linear spatial pyramid matching using sparse coding for image classification
- J. Yang, K. Yu, Y. Gong, and T. Huang. Linear spatial pyramid matching using sparse coding for image classification. In IEEE CVPR, pages 1794-1801, 2009.
- (2009) IEEE CVPR , pp. 1794-1801
- Yang, J.¹ Yu, K.² Gong, Y.³ Huang, T.⁴

55
- 84861049017
- CRC Press, Cambridge
- Y.-H. Yang and H. H. Chen. Music Emotion Recognition. CRC Press, Cambridge, 2011.
- (2011) Music Emotion Recognition
- Yang, Y.-H.¹ Chen, H.H.²

56
- 84864115629
- Online detection of unusual events in videos via dynamic sparse coding
- B. Zhao, L. Fei-Fei, and E. P. Xing. Online detection of unusual events in videos via dynamic sparse coding. In IEEE CVPR, 2011.
- (2011) IEEE CVPR
- Zhao, B.¹ Fei-Fei, L.² Xing, E.P.³

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.