Volume 45, Issue 1, 2015, Pages 1-34

Greedy column subset selection for large-scale data sets

Author keywords

Big data; Column subset selection; Distributed computing; Greedy algorithms; MapReduce

Indexed keywords

ADVANCED ANALYTICS; APPROXIMATION THEORY; BIG DATA; DATA ANALYTICS; DISTRIBUTED COMPUTER SYSTEMS; ERRORS; GENETIC ALGORITHMS; SET THEORY

EID: 84942294797     PISSN: 02191377     EISSN: 02193116     Source Type: Journal    
DOI: 10.1007/s10115-014-0801-8     Document Type: Article
Times cited: 52

References (59)
  • 1
    • 0038166193
    • Database-friendly random projections: Johnson-Lindenstrauss with binary coins
    • Achlioptas D (2003) Database-friendly random projections: Johnson-Lindenstrauss with binary coins. J Comput Syst Sci 66(4):671–687
    • (2003) J Comput Syst Sci , vol.66 , Issue.4 , pp. 671-687
    • Achlioptas, D.1
  • 2
    • 0032083715
    • Computing rank-revealing QR factorizations of dense matrices
    • Bischof C, Quintana-Ortí G (1998) Computing rank-revealing QR factorizations of dense matrices. ACM Trans Math Softw 24(2):226–253
    • (1998) ACM Trans Math Softw , vol.24 , Issue.2 , pp. 226-253
    • Bischof, C.1    Quintana-Ortí, G.2
  • 3
    • 84863303500
    • Near optimal column-based matrix reconstruction. In: Proceedings of the 52nd annual IEEE symposium on foundations of computer science (FOCS’11)
    • Boutsidis C, Drineas P, Magdon-Ismail M (2011) Near optimal column-based matrix reconstruction. In: Proceedings of the 52nd annual IEEE symposium on foundations of computer science (FOCS’11), pp 305–314
    • (2011) pp 305–314
    • Boutsidis, C.1    Drineas, P.2    Magdon-Ismail, M.3
  • 4
    • 84908877925
    • An improved approximation algorithm for the column subset selection problem
    • Boutsidis C, Mahoney MW, Drineas P (2008a) An improved approximation algorithm for the column subset selection problem, CoRR abs/0812.4293
    • (2008) CoRR abs/0812.4293
    • Boutsidis, C.1    Mahoney, M.W.2    Drineas, P.3
  • 5
    • 65449139217
    • Unsupervised feature selection for principal components analysis. In: Li Y, Liu B, Sarawagi S (eds) Proceedings of the 14th ACM SIGKDD international conference on knowledge discovery and data mining (KDD’08). ACM
    • Boutsidis C, Mahoney MW, Drineas P (2008b) Unsupervised feature selection for principal components analysis. In: Li Y, Liu B, Sarawagi S (eds) Proceedings of the 14th ACM SIGKDD international conference on knowledge discovery and data mining (KDD’08). ACM, New York, pp 61–69
    • (2008) New York , pp. 61-69
    • Boutsidis, C.1    Mahoney, M.W.2    Drineas, P.3
  • 6
    • 70349152160
    • An improved approximation algorithm for the column subset selection problem. In: Proceedings of the 20th annual ACM-SIAM symposium on discrete algorithms (SODA’09)
    • Boutsidis C, Mahoney MW, Drineas P (2009) An improved approximation algorithm for the column subset selection problem. In: Proceedings of the 20th annual ACM-SIAM symposium on discrete algorithms (SODA’09), pp 968–977
    • (2009) pp 968–977
    • Boutsidis, C.1    Mahoney, M.W.2    Drineas, P.3
  • 7
    • 70349339356
    • Clustered subset selection and its applications on IT service metrics. In: Proceedings of the 17th ACM conference on information and knowledge management (CIKM’08)
    • Boutsidis C, Sun J, Anerousis N (2008) Clustered subset selection and its applications on IT service metrics. In: Proceedings of the 17th ACM conference on information and knowledge management (CIKM’08), pp 599–608
    • (2008) pp 599–608
    • Boutsidis, C.1    Sun, J.2    Anerousis, N.3
  • 9
  • 10
    • 84856610315
    • Column subset selection via sparse approximation of SVD
    • Çivril A, Magdon-Ismail M (2012) Column subset selection via sparse approximation of SVD. Theoret Comput Sci 421:1–14
    • (2012) Theoret Comput Sci , vol.421 , pp. 1-14
    • Çivril, A.1    Magdon-Ismail, M.2
  • 11
    • 0002170151
    • Rank revealing QR factorizations
    • Chan T (1987) Rank revealing QR factorizations. Linear Algebra Appl 88:67–82
    • (1987) Linear Algebra Appl , vol.88 , pp. 67-82
    • Chan, T.1
  • 14
    • 0037236821
    • An elementary proof of a theorem of Johnson and Lindenstrauss
    • Dasgupta S, Gupta A (2003) An elementary proof of a theorem of Johnson and Lindenstrauss. Random Struct Algorithms 22(1):60–65
    • (2003) Random Struct Algorithms , vol.22 , Issue.1 , pp. 60-65
    • Dasgupta, S.1    Gupta, A.2
  • 15
    • 37549003336
    • MapReduce: simplified data processing on large clusters
    • Dean J, Ghemawat S (2008) MapReduce: simplified data processing on large clusters. Commun ACM 51(1):107–113
    • (2008) Commun ACM , vol.51 , Issue.1 , pp. 107-113
    • Dean, J.1    Ghemawat, S.2
  • 17
    • 78751516882
    • Efficient volume sampling for row/column subset selection. In: Proceedings of the 51st annual IEEE symposium on foundations of computer science (FOCS’10)
    • Deshpande A, Rademacher L (2010) Efficient volume sampling for row/column subset selection. In: Proceedings of the 51st annual IEEE symposium on foundations of computer science (FOCS’10), pp 329–338
    • (2010) pp 329–338
    • Deshpande, A.1    Rademacher, L.2
  • 18
    • 45849092005
    • Matrix approximation and projective clustering via volume sampling
    • Deshpande A, Rademacher L, Vempala S, Wang G (2006a) Matrix approximation and projective clustering via volume sampling. Theory Comput 2(1):225–247
    • (2006) Theory Comput , vol.2 , Issue.1 , pp. 225-247
    • Deshpande, A.1    Rademacher, L.2    Vempala, S.3    Wang, G.4
  • 20
    • 3142750484
    • Clustering large graphs via the singular value decomposition
    • Drineas P, Frieze A, Kannan R, Vempala S, Vinay V (2004) Clustering large graphs via the singular value decomposition. Mach Learn 56(1–3):9–33
    • (2004) Mach Learn , vol.56 , Issue.1-3 , pp. 9-33
    • Drineas, P.1    Frieze, A.2    Kannan, R.3    Vempala, S.4    Vinay, V.5
  • 21
    • 33751075906
    • Fast Monte Carlo algorithms for matrices II: computing a low-rank approximation to a matrix
    • Drineas P, Kannan R, Mahoney M (2007) Fast Monte Carlo algorithms for matrices II: computing a low-rank approximation to a matrix. SIAM J Comput 36(1):158–183
    • (2007) SIAM J Comput , vol.36 , Issue.1 , pp. 158-183
    • Drineas, P.1    Kannan, R.2    Mahoney, M.3
  • 22
    • 33750079844
    • Subspace sampling and relative-error matrix approximation: column-based methods
    • Drineas P, Mahoney M, Muthukrishnan S (2006) Subspace sampling and relative-error matrix approximation: column-based methods. Approximation, randomization, and combinatorial optimization. Algorithms and techniques. Springer, Berlin, pp 316–326
  • 23
    • 84942300515
    • Embed and conquer: scalable embeddings for kernel k-means on MapReduce
    • Elgohary A, Farahat AK, Kamel MS, Karray F (2013) Embed and conquer: scalable embeddings for kernel k-means on MapReduce, CoRR abs/1311.2334
    • (2013) CoRR abs/1311.2334
    • Elgohary, A.1    Farahat, A.K.2    Kamel, M.S.3    Karray, F.4
  • 24
    • 84859921422
    • Pairwise document similarity in large collections with MapReduce. In: Proceedings of the 46th annual meeting of the association for computational linguistics on human language technologies: short Papers (HLT’08)
    • Elsayed T, Lin J, Oard DW (2008) Pairwise document similarity in large collections with MapReduce. In: Proceedings of the 46th annual meeting of the association for computational linguistics on human language technologies: short Papers (HLT’08), pp 265–268
    • (2008) pp 265–268
    • Elsayed, T.1    Lin, J.2    Oard, D.W.3
  • 25
    • 80052666000
    • Fast clustering using MapReduce. In: Proceedings of the 17th ACM SIGKDD international conference on knowledge discovery and data mining (KDD’11)
    • Ene A, Im S, Moseley B (2011) Fast clustering using MapReduce. In: Proceedings of the 17th ACM SIGKDD international conference on knowledge discovery and data mining (KDD’11), pp 681–689
    • (2011) pp 681–689
    • Ene, A.1    Im, S.2    Moseley, B.3
  • 26
    • 84894656842
    • Distributed column subset selection on MapReduce. In: Proceedings of the 13th IEEE international conference on data mining (ICDM’13)
    • Farahat A, Elgohary A, Ghodsi A, Kamel M (2013) Distributed column subset selection on MapReduce. In: Proceedings of the 13th IEEE international conference on data mining (ICDM’13), pp 171–180
    • (2013) pp 171–180
    • Farahat, A.1    Elgohary, A.2    Ghodsi, A.3    Kamel, M.4
  • 27
    • 84857146334
    • An efficient greedy method for unsupervised feature selection. In: Proceedings of the 11th IEEE international conference on data mining (ICDM’11)
    • Farahat AK, Ghodsi A, Kamel MS (2011) An efficient greedy method for unsupervised feature selection. In: Proceedings of the 11th IEEE international conference on data mining (ICDM’11), pp 161–170
    • (2011) pp 161–170
    • Farahat, A.K.1    Ghodsi, A.2    Kamel, M.S.3
  • 28
    • 84876024189
    • Efficient greedy feature selection for unsupervised learning
    • Farahat AK, Ghodsi A, Kamel MS (2013) Efficient greedy feature selection for unsupervised learning. Knowl Inf Syst 35(2):285–310
    • (2013) Knowl Inf Syst , vol.35 , Issue.2 , pp. 285-310
    • Farahat, A.K.1    Ghodsi, A.2    Kamel, M.S.3
  • 29
    • 0032308232
    • Fast Monte-Carlo algorithms for finding low-rank approximations. In: Proceedings of the 39th annual IEEE symposium on foundations of computer science (FOCS’98)
    • Frieze A, Kannan R, Vempala S (1998) Fast Monte-Carlo algorithms for finding low-rank approximations. In: Proceedings of the 39th annual IEEE symposium on foundations of computer science (FOCS’98), pp 370–378
    • (1998) pp 370–378
    • Frieze, A.1    Kannan, R.2    Vempala, S.3
  • 31
    • 0003216822
    • Efficient algorithms for computing a strong rank-revealing QR factorization
    • Gu M, Eisenstat SC (1996) Efficient algorithms for computing a strong rank-revealing QR factorization. SIAM J Sci Comput 17(4):848–869
    • (1996) SIAM J Sci Comput , vol.17 , Issue.4 , pp. 848-869
    • Gu, M.1    Eisenstat, S.C.2
  • 32
    • 84860200361
    • Optimal column-based low-rank matrix reconstruction. In: Proceedings of the 23rd annual ACM-SIAM symposium on discrete algorithms (SODA’12)
    • Guruswami V, Sinop AK (2012) Optimal column-based low-rank matrix reconstruction. In: Proceedings of the 23rd annual ACM-SIAM symposium on discrete algorithms (SODA’12), pp 1207–1214
    • (2012) pp 1207–1214
    • Guruswami, V.1    Sinop, A.K.2
  • 33
    • 81555213068
    • An algorithm for the principal component analysis of large data sets
    • Halko N, Martinsson P-G, Shkolnisky Y, Tygert M (2011) An algorithm for the principal component analysis of large data sets. SIAM J Sci Comput 33(5):2580–2594
    • (2011) SIAM J Sci Comput , vol.33 , Issue.5 , pp. 2580-2594
    • Halko, N.1    Martinsson, P.-G.2    Shkolnisky, Y.3    Tygert, M.4
  • 40
    • 77951678492
    • A model of computation for MapReduce. In: Proceedings of the 21st annual ACM-SIAM symposium on discrete algorithms (SODA’10)
    • Karloff H, Suri S, Vassilvitskii S (2010) A model of computation for MapReduce. In: Proceedings of the 21st annual ACM-SIAM symposium on discrete algorithms (SODA’10), pp 938–948
    • (2010) pp 938–948
    • Karloff, H.1    Suri, S.2    Vassilvitskii, S.3
  • 41
    • 85018101587
    • CLUTO—a clustering toolkit, technical report #02-017. University of Minnesota
    • Karypis G (2003) CLUTO—a clustering toolkit, technical report #02-017. University of Minnesota, Department of Computer Science
    • (2003) Department of Computer Science
    • Karypis, G.1
  • 43
    • 18144420071
    • Acquiring linear subspaces for face recognition under variable lighting
    • Lee K, Ho J, Kriegman D (2005) Acquiring linear subspaces for face recognition under variable lighting. IEEE Trans Pattern Anal Mach Intell 27(5):684–698
    • (2005) IEEE Trans Pattern Anal Mach Intell , vol.27 , Issue.5 , pp. 684-698
    • Lee, K.1    Ho, J.2    Kriegman, D.3
  • 45
    • 84876811202
    • RCV1: a new benchmark collection for text categorization research
    • Lewis DD, Yang Y, Rose TG, Li F (2004) RCV1: a new benchmark collection for text categorization research. J Mach Learn Res 5:361–397
    • (2004) J Mach Learn Res , vol.5 , pp. 361-397
    • Lewis, D.D.1    Yang, Y.2    Rose, T.G.3    Li, F.4
  • 46
    • 33749573641
    • Very sparse random projections. In: Proceedings of the 12th ACM SIGKDD international conference on knowledge discovery and data mining (KDD’06)
    • Li P, Hastie TJ, Church KW (2006) Very sparse random projections. In: Proceedings of the 12th ACM SIGKDD international conference on knowledge discovery and data mining (KDD’06), pp 287–296
    • (2006) pp 287–296
    • Li, P.1    Hastie, T.J.2    Church, K.W.3
  • 49
    • 77956505189
    • Convex principal feature selection. In: Proceedings of SIAM international conference on data mining (SDM)
    • Masaeli M, Yan Y, Cui Y, Fung G, Dy J (2010) Convex principal feature selection. In: Proceedings of SIAM international conference on data mining (SDM), pp 619–628
    • (2010) pp 619–628
    • Masaeli, M.1    Yan, Y.2    Cui, Y.3    Fung, G.4    Dy, J.5
  • 50
    • 85018097192
    • Robust regression on MapReduce. In: Proceedings of the 30th international conference on machine learning (ICML-13)
    • Meng X, Mahoney M (2013) Robust regression on MapReduce. In: Proceedings of the 30th international conference on machine learning (ICML-13), pp 888–896
    • (2013) pp 888–896
    • Meng, X.1    Mahoney, M.2
  • 51
    • 0036522403
    • Unsupervised feature selection using feature similarity
    • Mitra P, Murthy C, Pal S (2002) Unsupervised feature selection using feature similarity. IEEE Trans Pattern Anal Mach Intell 24(3):301–312
    • (2002) IEEE Trans Pattern Anal Mach Intell , vol.24 , Issue.3 , pp. 301-312
    • Mitra, P.1    Murthy, C.2    Pal, S.3
  • 52
    • 0034392123
    • On the existence and computation of rank-revealing LU factorizations
    • Pan C (2000) On the existence and computation of rank-revealing LU factorizations. Linear Algebra Appl 316(1):199–222
    • (2000) Linear Algebra Appl , vol.316 , Issue.1 , pp. 199-222
    • Pan, C.1
  • 53
    • 0347380229
    • The CMU pose, illumination, and expression database
    • Sim T, Baker S, Bsat M (2003) The CMU pose, illumination, and expression database. IEEE Trans Pattern Anal Mach Intell 25(12):1615–1618
    • (2003) IEEE Trans Pattern Anal Mach Intell , vol.25 , Issue.12 , pp. 1615-1618
    • Sim, T.1    Baker, S.2    Bsat, M.3
  • 55
    • 54749092170
    • 80 Million tiny images: a large data set for nonparametric object and scene recognition
    • Torralba A, Fergus R, Freeman W (2008) 80 Million tiny images: a large data set for nonparametric object and scene recognition. IEEE Trans Pattern Anal Mach Intell 30(11):1958–1970
    • (2008) IEEE Trans Pattern Anal Mach Intell , vol.30 , Issue.11 , pp. 1958-1970
    • Torralba, A.1    Fergus, R.2    Freeman, W.3
  • 57
    • 27844550205
    • Feature selection for unsupervised and supervised inference: the emergence of sparsity in a weight-based approach
    • Wolf L, Shashua A (2005) Feature selection for unsupervised and supervised inference: the emergence of sparsity in a weight-based approach. J Mach Learn Res 6:1855–1887
    • (2005) J Mach Learn Res , vol.6 , pp. 1855-1887
    • Wolf, L.1    Shashua, A.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.