Volume P-214, 2013, Pages 37-56

MR-DSJ: Distance-based self-join for large-scale vector data analysis with MapReduce

Author keywords

[No Author keywords available]

Indexed keywords

ALGORITHMS; DATA HANDLING; DATA MINING; FILTRATION; INFORMATION ANALYSIS; ITERATIVE METHODS;

EID: 84901806243     PISSN: 16175468     EISSN: None     Source Type: Conference Proceeding    
DOI: None     Document Type: Conference Paper
Times cited : (15)

References (32)
  • 2
    • 79957809015
    • HadoopDB: An architectural hybrid of MapReduce and DBMS technologies for analytical workloads
    • [ABPA+09]
    • [ABPA+09] Azza Abouzeid, Kamil Bajda-Pawlikowski, Daniel J. Abadi, Alexander Rasin, and Avi Silberschatz. HadoopDB: An Architectural Hybrid of MapReduce and DBMS Technologies for Analytical Workloads. PVLDB, 2(1):922-933, 2009.
    • (2009) PVLDB , vol.2 , Issue.1 , pp. 922-933
    • Abouzeid, A.1    Bajda-Pawlikowski, K.2    Abadi, D.J.3    Rasin, A.4    Silberschatz, A.5
  • 4
    • 77952265514
    • Optimizing joins in a Map-Reduce environment
    • [AU10]
    • [AU10] Foto N. Afrati and Jeffrey D. Ullman. Optimizing Joins in a Map-Reduce Environment. In EDBT, pages 99-110, 2010.
    • (2010) EDBT , pp. 99-110
    • Afrati, F.N.1    Ullman, J.D.2
  • 5
    • 84882366774
    • High performance clustering based on the similarity join
    • [BBBK00]
    • [BBBK00] Christian Böhm, Bernhard Braunmüller, Markus M. Breunig, and Hans-Peter Kriegel. High Performance Clustering Based on the Similarity Join. In CIKM, pages 298-305, 2000.
    • (2000) CIKM , pp. 298-305
    • Böhm, C.1    Braunmüller, B.2    Breunig, M.M.3    Kriegel, H.-P.4
  • 6
    • 0034831593
    • Epsilon grid order: An algorithm for the similarity join on massive high-dimensional data
    • [BBKK01]
    • [BBKK01] C. Böhm, B. Braunmüller, F. Krebs, and H.P. Kriegel. Epsilon grid order: An algorithm for the similarity join on massive high-dimensional data. In SIGMOD, pages 379-388, 2001.
    • (2001) SIGMOD , pp. 379-388
    • Böhm, C.1    Braunmüller, B.2    Krebs, F.3    Kriegel, H.P.4
  • 7
    • 0027621672
    • Efficient processing of spatial joins using R-Trees
    • [BKS93]
    • [BKS93] Thomas Brinkhoff, Hans-Peter Kriegel, and Bernhard Seeger. Efficient Processing of Spatial Joins Using R-Trees. In SIGMOD, pages 237-246, 1993.
    • (1993) SIGMOD , pp. 237-246
    • Brinkhoff, T.1    Kriegel, H.-P.2    Seeger, B.3
  • 8
    • 79951767112
    • Document similarity self-join with MapReduce
    • [BML10]
    • [BML10] Ranieri Baraglia, Gianmarco De Francisci Morales, and Claudio Lucchese. Document Similarity Self-Join with MapReduce. In ICDM, pages 731-736, 2010.
    • (2010) ICDM , pp. 731-736
    • Baraglia, R.1    De Francisci Morales, G.2    Lucchese, C.3
  • 9
    • 77954700016
    • A comparison of join algorithms for log processing in MapReduce
    • [BPE+10]
    • [BPE+10] Spyros Blanas, Jignesh M. Patel, Vuk Ercegovac, Jun Rao, Eugene J. Shekita, and Yuanyuan Tian. A comparison of join algorithms for log processing in MapReduce. In SIGMOD, pages 975-986, 2010.
    • (2010) SIGMOD , pp. 975-986
    • Blanas, S.1    Patel, J.M.2    Ercegovac, V.3    Rao, J.4    Shekita, E.J.5    Tian, Y.6
  • 10
    • 33749597967
    • A primitive operator for similarity joins in data cleaning
    • [CGK06]
    • [CGK06] Surajit Chaudhuri, Venkatesh Ganti, and Raghav Kaushik. A Primitive Operator for Similarity Joins in Data Cleaning. In ICDE, page 5, 2006.
    • (2006) ICDE , pp. 5
    • Chaudhuri, S.1    Ganti, V.2    Kaushik, R.3
  • 11
    • 84922696786
    • The relational database dictionary - A comprehensive glossary of relational terms and concepts, with illustrative examples
    • [Dat06]
    • [Dat06] Chris J. Date. The relational database dictionary - A comprehensive glossary of relational terms and concepts, with illustrative examples. O'Reilly, 2006.
    • (2006) O'Reilly
    • Date, C.J.1
  • 12
    • 85030321143
    • MapReduce: Simplified data processing on large clusters
    • [DG04]
    • [DG04] Jeffrey Dean and Sanjay Ghemawat. MapReduce: Simplified Data Processing on Large Clusters. In OSDI, pages 137-150, 2004.
    • (2004) OSDI , pp. 137-150
    • Dean, J.1    Ghemawat, S.2
  • 13
    • 0001906476
    • Practical skew handling in parallel joins
    • [DNSS92]
    • [DNSS92] David J. DeWitt, Jeffrey F. Naughton, Donovan A. Schneider, and S. Seshadri. Practical Skew Handling in Parallel Joins. In VLDB, pages 27-40, 1992.
    • (1992) VLDB , pp. 27-40
    • DeWitt, D.J.1    Naughton, J.F.2    Schneider, D.A.3    Seshadri, S.4
  • 14
    • 80053521271
    • Hadoop++: Making a yellow elephant run like a cheetah
    • [DQRJ+10]
    • [DQRJ+10] Jens Dittrich, Jorge-Arnulfo Quiané-Ruiz, Alekh Jindal, Yagiz Kargin, Vinay Setty, and Jörg Schad. Hadoop++: Making a Yellow Elephant Run Like a Cheetah. PVLDB, 3:518-529, 2010.
    • (2010) PVLDB , vol.3 , pp. 518-529
    • Dittrich, J.1    Quiané-Ruiz, J.-A.2    Jindal, A.3    Kargin, Y.4    Setty, V.5    Schad, J.6
  • 15
    • 0033872455
    • Data redundancy and duplicate detection in spatial join processing
    • [DS00]
    • [DS00] Jens-Peter Dittrich and Bernhard Seeger. Data Redundancy and Duplicate Detection in Spatial Join Processing. In ICDE, pages 535-546, 2000.
    • (2000) ICDE , pp. 535-546
    • Dittrich, J.-P.1    Seeger, B.2
  • 16
    • 45749146270
    • A density-based algorithm for discovering clusters in large spatial databases with noise
    • [EKSX96]
    • [EKSX96] Martin Ester, Hans-Peter Kriegel, Jörg Sander, and Xiaowei Xu. A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise. In SIGKDD, pages 226-231, 1996.
    • (1996) SIGKDD , pp. 226-231
    • Ester, M.1    Kriegel, H.-P.2    Sander, J.3    Xu, X.4
  • 17
    • 0033366945
    • Extended edited synoptic cloud reports from ships and land stations over the globe, 1952-1996
    • [HW99] Carbon Dioxide Information Analysis Center
    • [HW99] C.J. Hahn and S.G. Warren. Extended edited synoptic cloud reports from ships and land stations over the globe, 1952-1996. NDP026C, Carbon Dioxide Information Analysis Center, 1999.
    • (1999) NDP026C
    • Hahn, C.J.1    Warren, S.G.2
  • 18
    • 84873205384
    • Efficient processing of k nearest neighbor joins using mapreduce
    • [LSCO12]
    • [LSCO12] Wei Lu, Yanyan Shen, Su Chen, and Beng Chin Ooi. Efficient Processing of k Nearest Neighbor Joins using MapReduce. PVLDB, 5(10):1016-1027, 2012.
    • (2012) PVLDB , vol.5 , Issue.10 , pp. 1016-1027
    • Lu, W.1    Shen, Y.2    Chen, S.3    Ooi, B.C.4
  • 19
    • 84863758126
    • V-smart-join: A scalable MapReduce framework for all-pair similarity joins of multisets and vectors
    • [MF12]
    • [MF12] Ahmed Metwally and Christos Faloutsos. V-SMART-Join: A Scalable MapReduce Framework for All-Pair Similarity Joins of Multisets and Vectors. PVLDB, 5(8):704-715, 2012.
    • (2012) PVLDB , vol.5 , Issue.8 , pp. 704-715
    • Metwally, A.1    Faloutsos, C.2
  • 20
    • 0002089617
    • Matching algorithms within a duplicate detection system
    • [Mon00]
    • [Mon00] Alvaro E. Monge. Matching Algorithms within a Duplicate Detection System. IEEE Data Eng. Bull., 23(4):14-20, 2000.
    • (2000) IEEE Data Eng. Bull , vol.23 , Issue.4 , pp. 14-20
    • Monge, A.E.1
  • 21
    • 79960020260
    • Processing theta-joins using MapReduce
    • [OR11]
    • [OR11] Alper Okcan and Mirek Riedewald. Processing theta-joins using MapReduce. In SIGMOD, pages 949-960, 2011.
    • (2011) SIGMOD , pp. 949-960
    • Okcan, A.1    Riedewald, M.2
  • 22
    • 0030157411
    • Partition based spatial-merge join
    • [PD96]
    • [PD96] Jignesh M. Patel and David J. DeWitt. Partition Based Spatial-Merge Join. In SIGMOD Conference, pages 259-270, 1996.
    • (1996) SIGMOD Conference , pp. 259-270
    • Patel, J.M.1    DeWitt, D.J.2
  • 24
    • 84976703615
    • Nearest neighbor queries
    • [RKV95]
    • [RKV95] Nick Roussopoulos, Stephen Kelley, and Frédéric Vincent. Nearest Neighbor Queries. In SIGMOD, pages 71-79, 1995.
    • (1995) SIGMOD , pp. 71-79
    • Roussopoulos, N.1    Kelley, S.2    Vincent, F.3
  • 25
    • 0039845384
    • Efficient algorithms for mining outliers from large data sets
    • [RRS00]
    • [RRS00] Sridhar Ramaswamy, Rajeev Rastogi, and Kyuseok Shim. Efficient Algorithms for Mining Outliers from Large Data Sets. In SIGMOD, pages 427-438, 2000.
    • (2000) SIGMOD , pp. 427-438
    • Ramaswamy, S.1    Rastogi, R.2    Shim, K.3
  • 28
    • 77954744650
    • Efficient parallel set-similarity joins using MapReduce
    • [VCL10]
    • [VCL10] R. Vernica, M.J. Carey, and C. Li. Efficient parallel set-similarity joins using MapReduce. In SIGMOD, pages 495-506, 2010.
    • (2010) SIGMOD , pp. 495-506
    • Vernica, R.1    Carey, M.J.2    Li, C.3
  • 29
    • 57349141410
    • Efficient similarity joins for near duplicate detection
    • [XWLY08]
    • [XWLY08] Chuan Xiao, Wei Wang, Xuemin Lin, and Jeffrey Xu Yu. Efficient similarity joins for near duplicate detection. In WWW, pages 131-140, 2008.
    • (2008) WWW , pp. 131-140
    • Xiao, C.1    Wang, W.2    Lin, X.3    Yu, J.X.4
  • 30
    • 72049084511
    • SJMR: Parallelizing spatial join with MapReduce on clusters
    • [ZHL+09]
    • [ZHL+09] Shubin Zhang, Jizhong Han, Zhiyong Liu, Kai Wang, and Zhiyong Xu. SJMR: Parallelizing spatial join with MapReduce on clusters. In CLUSTER, pages 1-8, 2009.
    • (2009) CLUSTER , pp. 1-8
    • Zhang, S.1    Han, J.2    Liu, Z.3    Wang, K.4    Xu, Z.5
  • 31
    • 7444237650
    • AGRID: An efficient algorithm for clustering large high-dimensional datasets
    • [ZJ03]
    • [ZJ03] Yanchang Zhao and Song Junde. AGRID: An Efficient Algorithm for Clustering Large High-Dimensional Datasets. In PAKDD, pages 271-282, 2003.
    • (2003) PAKDD , pp. 271-282
    • Zhao, Y.1    Junde, S.2
  • 32
    • 84863510705
    • Efficient parallel kNN joins for large data in MapReduce
    • [ZLJ12]
    • [ZLJ12] Chi Zhang, Feifei Li, and Jeffrey Jestes. Efficient parallel kNN joins for large data in MapReduce. In EDBT, pages 38-49, 2012.
    • (2012) EDBT , pp. 38-49
    • Zhang, C.1    Li, F.2    Jestes, J.3


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.