SCOPUS Information Retrieval Platform

Lecture Notes in Informatics (LNI), Proceedings - Series of the Gesellschaft für Informatik (GI)

Volume P-214, 2013, Pages 37-56

MR-DSJ: Distance-based self-join for large-scale vector data analysis with MapReduce

(3) Seidl, Thomasᵃ Fries, Sergejᵃ Boden, Brigitteᵃ

ᵃ RWTH Aachen University (Germany)

Author keywords

[No Author keywords available]

Indexed keywords

ALGORITHMS; DATA HANDLING; DATA MINING; FILTRATION; INFORMATION ANALYSIS; ITERATIVE METHODS;

COMPUTER CLUSTERS; DATA MINING TASKS; DISTRIBUTED PROGRAMMING MODEL; EXPERIMENTAL EVALUATION; FILTER TECHNIQUES; GRID PARTITIONING; LARGE-SCALE VECTORS; NEAR-DUPLICATE DETECTION;

CLUSTERING ALGORITHMS;

EID: 84901806243 PISSN: 16175468 EISSN: None Source Type: Conference Proceeding
DOI: None Document Type: Conference Paper

Times cited : (15)

References (32)

1
- 84862013435
- Massively parallel data analysis with PACTs on nephele
- [ABE+10]
- [ABE+10] Alexander Alexandrov, Dominic Battré, Stephan Ewen, Max Heimel, Fabian Hueske, Odej Kao, Volker Markl, Erik Nijkamp, and Daniel Warneke. Massively Parallel Data Analysis with PACTs on Nephele. PVLDB, 3:1625-1628, 2010.
- (2010) PVLDB , vol.3 , pp. 1625-1628
- Alexandrov, A.¹ Battré, D.² Ewen, S.³ Heimel, M.⁴ Hueske, F.⁵ Kao, O.⁶ Markl, V.⁷ Nijkamp, E.⁸ Warneke, D.⁹

2
- 79957809015
- HadoopDB: An architectural hybrid of MapReduce and DBMS technologies for analytical workloads
- [ABPA+09]
- [ABPA+09] Azza Abouzeid, Kamil Bajda-Pawlikowski, Daniel J. Abadi, Alexander Rasin, and Avi Silberschatz. HadoopDB: An Architectural Hybrid of MapReduce and DBMS Technologies for Analytical Workloads. PVLDB, 2(1):922-933, 2009.
- (2009) PVLDB , vol.2 , Issue.1 , pp. 922-933
- Abouzeid, A.¹ Bajda-Pawlikowski, K.² Abadi, D.J.³ Rasin, A.⁴ Silberschatz, A.⁵

3
- 84864262965
- Fuzzy joins using MapReduce
- [ASM+12]
- [ASM+12] F.N. Afrati, A.D. Sarma, D. Menestrina, A. Parameswaran, and J.D. Ullman. Fuzzy Joins Using MapReduce. ICDE, 2012.
- (2012) ICDE
- Afrati, F.N.¹ Sarma, A.D.² Menestrina, D.³ Parameswaran, A.⁴ Ullman, J.D.⁵

4
- 77952265514
- Optimizing joins in a Map-Reduce environment
- [AU10]
- [AU10] Foto N. Afrati and Jeffrey D. Ullman. Optimizing Joins in a Map-Reduce Environment. In EDBT, pages 99-110, 2010.
- (2010) EDBT , pp. 99-110
- Afrati, F.N.¹ Ullman, J.D.²

5
- 84882366774
- High performance clustering based on the similarity join
- [BBBK00]
- [BBBK00] Christian Böhm, Bernhard Braunmüller, Markus M. Breunig, and Hans-Peter Kriegel. High Performance Clustering Based on the Similarity Join. In CIKM, pages 298-305, 2000.
- (2000) CIKM , pp. 298-305
- Böhm, C.¹ Braunmüller, B.² Breunig, M.M.³ Kriegel, H.-P.⁴

6
- 0034831593
- Epsilon grid order: An algorithm for the similarity join on massive high-dimensional data
- [BBKK01]
- [BBKK01] C. Böhm, B. Braunmüller, F. Krebs, and H.P. Kriegel. Epsilon grid order: An algorithm for the similarity join on massive high-dimensional data. In SIGMOD, pages 379-388, 2001.
- (2001) SIGMOD , pp. 379-388
- Böhm, C.¹ Braunmüller, B.² Krebs, F.³ Kriegel, H.P.⁴

7
- 0027621672
- Efficient processing of spatial joins using R-Trees
- [BKS93]
- [BKS93] Thomas Brinkhoff, Hans-Peter Kriegel, and Bernhard Seeger. Efficient Processing of Spatial Joins Using R-Trees. In SIGMOD, pages 237-246, 1993.
- (1993) SIGMOD , pp. 237-246
- Brinkhoff, T.¹ Kriegel, H.-P.² Seeger, B.³

8
- 79951767112
- Document similarity self-join with MapReduce
- [BML10]
- [BML10] Ranieri Baraglia, Gianmarco De Francisci Morales, and Claudio Lucchese. Document Similarity Self-Join with MapReduce. In ICDM, pages 731-736, 2010.
- (2010) ICDM , pp. 731-736
- Baraglia, R.¹ De Francisci Morales, G.² Lucchese, C.³

9
- 77954700016
- A comparison of join algorithms for log processing in MapReduce
- [BPE+10]
- [BPE+10] Spyros Blanas, Jignesh M. Patel, Vuk Ercegovac, Jun Rao, Eugene J. Shekita, and Yuanyuan Tian. A comparison of join algorithms for log processing in MapReduce. In SIGMOD, pages 975-986, 2010.
- (2010) SIGMOD , pp. 975-986
- Blanas, S.¹ Patel, J.M.² Ercegovac, V.³ Rao, J.⁴ Shekita, E.J.⁵ Tian, Y.⁶

10
- 33749597967
- A primitive operator for similarity joins in data cleaning
- [CGK06]
- [CGK06] Surajit Chaudhuri, Venkatesh Ganti, and Raghav Kaushik. A Primitive Operator for Similarity Joins in Data Cleaning. In ICDE, page 5, 2006.
- (2006) ICDE , pp. 5
- Chaudhuri, S.¹ Ganti, V.² Kaushik, R.³

11
- 84922696786
- The relational database dictionary - A comprehensive glossary of relational terms and concepts, with illustrative examples
- [Dat06]
- [Dat06] Chris J. Date. The relational database dictionary - A comprehensive glossary of relational terms and concepts, with illustrative examples. O'Reilly, 2006.
- (2006) O'Reilly
- Date, C.J.¹

12
- 85030321143
- MapReduce: Simplified data processing on large clusters
- [DG04]
- [DG04] Jeffrey Dean and Sanjay Ghemawat. MapReduce: Simplified Data Processing on Large Clusters. In OSDI, pages 137-150, 2004.
- (2004) OSDI , pp. 137-150
- Dean, J.¹ Ghemawat, S.²

13
- 0001906476
- Practical skew handling in parallel joins
- [DNSS92]
- [DNSS92] David J. DeWitt, Jeffrey F. Naughton, Donovan A. Schneider, and S. Seshadri. Practical Skew Handling in Parallel Joins. In VLDB, pages 27-40, 1992.
- (1992) VLDB , pp. 27-40
- DeWitt, D.J.¹ Naughton, J.F.² Schneider, D.A.³ Seshadri, S.⁴

14
- 80053521271
- Hadoop++: Making a yellow elephant run like a cheetah
- [DQRJ+10]
- [DQRJ+10] Jens Dittrich, Jorge-Arnulfo Quiané-Ruiz, Alekh Jindal, Yagiz Kargin, Vinay Setty, and Jörg Schad. Hadoop++: Making a Yellow Elephant Run Like a Cheetah. PVLDB, 3:518-529, 2010.
- (2010) PVLDB , vol.3 , pp. 518-529
- Dittrich, J.¹ Quiané-Ruiz, J.-A.² Jindal, A.³ Kargin, Y.⁴ Setty, V.⁵ Schad, J.⁶

15
- 0033872455
- Data redundancy and duplicate detection in spatial join processing
- [DS00]
- [DS00] Jens-Peter Dittrich and Bernhard Seeger. Data Redundancy and Duplicate Detection in Spatial Join Processing. In ICDE, pages 535-546, 2000.
- (2000) ICDE , pp. 535-546
- Dittrich, J.-P.¹ Seeger, B.²

16
- 45749146270
- A density-based algorithm for discovering clusters in large spatial databases with noise
- [EKSX96]
- [EKSX96] Martin Ester, Hans-Peter Kriegel, Jörg Sander, and Xiaowei Xu. A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise. In SIGKDD, pages 226-231, 1996.
- (1996) SIGKDD , pp. 226-231
- Ester, M.¹ Kriegel, H.-P.² Sander, J.³ Xu, X.⁴

17
- 0033366945
- Extended edited synoptic cloud reports from ships and land stations over the globe, 1952-1996
- [HW99] Carbon Dioxide Information Analysis Center
- [HW99] C.J. Hahn and S.G. Warren. Extended edited synoptic cloud reports from ships and land stations over the globe, 1952-1996. NDP026C, Carbon Dioxide Information Analysis Center, 1999.
- (1999) NDP026C
- Hahn, C.J.¹ Warren, S.G.²

18
- 84873205384
- Efficient processing of k nearest neighbor joins using mapreduce
- [LSCO12]
- [LSCO12] Wei Lu, Yanyan Shen, Su Chen, and Beng Chin Ooi. Efficient Processing of k Nearest Neighbor Joins using MapReduce. PVLDB, 5(10):1016-1027, 2012.
- (2012) PVLDB , vol.5 , Issue.10 , pp. 1016-1027
- Lu, W.¹ Shen, Y.² Chen, S.³ Ooi, B.C.⁴

19
- 84863758126
- V-smart-join: A scalable MapReduce framework for all-pair similarity joins of multisets and vectors
- [MF12]
- [MF12] Ahmed Metwally and Christos Faloutsos. V-SMART-Join: A Scalable MapReduce Framework for All-Pair Similarity Joins of Multisets and Vectors. PVLDB, 5(8):704-715, 2012.
- (2012) PVLDB , vol.5 , Issue.8 , pp. 704-715
- Metwally, A.¹ Faloutsos, C.²

20
- 0002089617
- Matching algorithms within a duplicate detection system
- [Mon00]
- [Mon00] Alvaro E. Monge. Matching Algorithms within a Duplicate Detection System. IEEE Data Eng. Bull., 23(4):14-20, 2000.
- (2000) IEEE Data Eng. Bull , vol.23 , Issue.4 , pp. 14-20
- Monge, A.E.¹

21
- 79960020260
- Processing theta-joins using MapReduce
- [OR11]
- [OR11] Alper Okcan and Mirek Riedewald. Processing theta-joins using MapReduce. In SIGMOD, pages 949-960, 2011.
- (2011) SIGMOD , pp. 949-960
- Okcan, A.¹ Riedewald, M.²

22
- 0030157411
- Partition based spatial-merge join
- [PD96]
- [PD96] Jignesh M. Patel and David J. DeWitt. Partition Based Spatial-Merge Join. In SIGMOD Conference, pages 259-270, 1996.
- (1996) SIGMOD Conference , pp. 259-270
- Patel, J.M.¹ DeWitt, D.J.²

23
- 70350512695
- A comparison of approaches to large-scale data analysis
- [PPR+09]
- [PPR+09] Andrew Pavlo, Erik Paulson, Alexander Rasin, Daniel J. Abadi, David J. DeWitt, Samuel Madden, and Michael Stonebraker. A comparison of approaches to large-scale data analysis. In SIGMOD, pages 165-178, 2009.
- (2009) SIGMOD , pp. 165-178
- Pavlo, A.¹ Paulson, E.² Rasin, A.³ Abadi, D.J.⁴ DeWitt, D.J.⁵ Madden, S.⁶ Stonebraker, M.⁷

24
- 84976703615
- Nearest neighbor queries
- [RKV95]
- [RKV95] Nick Roussopoulos, Stephen Kelley, and Frédéric Vincent. Nearest Neighbor Queries. In SIGMOD, pages 71-79, 1995.
- (1995) SIGMOD , pp. 71-79
- Roussopoulos, N.¹ Kelley, S.² Vincent, F.³

25
- 0039845384
- Efficient algorithms for mining outliers from large data sets
- [RRS00]
- [RRS00] Sridhar Ramaswamy, Rajeev Rastogi, and Kyuseok Shim. Efficient Algorithms for Mining Outliers from Large Data Sets. In SIGMOD, pages 427-438, 2000.
- (2000) SIGMOD , pp. 427-438
- Ramaswamy, S.¹ Rastogi, R.² Shim, K.³

26
- 84924183918
- Mining of massive datasets
- [RU11] Cambridge Univ Pr
- [RU11] A. Rajaraman and J.D. Ullman. Mining of massive datasets. Cambridge Univ Pr, 2011.
- (2011) Mining of Massive Datasets
- Rajaraman, A.¹ Ullman, J.D.²

27
- 84900819032
- MapReduce-based similarity join for metric spaces
- [SRT12] New York, NY, USA, ACM
- [SRT12] Yasin N. Silva, Jason M. Reed, and Lisa M. Tsosie. MapReduce-based similarity join for metric spaces. In Proceedings of the 1st International Workshop on Cloud Intelligence, Cloud-I '12, pages 3:1-3:8, New York, NY, USA, 2012. ACM.
- (2012) Proceedings of the 1st International Workshop on Cloud Intelligence, Cloud-I '12 , pp. 3:1-3:8
- Silva, Y.N.¹ Reed, J.M.² Tsosie, L.M.³

28
- 77954744650
- Efficient parallel set-similarity joins using MapReduce
- [VCL10]
- [VCL10] R. Vernica, M.J. Carey, and C. Li. Efficient parallel set-similarity joins using MapReduce. In SIGMOD, pages 495-506, 2010.
- (2010) SIGMOD , pp. 495-506
- Vernica, R.¹ Carey, M.J.² Li, C.³

29
- 57349141410
- Efficient similarity joins for near duplicate detection
- [XWLY08]
- [XWLY08] Chuan Xiao, Wei Wang, Xuemin Lin, and Jeffrey Xu Yu. Efficient similarity joins for near duplicate detection. In WWW, pages 131-140, 2008.
- (2008) WWW , pp. 131-140
- Xiao, C.¹ Wang, W.² Lin, X.³ Yu, J.X.⁴

30
- 72049084511
- SJMR: Parallelizing spatial join with MapReduce on clusters
- [ZHL+09]
- [ZHL+09] Shubin Zhang, Jizhong Han, Zhiyong Liu, Kai Wang, and Zhiyong Xu. SJMR: Parallelizing spatial join with MapReduce on clusters. In CLUSTER, pages 1-8, 2009.
- (2009) CLUSTER , pp. 1-8
- Zhang, S.¹ Han, J.² Liu, Z.³ Wang, K.⁴ Xu, Z.⁵

31
- 7444237650
- AGRID: An efficient algorithm for clustering large high-dimensional datasets
- [ZJ03]
- [ZJ03] Yanchang Zhao and Song Junde. AGRID: An Efficient Algorithm for Clustering Large High-Dimensional Datasets. In PAKDD, pages 271-282, 2003.
- (2003) PAKDD , pp. 271-282
- Zhao, Y.¹ Junde, S.²

32
- 84863510705
- Efficient parallel kNN joins for large data in MapReduce
- [ZLJ12]
- [ZLJ12] Chi Zhang, Feifei Li, and Jeffrey Jestes. Efficient parallel kNN joins for large data in MapReduce. In EDBT, pages 38-49, 2012.
- (2012) EDBT , pp. 38-49
- Zhang, C.¹ Li, F.² Jestes, J.³

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.