-
1
-
-
68249129760
-
Above the clouds: A Berkeley view of cloud computing
-
EECS Department. University of California, Berkeley
-
Armbrust M, Fox A, Griffith R, Joseph AD, Katz RH, Konwinski A, Lee G, Patterson DA, Rabkin A, Stoica I, Zaharia M (2009) Above the clouds: A Berkeley view of cloud computing. Tech rep, EECS Department, University of California, Berkeley
-
(2009)
Tech Rep
-
-
Armbrust, M.1
Fox, A.2
Griffith, R.3
Joseph, A.D.4
Katz, R.H.5
Konwinski, A.6
Lee, G.7
Patterson, D.A.8
Rabkin, A.9
Stoica, I.10
Zaharia, M.11
-
3
-
-
5444258997
-
A comparison of fast blocking methods for record linkage
-
Baxter R, Christen P, Churches T (2003) A comparison of fast blocking methods for record linkage. In: ACM SIGKDD, vol 3, pp 25-27
-
(2003)
ACM SIGKDD
, vol.3
, pp. 25-27
-
-
Baxter, R.1
Christen, P.2
Churches, T.3
-
4
-
-
77952372966
-
Adaptive duplicate detection using learnable string similarity measures
-
Bilenko M, Mooney RJ (2003) Adaptive duplicate detection using learnable string similarity measures. In: KDD, pp 39-48
-
(2003)
KDD
, pp. 39-48
-
-
Bilenko, M.1
Mooney, R.J.2
-
6
-
-
65449178105
-
Febrl -: An open source data cleaning, deduplication and record linkage system with a graphical user interface
-
Christen P (2008) Febrl -: an open source data cleaning, deduplication and record linkage system with a graphical user interface. In: KDD, pp 1065-1068
-
(2008)
KDD
, pp. 1065-1068
-
-
Christen, P.1
-
8
-
-
85030321143
-
MapReduce: Simplified data processing on large clusters
-
Dean J, Ghemawat S (2004) MapReduce: Simplified data processing on large clusters. In: OSDI, pp 137-150
-
(2004)
OSDI
, pp. 137-150
-
-
Dean, J.1
Ghemawat, S.2
-
9
-
-
37549003336
-
MapReduce: Simplified data processing on large clusters
-
10.1145/1327452.1327492
-
Dean J, Ghemawat S (2008) MapReduce: Simplified data processing on large clusters. Commun ACM 51(1):107-113. doi:10.1145/1327452.1327492
-
(2008)
Commun ACM
, vol.51
, Issue.1
, pp. 107-113
-
-
Dean, J.1
Ghemawat, S.2
-
10
-
-
0026870271
-
Parallel database systems: The future of high performance database systems
-
10.1145/129888.129894
-
DeWitt D, Gray J (1992) Parallel database systems: The future of high performance database systems. Commun ACM 35(6):85-98. doi:10.1145/129888.129894
-
(1992)
Communications of the ACM
, vol.35
, Issue.6
, pp. 85-98
-
-
DeWitt, D.1
Gray, J.2
-
13
-
-
84876673365
-
-
Apache Software Foundation
-
Apache Software Foundation (2006) Hadoop. http://hadoop.apache.org/mapreduce/
-
(2006)
Hadoop
-
-
-
14
-
-
84976856849
-
The merge/purge problem for large databases
-
Hernández MA, Stolfo SJ (1995) The merge/purge problem for large databases. In: SIGMOD Conference, pp 127-138
-
(1995)
SIGMOD Conference
, pp. 127-138
-
-
Hernández, M.A.1
Stolfo, S.J.2
-
15
-
-
63449096255
-
Parallel linkage
-
Kim HS, Lee D (2007) Parallel linkage. In: CIKM, pp 283-292
-
(2007)
CIKM
, pp. 283-292
-
-
Kim, H.S.1
Lee, D.2
-
16
-
-
84876687819
-
Data partitioning for parallel entity matching
-
Kirsten T, Kolb L, Hartung M, Gross A, Köpcke H, Rahm E (2010) Data partitioning for parallel entity matching. In: 8th International Workshop on Quality in Databases
-
(2010)
8th International Workshop on Quality in Databases
-
-
Kirsten, T.1
Kolb, L.2
Hartung, M.3
Gross, A.4
Köpcke, H.5
Rahm, E.6
-
17
-
-
85059130497
-
Parallel sorted neighborhood blocking with MapReduce
-
Kolb L, Thor A, Rahm E (2011) Parallel sorted neighborhood blocking with MapReduce. In: BTW, pp 45-64
-
(2011)
BTW
, pp. 45-64
-
-
Kolb, L.1
Thor, A.2
Rahm, E.3
-
18
-
-
72649095071
-
Frameworks for entity matching: A comparison
-
10.1016/j.datak.2009.10.003
-
Köpcke H, Rahm E (2010) Frameworks for entity matching: A comparison. Data Knowl Eng 69(2):197-210. doi:10.1016/j.datak.2009.10.003
-
(2010)
Data Knowl Eng
, vol.69
, Issue.2
, pp. 197-210
-
-
Köpcke, H.1
Rahm, E.2
-
19
-
-
80455148340
-
Evaluation of entity resolution approaches on real-world match problems
-
Köpcke H, Thor A, Rahm E (2010) Evaluation of entity resolution approaches on real-world match problems. In: VLDB, pp 484-493
-
(2010)
VLDB
, pp. 484-493
-
-
Köpcke, H.1
Thor, A.2
Rahm, E.3
-
20
-
-
77954338155
-
Learning-based approaches for matching web data entities
-
10.1109/MIC.2010.58
-
Köpcke H, Thor A, Rahm E (2010) Learning-based approaches for matching web data entities. IEEE Internet Comput 14:23-31. doi:10.1109/MIC.2010.58
-
(2010)
IEEE Internet Comput
, vol.14
, pp. 23-31
-
-
Köpcke, H.1
Thor, A.2
Rahm, E.3
-
21
-
-
84964816728
-
Data-intensive text processing with MapReduce
-
10.2200/S00274ED1V01Y201006HLT007
-
Lin J, Dyer C (2010) Data-intensive text processing with MapReduce. Synth Lect Hum Lang Technol 3(1):1-177. doi:10.2200/S00274ED1V01Y201006HLT007
-
(2010)
Synth Lect Hum Lang Technol
, vol.3
, Issue.1
, pp. 1-177
-
-
Lin, J.1
Dyer, C.2
-
22
-
-
0002490026
-
Data cleaning: Problems and current approaches
-
Rahm E, Do HH (2000) Data cleaning: Problems and current approaches. IEEE Data Eng Bull 23(4):3-13
-
(2000)
IEEE Data Eng Bull
, vol.23
, Issue.4
, pp. 3-13
-
-
Rahm, E.1
Do, H.H.2
-
23
-
-
77954744650
-
Efficient parallel set-similarity joins using MapReduce
-
10.1145/1807167.1807222
-
Vernica R, Carey MJ, Li C (2010) Efficient parallel set-similarity joins using MapReduce. In: SIGMOD Conference, pp 495-506
-
(2010)
SIGMOD Conference
, pp. 495-506
-
-
Vernica, R.1
Carey, M.J.2
Li, C.3