-
1
-
-
84865633903
-
-
David Sifry's Blog. http://www.sifry.com/alerts/.
-
-
-
-
2
-
-
84865633905
-
-
Google Blog Search. http://blogsearch.google.com/blogsearch.
-
-
-
-
3
-
-
84865646867
-
-
Google News. http://news.google.com.
-
-
-
-
4
-
-
84865660488
-
-
Google Book Search. http://books.google.com/.
-
-
-
-
5
-
-
84865660484
-
-
Yahoo News. http://news.yahoo.com.
-
-
-
-
9
-
-
85104914015
-
Efficient exact set-similarity joins
-
A. Arasu, V. Ganti, and R. Kaushik. Efficient exact set-similarity joins. In VLDB, 2006.
-
(2006)
VLDB
-
-
Arasu, A.1
Ganti, V.2
Kaushik, R.3
-
10
-
-
35348849154
-
Scaling up all Pairs similarity search
-
R.J. Bayardo, Y. Ma, and R. Srikant. Scaling Up All Pairs Similarity Search. In WWW, 2007.
-
(2007)
WWW
-
-
Bayardo, R.J.1
Ma, Y.2
Srikant, R.3
-
11
-
-
0037870443
-
The X-tree: An index structure for high-dimensional data
-
S. Berchtold, D.A. Keim, and H. Kriegel. The X-tree: An Index Structure for High-Dimensional Data. In VLDB, 1996.
-
(1996)
VLDB
-
-
Berchtold, S.1
Keim, D.A.2
Kriegel, H.3
-
15
-
-
84976810280
-
Copy detection mechanisms for digital documents
-
S. Brin, J. Davis, and H. Garcia-Molina. Copy detection mechanisms for digital documents. In SIGMOD, 1995.
-
(1995)
SIGMOD
-
-
Brin, S.1
Davis, J.2
Garcia-Molina, H.3
-
16
-
-
0032664793
-
The hybrid tree: An index structure for high dimensional feature spaces
-
K. Chakrabarti and S. Mehrotra. The Hybrid Tree: An Index Structure for High Dimensional Feature Spaces. In ICDE, 1999.
-
(1999)
ICDE
-
-
Chakrabarti, K.1
Mehrotra, S.2
-
17
-
-
33749597967
-
A primitive operator for similarity joins in data cleaning
-
S. Chaudhuri, V. Ganti, and R. Kaushik. A Primitive Operator for Similarity Joins in Data Cleaning. In ICDE, 2006.
-
(2006)
ICDE
-
-
Chaudhuri, S.1
Ganti, V.2
Kaushik, R.3
-
18
-
-
3042606353
-
Shared information and program plagiarism detection
-
X. Chen, B. Francia, M. Li, and B. McKinnon. Shared Information and Program Plagiarism Detection. IEEE Transactions on Information Theory, 50(7), 1545-1551, 2004.
-
(2004)
IEEE Transactions on Information Theory
, vol.50
, Issue.7
, pp. 1545-1551
-
-
Chen, X.1
Francia, B.2
Li, M.3
McKinnon, B.4
-
19
-
-
36849049806
-
Structural and temporal analysis of the blogosphere through community factorization
-
Y. Chi, S. Zhu, X. Song, J. Tatemura, and B.L. Tseng. Structural and temporal analysis of the blogosphere through community factorization. In SIGKDD, 2007.
-
(2007)
SIGKDD
-
-
Chi, Y.1
Zhu, S.2
Song, X.3
Tatemura, J.4
Tseng, B.L.5
-
21
-
-
0013206133
-
Collection statistics for fast duplicate document detection
-
A. Chowdhury, O. Frieder, D. Grossman, and M.C. McCabe. Collection statistics for fast duplicate document detection. ACM TOIS, 20(2), 171-191, 2002.
-
(2002)
ACM TOIS
, vol.20
, Issue.2
, pp. 171-191
-
-
Chowdhury, A.1
Frieder, O.2
Grossman, D.3
McCabe, M.C.4
-
22
-
-
15044355327
-
Similarity search in high dimensions via hashing
-
A. Gionis, P. Indyk, and R. Motwani. Similarity Search in High Dimensions via Hashing. In VLDB, 1999.
-
(1999)
VLDB
-
-
Gionis, A.1
Indyk, P.2
Motwani, R.3
-
23
-
-
84944318804
-
Approximate string joins in a database (almost) for free
-
L. Gravano, P.G. Ipeirotis, H.V. Jagadish, N. Koudas, S. Muthukrishnan, and D. Srivastava. Approximate String Joins in a Database (Almost) for Free. In VLDB, 2001.
-
(2001)
VLDB
-
-
Gravano, L.1
Ipeirotis, P.G.2
Jagadish, H.V.3
Koudas, N.4
Muthukrishnan, S.5
Srivastava, D.6
-
27
-
-
0031162081
-
The SR-tree: An index structure for high-dimensional nearest neighbor queries
-
N. Katayama and S. Satoh. The SR-tree: an index structure for high-dimensional nearest neighbor queries. In SIGMOD, 1997.
-
(1997)
SIGMOD
-
-
Katayama, N.1
Satoh, S.2
-
28
-
-
47749095961
-
CDIP: Collection-driven, yet individuality-preserving automated blog tagging
-
J.W. Kim, K.S. Candan, and J. Tatemura. CDIP: Collection-Driven, yet Individuality-Preserving Automated Blog Tagging. In ICSC, 2007.
-
(2007)
ICSC
-
-
Kim, J.W.1
Candan, K.S.2
Tatemura, J.3
-
30
-
-
57349180452
-
Generating links by mining quotations
-
O. Kolak and B.N. Schilit. Generating links by mining quotations. In HT, 2008.
-
(2008)
HT
-
-
Kolak, O.1
Schilit, B.N.2
-
32
-
-
35348911985
-
Detecting near duplicates for web crawling
-
G.S. Manku, A. Jain, and A.D. Sarma. Detecting Near Duplicates for Web Crawling. In WWW, 2007.
-
(2007)
WWW
-
-
Manku, G.S.1
Jain, A.2
Sarma, A.D.3
-
33
-
-
33745797351
-
Similarity measures for tracking information flow
-
D. Metzler, Y. Bernstein, W.B. Croft, A. Moffat, and J. Zobel. Similarity Measures for Tracking Information Flow. In CIKM, 2005.
-
(2005)
CIKM
-
-
Metzler, D.1
Bernstein, Y.2
Croft, W.B.3
Moffat, A.4
Zobel, J.5
-
34
-
-
1142267351
-
Winnowing: Local algorithms for document fingerprinting
-
S. Schleimer, D.S. Wilkerson, and A. Aiken. Winnowing: Local Algorithms for Document Fingerprinting. In SIGMOD, 2003.
-
(2003)
SIGMOD
-
-
Schleimer, S.1
Wilkerson, D.S.2
Aiken, A.3
-
35
-
-
85088005959
-
Efficient set joins on similarity predicates
-
S. Sarawagi and A. Kirpal. Efficient set joins on similarity predicates. In SIGMOD, 2004.
-
(2004)
SIGMOD
-
-
Sarawagi, S.1
Kirpal, A.2
-
38
-
-
33750311279
-
Near-duplicate detection by instance-level constrained clustering
-
H. Yang and J. Callan. Near-duplicate detection by instance-level constrained clustering. In SIGIR, 2006.
-
(2006)
SIGIR
-
-
Yang, H.1
Callan, J.2
-
39
-
-
0032268976
-
Inverted files versus signature files for text indexing
-
J. Zobel, A. Moffat, and K. Ramamohanarao. Inverted files versus signature files for text indexing. ACM Transactions on Database Systems (TODS), 23(4), 453-490, Dec. 1998.
-
(1998)
ACM Transactions on Database Systems (TODS)
, vol.23
, Issue.4
, pp. 453-490
-
-
Zobel, J.1
Moffat, A.2
Ramamohanarao, K.3
-
40
-
-
66249113620
-
Efficient similarity joins for near duplicate detection
-
C. Xiao, W. Wang, X. Lin, and J.X. Yu. Efficient Similarity Joins for Near Duplicate Detection. In WWW, 2008.
-
(2008)
WWW
-
-
Xiao, C.1
Wang, W.2
Lin, X.3
Yu, J.X.4