-
1
-
-
24644523726
-
Beyond pairwise clustering
-
S. Agarwal, J. Lim, L. Zelnik-Manor, P. Perona, D. Kriegman, and S. Belongie. Beyond pairwise clustering. In CVPR, 2005.
-
(2005)
CVPR
-
-
Agarwal, S.1
Lim, J.2
Zelnik-Manor, L.3
Perona, P.4
Kriegman, D.5
Belongie, S.6
-
2
-
-
70349155038
-
Finding text reuse on the web
-
Barcelona, Spain
-
M. Bendersky and W. B. Croft. Finding text reuse on the web. In WSDM, pages 262-271, Barcelona, Spain, 2009.
-
(2009)
WSDM
, pp. 262-271
-
-
Bendersky, M.1
Croft, W.B.2
-
3
-
-
0031346696
-
On the resemblance and containment of documents
-
Positano, Italy
-
A. Z. Broder. On the resemblance and containment of documents. In the Compression and Complexity of Sequences, pages 21-29, Positano, Italy, 1997.
-
(1997)
The Compression and Complexity of Sequences
, pp. 21-29
-
-
Broder, A.Z.1
-
4
-
-
0010362121
-
Syntactic clustering of the web
-
Santa Clara, CA
-
A. Z. Broder, S. C. Glassman, M. S. Manasse, and G. Zweig. Syntactic clustering of the web. In WWW, pages 1157-1166, Santa Clara, CA, 1997.
-
(1997)
WWW
, pp. 1157-1166
-
-
Broder, A.Z.1
Glassman, S.C.2
Manasse, M.S.3
Zweig, G.4
-
5
-
-
42549139837
-
A scalable pattern mining approach to web graph compression with communities
-
Stanford, CA
-
G. Buehrer and K. Chellapilla. A scalable pattern mining approach to web graph compression with communities. In WSDM, pages 95-106, Stanford, CA, 2008.
-
(2008)
WSDM
, pp. 95-106
-
-
Buehrer, G.1
Chellapilla, K.2
-
7
-
-
0036040277
-
Similarity estimation techniques from rounding algorithms
-
Montreal, Quebec, Canada
-
M. S. Charikar. Similarity estimation techniques from rounding algorithms. In STOC, pages 380-388, Montreal, Quebec, Canada, 2002.
-
(2002)
STOC
, pp. 380-388
-
-
Charikar, M.S.1
-
8
-
-
0031624833
-
An overview of query optimization in relational systems
-
S. Chaudhuri. An Overview of Query Optimization in Relational Systems. In PODS, pages 34-43, 1998.
-
(1998)
PODS
, pp. 34-43
-
-
Chaudhuri, S.1
-
9
-
-
33749597967
-
A primitive operator for similarity joins in data cleaning
-
S. Chaudhuri, V. Ganti, and R. Kaushik. A primitive operator for similarity joins in data cleaning. In ICDE, 2006.
-
(2006)
ICDE
-
-
Chaudhuri, S.1
Ganti, V.2
Kaushik, R.3
-
10
-
-
26444550791
-
Robust identification of fuzzy duplicates
-
Tokyo, Japan
-
S. Chaudhuri, V. Ganti, and R. Motwani. Robust identification of fuzzy duplicates. In ICDE, pages 865-876, Tokyo, Japan, 2005.
-
(2005)
ICDE
, pp. 865-876
-
-
Chaudhuri, S.1
Ganti, V.2
Motwani, R.3
-
11
-
-
70350694219
-
On compressing social networks
-
Paris, France
-
F. Chierichetti, R. Kumar, S. Lattanzi, M. Mitzenmacher, A. Panconesi, and P. Raghavan. On compressing social networks. In KDD, pages 219-228, Paris, France, 2009.
-
(2009)
KDD
, pp. 219-228
-
-
Chierichetti, F.1
Kumar, R.2
Lattanzi, S.3
Mitzenmacher, M.4
Panconesi, A.5
Raghavan, P.6
-
12
-
-
50949101828
-
Approximate lexicography and web search
-
K. Church. Approximate lexicography and web search. International Journal of Lexicography, 21(3):325-336, 2008.
-
(2008)
International Journal of Lexicography
, vol.21
, Issue.3
, pp. 325-336
-
-
Church, K.1
-
13
-
-
84936824188
-
Word association norms, mutual information and lexicography
-
K. Church and P. Hanks. Word association norms, mutual information and lexicography. Computational Linguistics, 16(1):22-29, 1991.
-
(1991)
Computational Linguistics
, vol.16
, Issue.1
, pp. 22-29
-
-
Church, K.1
Hanks, P.2
-
14
-
-
0035051307
-
Finding interesting associations without support pruning
-
E. Cohen, M. Datar, S. Fujiwara, A. Gionis, P. Indyk, R. Motwani, J. D. Ullman, and C. Yang. Finding interesting associations without support pruning. IEEE Trans. on Knowl. and Data Eng., 13(1), 2001.
-
(2001)
IEEE Trans. on Knowl. and Data Eng.
, vol.13
, Issue.1
-
-
Cohen, E.1
Datar, M.2
Fujiwara, S.3
Gionis, A.4
Indyk, P.5
Motwani, R.6
Ullman, J.D.7
Yang, C.8
-
15
-
-
70349100410
-
Integration of news content into web results
-
F. Diaz. Integration of News Content into Web Results. In WSDM, 2009.
-
(2009)
WSDM
-
-
Diaz, F.1
-
16
-
-
70349754411
-
Extraction and classification of dense implicit communities in the web graph
-
Y. Dourisboure, F. Geraci, and M. Pellegrini. Extraction and classification of dense implicit communities in the web graph. ACM Trans. Web, 3(2):1-36, 2009.
-
(2009)
ACM Trans. Web
, vol.3
, Issue.2
, pp. 1-36
-
-
Dourisboure, Y.1
Geraci, F.2
Pellegrini, M.3
-
17
-
-
84880492977
-
A large-scale study of the evolution of web pages
-
Budapest, Hungary
-
D. Fetterly, M. Manasse, M. Najork, and J. L. Wiener. A large-scale study of the evolution of web pages. In WWW, pages 669-678, Budapest, Hungary, 2003.
-
(2003)
WWW
, pp. 669-678
-
-
Fetterly, D.1
Manasse, M.2
Najork, M.3
Wiener, J.L.4
-
18
-
-
77952264233
-
Efficient detection of large-scale redundancy in enterprise file systems
-
G. Forman, K. Eshghi, and J. Suermondt. Efficient detection of large-scale redundancy in enterprise file systems. SIGOPS Oper. Syst. Rev., 43(1):84-91, 2009.
-
(2009)
SIGOPS Oper. Syst. Rev.
, vol.43
, Issue.1
, pp. 84-91
-
-
Forman, G.1
Eshghi, K.2
Suermondt, J.3
-
19
-
-
84890745998
-
Blews: Using blogs to provide context for news articles
-
M. Gamon, S. Basu, D. Belenko, D. Fisher, M. Hurst, and A. C. König. Blews: Using blogs to provide context for news articles. In AAAI Conference on Weblogs and Social Media, 2008.
-
(2008)
AAAI Conference on Weblogs and Social Media
-
-
Gamon, M.1
Basu, S.2
Belenko, D.3
Fisher, D.4
Hurst, M.5
König, A.C.6
-
20
-
-
0013084629
-
-
Prentice Hall, New York, NY
-
H. Garcia-Molina, J. D. Ullman, and J. Widom. Database Systems: The Complete Book. Prentice Hall, New York, NY, 2002.
-
(2002)
Database Systems: The Complete Book
-
-
Garcia-Molina, H.1
Ullman, J.D.2
Widom, J.3
-
21
-
-
0034818527
-
Efficient and tunable similar set retrieval
-
CA
-
A. Gionis, D. Gunopulos, and N. Koudas. Efficient and tunable similar set retrieval. In SIGMOD, pages 247-258, CA, 2001.
-
(2001)
SIGMOD
, pp. 247-258
-
-
Gionis, A.1
Gunopulos, D.2
Koudas, N.3
-
22
-
-
84865627427
-
An axiomatic approach for result diversification
-
Madrid, Spain
-
S. Gollapudi and A. Sharma. An axiomatic approach for result diversification. In WWW, pages 381-390, Madrid, Spain, 2009.
-
(2009)
WWW
, pp. 381-390
-
-
Gollapudi, S.1
Sharma, A.2
-
23
-
-
84862630897
-
Hilbertian metrics and positive definite kernels on probability measures
-
Barbados
-
M. Hein and O. Bousquet. Hilbertian metrics and positive definite kernels on probability measures. In AISTATS, pages 136-143, Barbados, 2005.
-
(2005)
AISTATS
, pp. 136-143
-
-
Hein, M.1
Bousquet, O.2
-
24
-
-
0031644241
-
Approximate nearest neighbors: Towards removing the curse of dimensionality
-
Dallas, TX
-
P. Indyk and R. Motwani. Approximate nearest neighbors: Towards removing the curse of dimensionality. In STOC, pages 604-613, Dallas, TX, 1998.
-
(1998)
STOC
, pp. 604-613
-
-
Indyk, P.1
Motwani, R.2
-
25
-
-
85012195470
-
The history of histograms (abridged)
-
Y. E. Ioannidis. The history of histograms (abridged). In VLDB, 2003.
-
(2003)
VLDB
-
-
Ioannidis, Y.E.1
-
26
-
-
36849003521
-
Towards optimal bag-of-features for object categorization and semantic video retrieval
-
Amsterdam, Netherlands
-
Y. Jiang, C. Ngo, and J. Yang. Towards optimal bag-of-features for object categorization and semantic video retrieval. In CIVR, pages 494-501, Amsterdam, Netherlands, 2007.
-
(2007)
CIVR
, pp. 494-501
-
-
Jiang, Y.1
Ngo, C.2
Yang, J.3
-
27
-
-
42549096144
-
Opinion spam and analysis
-
Palo Alto, California, USA
-
N. Jindal and B. Liu. Opinion spam and analysis. In WSDM, pages 219-230, Palo Alto, California, USA, 2008.
-
(2008)
WSDM
, pp. 219-230
-
-
Jindal, N.1
Liu, B.2
-
28
-
-
44449179989
-
Collaborative data gathering in wireless sensor networks using measurement co-occurrence
-
K. Kalpakis and S. Tang. Collaborative data gathering in wireless sensor networks using measurement co-occurrence. Computer Communications, 31(10):1979-1992, 2008.
-
(2008)
Computer Communications
, vol.31
, Issue.10
, pp. 1979-1992
-
-
Kalpakis, K.1
Tang, S.2
-
29
-
-
72449201596
-
Click-through prediction for news queries
-
A. C. König, M. Gamon, and Q. Wu. Click-Through Prediction for News Queries. In SIGIR, 2009.
-
(2009)
SIGIR
-
-
König, A.C.1
Gamon, M.2
Wu, Q.3
-
30
-
-
77957718350
-
Power-law based estimation of set similarity join size
-
H. Lee, R. T. Ng, and K. Shim. Power-law based estimation of set similarity join size. In PVLDB, 2009.
-
(2009)
PVLDB
-
-
Lee, H.1
Ng, R.T.2
Shim, K.3
-
31
-
-
34748825544
-
A sketch algorithm for estimating two-way and multi-way associations
-
Preliminary results appeared in HLT/EMNLP 2005
-
P. Li and K. W. Church. A sketch algorithm for estimating two-way and multi-way associations. Computational Linguistics, 33(3):305-354, 2007 (Preliminary results appeared in HLT/EMNLP 2005).
-
(2007)
Computational Linguistics
, vol.33
, Issue.3
, pp. 305-354
-
-
Li, P.1
Church, K.W.2
-
32
-
-
85144329977
-
Conditional random sampling: A sketch-based sampling technique for sparse data
-
Vancouver, BC, Canada
-
P. Li, K. W. Church, and T. J. Hastie. Conditional random sampling: A sketch-based sampling technique for sparse data. In NIPS, pages 873-880, Vancouver, BC, Canada, 2006.
-
(2006)
NIPS
, pp. 873-880
-
-
Li, P.1
Church, K.W.2
Hastie, T.J.3
-
33
-
-
85162025458
-
One sketch for all: Theory and applications of conditional random sampling
-
Vancouver, BC, Canada
-
P. Li, K. W. Church, and T. J. Hastie. One sketch for all: Theory and applications of conditional random sampling. In NIPS, Vancouver, BC, Canada, 2008.
-
(2008)
NIPS
-
-
Li, P.1
Church, K.W.2
Hastie, T.J.3
-
34
-
-
33746094275
-
Improving random projections using marginal information
-
Pittsburgh, PA
-
P. Li, T. J. Hastie, and K. W. Church. Improving random projections using marginal information. In COLT, pages 635-649, Pittsburgh, PA, 2006.
-
(2006)
COLT
, pp. 635-649
-
-
Li, P.1
Hastie, T.J.2
Church, K.W.3
-
35
-
-
77954568754
-
B-bit minwise hashing
-
Raleigh, NC
-
P. Li and A. C. König. b-bit minwise hashing. In WWW, pages 671-680, Raleigh, NC, 2010.
-
(2010)
WWW
, pp. 671-680
-
-
Li, P.1
König, A.C.2
-
36
-
-
77954589225
-
Probabilistic frequent itemset mining in uncertain databases
-
Paris, France
-
Ludmila, K. Eshghi, C. B. M. III, J. Tucek, and A. Veitch. Probabilistic frequent itemset mining in uncertain databases. In KDD, pages 1087-1096, Paris, France, 2009.
-
(2009)
KDD
, pp. 1087-1096
-
-
Ludmila, K.1
Eshghi III, C.B.M.2
Tucek, J.3
Veitch, A.4
-
37
-
-
35348911985
-
Detecting near-duplicates for web-crawling
-
Banff, Alberta, Canada
-
G. S. Manku, A. Jain, and A. D. Sarma. Detecting Near-Duplicates for Web-Crawling. In WWW, Banff, Alberta, Canada, 2007.
-
(2007)
WWW
-
-
Manku, G.S.1
Jain, A.2
Sarma, A.D.3
-
39
-
-
70349087322
-
Less is more: Sampling the neighborhood graph makes SALSA better and faster
-
Barcelona, Spain
-
M. Najork, S. Gollapudi, and R. Panigrahy. Less is more: Sampling the neighborhood graph makes SALSA better and faster. In WSDM, pages 242-251, Barcelona, Spain, 2009.
-
(2009)
WSDM
, pp. 242-251
-
-
Najork, M.1
Gollapudi, S.2
Panigrahy, R.3
-
40
-
-
3142777876
-
Efficient set joins on similarity predicates
-
S. Sarawagi and A. Kirpal. Efficient set joins on similarity predicates. In SIGMOD, pages 743-754, 2004.
-
(2004)
SIGMOD
, pp. 743-754
-
-
Sarawagi, S.1
Kirpal, A.2
-
41
-
-
40949104348
-
Tracking web spam with HTML style similarities
-
T. Urvoy, E. Chauveau, P. Filoche, and T. Lavergne. Tracking web spam with HTML style similarities. ACM Trans. Web, 2(1):1-28, 2008.
-
(2008)
ACM Trans. Web
, vol.2
, Issue.1
, pp. 1-28
-
-
Urvoy, T.1
Chauveau, E.2
Filoche, P.3
Lavergne, T.4
-
42
-
-
67650083615
-
Mining term association patterns from search logs for effective query reformulation
-
Napa Valley, California, USA
-
X. Wang and C. Zhai. Mining term association patterns from search logs for effective query reformulation. In CIKM, pages 479-488, Napa Valley, California, USA, 2008.
-
(2008)
CIKM
, pp. 479-488
-
-
Wang, X.1
Zhai, C.2