-
1
-
-
10044242279
-
Challenges in information retrieval and language modeling: Report of a workshop held at the center for intelligent information retrieval
-
University of Massachusetts Amherst, September 2002
-
J. Allan et al. Challenges in information retrieval and language modeling: report of a workshop held at the center for intelligent information retrieval, University of Massachusetts Amherst, September 2002. SIGIR Forum, 37(1):31-47, 2003.
-
(2003)
SIGIR Forum
, vol.37
, Issue.1
, pp. 31-47
-
-
Allan, J.1
-
3
-
-
33745784229
-
Redundant documents and search effectiveness
-
Bremen, Germany
-
Y. Bernstein and J. Zobel. Redundant documents and search effectiveness. In Proc. ACM CIKM Conf., pages 736-743, Bremen, Germany, 2005.
-
(2005)
Proc. ACM CIKM Conf.
, pp. 736-743
-
-
Bernstein, Y.1
Zobel, J.2
-
4
-
-
84976810280
-
Copy detection mechanisms for digital documents
-
San Jose, California
-
S. Brin, J. Davis, and H. García-Molina. Copy detection mechanisms for digital documents. In Proc. ACM SIGMOD international conference on Management of Data, pages 398-409, San Jose, California, 1995.
-
(1995)
Proc. ACM SIGMOD International Conference on Management of Data
, pp. 398-409
-
-
Brin, S.1
Davis, J.2
García-Molina, H.3
-
5
-
-
0010362121
-
Syntactic clustering of the web
-
A. Z. Broder, S. C. Glassman, M. S. Manasse, and G. Zweig. Syntactic clustering of the web. Computer Networks and ISDN Systems, 29(8-13):1157-1166, 1997.
-
(1997)
Computer Networks and ISDN Systems
, vol.29
, Issue.8-13
, pp. 1157-1166
-
-
Broder, A.Z.1
Glassman, S.C.2
Manasse, M.S.3
Zweig, G.4
-
6
-
-
0031620041
-
Min-wise independent permutations
-
New York, NY, USA. ACM Press. ISBN 0-89791-962-9
-
A. Z. Broder, M. Charikar, A. M. Frieze, and M. Mitzenmacher. Min-wise independent permutations (extended abstract). In Proc. ACM symposium on Theory of computing (STOC), pages 327-336, New York, NY, USA, 1998. ACM Press. ISBN 0-89791-962-9.
-
(1998)
Proc. ACM Symposium on Theory of Computing (STOC)
, pp. 327-336
-
-
Broder, A.Z.1
Charikar, M.2
Frieze, A.M.3
Mitzenmacher, M.4
-
8
-
-
0029193309
-
Searching distributed collections with inference networks
-
Seattle, Washington
-
J. Callan, Z. Lu, and W. B. Croft. Searching distributed collections with inference networks. In Proc. Int. ACM-SIGIR Conf., pages 21-28, Seattle, Washington, 1995.
-
(1995)
Proc. Int. ACM-SIGIR Conf.
, pp. 21-28
-
-
Callan, J.1
Lu, Z.2
Croft, W.B.3
-
9
-
-
0013206133
-
Collection statistics for fast duplicate document detection
-
A. Chowdhury, O. Frieder, D. Grossman, and M. C. McCabe. Collection statistics for fast duplicate document detection. ACM Transactions on Information Systems, 20(2):171-191, 2002.
-
(2002)
ACM Transactions on Information Systems
, vol.20
, Issue.2
, pp. 171-191
-
-
Chowdhury, A.1
Frieder, O.2
Grossman, D.3
McCabe, M.C.4
-
10
-
-
12244271239
-
Online duplicate document detection: Signature reliability in a dynamic retrieval environment
-
New Orleans, Louisiana
-
J. G. Conrad, X. S. Guo, and C. P. Schriber. Online duplicate document detection: Signature reliability in a dynamic retrieval environment. In Proc. ACM-CIKM Conf., pages 443-452, New Orleans, Louisiana, 2003.
-
(2003)
Proc. ACM-CIKM Conf.
, pp. 443-452
-
-
Conrad, J.G.1
Guo, X.S.2
Schriber, C.P.3
-
11
-
-
0037481029
-
Detecting similar documents using salient terms
-
McLean, Virginia
-
J. W. Cooper, A. R. Coden, and E. W. Brown. Detecting similar documents using salient terms. In Proc. ACM-CIKM Conf., pages 245-251, McLean, Virginia, 2002.
-
(2002)
Proc. ACM-CIKM Conf.
, pp. 245-251
-
-
Cooper, J.W.1
Coden, A.R.2
Brown, E.W.3
-
13
-
-
33947178503
-
ProFusion: Intelligent fusion from multiple, distributed search engines
-
S. Gauch, G. Wang, and M. Gomez. ProFusion: Intelligent fusion from multiple, distributed search engines. J. Universal Computer Science, 2(9):637-649, 1996.
-
(1996)
J. Universal Computer Science
, vol.2
, Issue.9
, pp. 637-649
-
-
Gauch, S.1
Wang, G.2
Gomez, M.3
-
14
-
-
0031165871
-
STARTS: Stanford proposal for Internet meta-searching
-
Tucson, Arizona
-
L. Gravano, C. K. Chang, H. Garcia-Molina, and A. Paepcke. STARTS: Stanford proposal for Internet meta-searching. In Proc. ACM SIGMOD international conference on Management of Data, pages 207-218, Tucson, Arizona, 1997.
-
(1997)
Proc. ACM SIGMOD International Conference on Management of Data
, pp. 207-218
-
-
Gravano, L.1
Chang, C.K.2
Garcia-Molina, H.3
Paepcke, A.4
-
15
-
-
0027727494
-
Overview of the first TREC conference
-
Pittsburgh, Pennsylvania
-
D. Harman. Overview of the first TREC conference. In Proc. ACM-SIGIR Conf., pages 36-47, Pittsburgh, Pennsylvania, 1993.
-
(1993)
Proc. ACM-SIGIR Conf.
, pp. 36-47
-
-
Harman, D.1
-
16
-
-
77953053895
-
Improving text collection selection with coverage and overlap statistics
-
Chiba, Japan
-
T. Hernandez and S. Kambhampati. Improving text collection selection with coverage and overlap statistics. In Proc. Int. Conf. on World Wide Web, pages 1128-1129, Chiba, Japan, 2005.
-
(2005)
Proc. Int. Conf. on World Wide Web
, pp. 1128-1129
-
-
Hernandez, T.1
Kambhampati, S.2
-
18
-
-
12244302670
-
An efficient method to detect duplicates of web documents with the use of inverted index
-
Honolulu, Hawaii
-
S. Ilyinski, M. Kuzmin, A. Melkov, and I. Segalovich. An efficient method to detect duplicates of web documents with the use of inverted index. In Proc. Int. Conf. on World Wide Web, Honolulu, Hawaii, 2002.
-
(2002)
Proc. Int. Conf. on World Wide Web
-
-
Ilyinski, S.1
Kuzmin, M.2
Melkov, A.3
Segalovich, I.4
-
19
-
-
12244261882
-
Improved robustness of signature-based near-replica detection via lexicon randomization
-
Seattle, WA
-
A. Kolcz, A. Chowdhury, and J. Alspector. Improved robustness of signature-based near-replica detection via lexicon randomization. In Proc. ACM SIGKDD Int. Conf. on Knowledge Discovery and Data Mining, pages 605-610, Seattle, WA, 2004.
-
(2004)
Proc. ACM SIGKDD Int. Conf. on Knowledge Discovery and Data Mining
, pp. 605-610
-
-
Kolcz, A.1
Chowdhury, A.2
Alspector, J.3
-
21
-
-
85043988965
-
Finding similar files in a large file system
-
San Francisco, CA, 17-21
-
U. Manber. Finding similar files in a large file system. In Proc. USENIX Winter Technical Conf., pages 1-10, San Francisco, CA, 17-21 1994.
-
(1994)
Proc. USENIX Winter Technical Conf.
, pp. 1-10
-
-
Manber, U.1
-
22
-
-
0038544393
-
Building efficient and effective metasearch engines
-
W. Meng, C. Yu, and K. Liu. Building efficient and effective metasearch engines. ACM Computing Surveys, 34(1):48-89, 2002.
-
(2002)
ACM Computing Surveys
, vol.34
, Issue.1
, pp. 48-89
-
-
Meng, W.1
Yu, C.2
Liu, K.3
-
23
-
-
1542317683
-
Evaluating different methods of estimating retrieval quality for resource selection
-
Toronto, Canada
-
H. Nottelmann and N. Fuhr. Evaluating different methods of estimating retrieval quality for resource selection. In Proc. Int. ACM-SIGIR Conf., pages 290-297, Toronto, Canada, 2003.
-
(2003)
Proc. Int. ACM-SIGIR Conf.
, pp. 290-297
-
-
Nottelmann, H.1
Fuhr, N.2
-
24
-
-
2442500342
-
Comparing the performance of collection selection algorithms
-
A. L. Powell and J. French. Comparing the performance of collection selection algorithms. ACM Transactions on Information Systems, 21(4):412-456, 2003.
-
(2003)
ACM Transactions on Information Systems
, vol.21
, Issue.4
, pp. 412-456
-
-
Powell, A.L.1
French, J.2
-
25
-
-
33750297403
-
Detecting duplicate and near-duplicate files
-
United States Patent 6,658,423
-
W. Pugh and M. H. Henzinger. Detecting duplicate and near-duplicate files (United States Patent 6,658,423), 2003.
-
(2003)
-
-
Pugh, W.1
Henzinger, M.H.2
-
26
-
-
33748731480
-
The MetaCrawler architecture for resource aggregation on the Web
-
E. Selberg and O. Etzioni. The MetaCrawler architecture for resource aggregation on the Web. IEEE Expert, (January-February): 11-14, 1997.
-
(1997)
IEEE Expert
, Issue.JANUARY-FEBRUARY
, pp. 11-14
-
-
Selberg, E.1
Etzioni, O.2
-
27
-
-
18744392825
-
Unified utility maximization framework for resource selection
-
Washington, D.C.
-
L. Si and J. Callan. Unified utility maximization framework for resource selection. In Proc. ACM-CIKM Conf., pages 32-41, Washington, D.C., 2004.
-
(2004)
Proc. ACM-CIKM Conf.
, pp. 32-41
-
-
Si, L.1
Callan, J.2
-
28
-
-
1542347745
-
Relevant document distribution estimation method for resource selection
-
Toronto, Canada
-
L. Si and J. Callan. Relevant document distribution estimation method for resource selection. In Proc. ACM-SIGIR Conf., pages 298-305, Toronto, Canada, 2003.
-
(2003)
Proc. ACM-SIGIR Conf.
, pp. 298-305
-
-
Si, L.1
Callan, J.2
-
30
-
-
0033294891
-
Grouper: A dynamic clustering interface to web search results
-
Toronto, Canada
-
O. Zamir and O. Etzioni. Grouper: a dynamic clustering interface to web search results. In Proc. Int. Conf. on World Wide Web, pages 1361-1374, Toronto, Canada, 1999.
-
(1999)
Proc. Int. Conf. on World Wide Web
, pp. 1361-1374
-
-
Zamir, O.1
Etzioni, O.2
-
31
-
-
33745661252
-
The case of the duplicate documents: Measurement, search, and science
-
Harbin, China
-
J. Zobel and Y. Bernstein. The case of the duplicate documents: Measurement, search, and science. In Proc. Asia-Pacific Web Conf., pages 26-39, Harbin, China, 2006.
-
(2006)
Proc. Asia-Pacific Web Conf.
, pp. 26-39
-
-
Zobel, J.1
Bernstein, Y.2