-
1
-
-
10044242279
-
-
University of Massachusetts Amherst, September 2002, held at the Center for Intelligent Information Retrieval
-
Allan et al, J. (2003), 'Challenges in information retrieval and language modeling: report of a workshop held at the Center for Intelligent Information Retrieval, University of Massachusetts Amherst, September 2002', SIGIR Forum 37(1), 31-47.
-
(2003)
SIGIR Forum
, vol.37
, Issue.1
, pp. 31-47
-
-
Allan et al, J.1
-
2
-
-
33644537094
-
The FedLemur: federated search in the real world
-
Avrahami, T., Yau, L., Si, L. & Callan, J. (2006), 'The FedLemur: federated search in the real world', Journal of the American Society for Information Science and Technology 57(3), 347-358.
-
(2006)
Journal of the American Society for Information Science and Technology
, vol.57
, Issue.3
, pp. 347-358
-
-
Avrahami, T.1
Yau, L.2
Si, L.3
Callan, J.4
-
4
-
-
67650896026
-
Compact features for detection of near-duplicates in distributed retrieval
-
Glasgow, Scotland.
-
Bernstein, Y., Shokouhi, M. & Zobel, J. (2006), Compact features for detection of near-duplicates in distributed retrieval, in 'Proceedings of String Processing and Information Retrieval Symposium (to appear)', Glasgow, Scotland.
-
(2006)
Proceedings of String Processing and Information Retrieval Symposium (to appear)
-
-
Bernstein, Y.1
Shokouhi, M.2
Zobel, J.3
-
5
-
-
84871101442
-
A scalable system for identifying co-derivative documents
-
Padova, Italy
-
Bernstein, Y. & Zobel, J. (2004), A scalable system for identifying co-derivative documents, in 'Proceedings of String Processing and Information Retrieval Symposium', Padova, Italy, pp. 55-67.
-
(2004)
Proceedings of String Processing and Information Retrieval Symposium
, pp. 55-67
-
-
Bernstein, Y.1
Zobel, J.2
-
6
-
-
33745784229
-
Redundant documents and search effectiveness
-
Bremen, Germany
-
Bernstein, Y. & Zobel, J. (2005), Redundant documents and search effectiveness, in 'Proceedings of 14th ACM CIKM Conference on Information and Knowledge Management', Bremen, Germany, pp. 736-743.
-
(2005)
Proceedings of 14th ACM CIKM Conference on Information and Knowledge Management
, pp. 736-743
-
-
Bernstein, Y.1
Zobel, J.2
-
7
-
-
84976810280
-
Copy detection mechanisms for digital documents
-
San Jose, California
-
Brin, S., Davis, J. & García-Molina, H. (1995), Copy detection mechanisms for digital documents, in 'Proceedings of ACM SIGMOD international conference on Management of Data', San Jose, California, pp. 398-409.
-
(1995)
Proceedings of ACM SIGMOD international conference on Management of Data
, pp. 398-409
-
-
Brin, S.1
Davis, J.2
García-Molina, H.3
-
8
-
-
0010362121
-
Syntactic clustering of the web
-
Broder, A. Z., Glassman, S. C., Manasse, M. S. & Zweig, G. (1997), 'Syntactic clustering of the web', Computer Networks and ISDN Systems 29(8-13), 1157-1166.
-
(1997)
Computer Networks and ISDN Systems
, vol.29
, Issue.8-13
, pp. 1157-1166
-
-
Broder, A.Z.1
Glassman, S.C.2
Manasse, M.S.3
Zweig, G.4
-
9
-
-
0002104204
-
Query-based sampling of text databases
-
Callan, J. & Connell, M. (2001), 'Query-based sampling of text databases', ACM Transactions on Information Systems 19(2), 97-130.
-
(2001)
ACM Transactions on Information Systems
, vol.19
, Issue.2
, pp. 97-130
-
-
Callan, J.1
Connell, M.2
-
10
-
-
0347131511
-
Automatic discovery of language models for text databases
-
Philadelphia, Pennsylvania
-
Callan, J., Connell, M. & Du, A. (1999), Automatic discovery of language models for text databases, in 'Proceedings of ACM SIGMOD International Conference on Management of Data', Philadelphia, Pennsylvania, pp. 479-490.
-
(1999)
Proceedings of ACM SIGMOD International Conference on Management of Data
, pp. 479-490
-
-
Callan, J.1
Connell, M.2
Du, A.3
-
11
-
-
0002898767
-
The INQUERY retrieval system
-
Valencia, Spain
-
Callan, J., Croft, W. B. & Harding, S. M. (1992), The INQUERY retrieval system, in 'Proceedings of third International Conference on Database and Expert Systems Applications', Valencia, Spain, pp. 78-83.
-
(1992)
Proceedings of third International Conference on Database and Expert Systems Applications
, pp. 78-83
-
-
Callan, J.1
Croft, W.B.2
Harding, S.M.3
-
12
-
-
0029193309
-
Searching distributed collections with inference networks
-
Seattle, Washington
-
Callan, J., Lu, Z. & Croft, W. B. (1995), Searching distributed collections with inference networks, in 'Proceedings of 18th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval', Seattle, Washington, pp. 21-28.
-
(1995)
Proceedings of 18th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval
, pp. 21-28
-
-
Callan, J.1
Lu, Z.2
Croft, W.B.3
-
13
-
-
0013206133
-
Collection statistics for fast duplicate document detection
-
Chowdhury, A., Frieder, O., Grossman, D. & McCabe, M. C. (2002), 'Collection statistics for fast duplicate document detection', ACM Transactions on Information Systems 20(2), 171-191.
-
(2002)
ACM Transactions on Information Systems
, vol.20
, Issue.2
, pp. 171-191
-
-
Chowdhury, A.1
Frieder, O.2
Grossman, D.3
McCabe, M.C.4
-
14
-
-
12244271239
-
Online duplicate document detection: Signature reliability in a dynamic retrieval environment
-
New Orleans, Louisiana
-
Conrad, J. G., Guo, X. S. & Schriber, C. P. (2003), Online duplicate document detection: Signature reliability in a dynamic retrieval environment, in 'Proceedings of 12th ACM CIKM Conference on Information and Knowledge Management', New Orleans, Louisiana, pp. 443-452.
-
(2003)
Proceedings of 12th ACM CIKM Conference on Information and Knowledge Management
, pp. 443-452
-
-
Conrad, J.G.1
Guo, X.S.2
Schriber, C.P.3
-
15
-
-
0037481029
-
Detecting similar documents using salient terms
-
McLean, Virginia
-
Cooper, J. W., Coden, A. R. & Brown, E.W. (2002), Detecting similar documents using salient terms, in 'Proceedings of 11th ACM CIKM Conference on Information and Knowledge Management', McLean, Virginia, pp. 245-251.
-
(2002)
Proceedings of 11th ACM CIKM Conference on Information and Knowledge Management
, pp. 245-251
-
-
Cooper, J.W.1
Coden, A.R.2
Brown, E.W.3
-
16
-
-
0033653017
-
Server selection on the World Wide Web
-
San Antonio, Texas
-
Craswell, N., Bailey, P. & Hawking, D. (2000), Server selection on the World Wide Web, in 'Proceedings of Fifth ACM Conference on Digital Libraries', San Antonio, Texas, pp. 37-46.
-
(2000)
Proceedings of Fifth ACM Conference on Digital Libraries
, pp. 37-46
-
-
Craswell, N.1
Bailey, P.2
Hawking, D.3
-
17
-
-
8644280055
-
Overview of the TREC-2002 Web Track
-
Gaithersburg, Maryland.
-
Craswell, N. & Hawking, D. (2002), Overview of the TREC-2002 Web Track, in 'Proceedings of TREC-2002', Gaithersburg, Maryland.
-
(2002)
Proceedings of TREC-2002
-
-
Craswell, N.1
Hawking, D.2
-
18
-
-
0002818648
-
Combining approaches to information retrieval
-
chapter 1
-
Croft, B. (2000), 'Combining approaches to information retrieval', Advances in information retrieval, chapter 1 pp. 1-36.
-
(2000)
Advances in information retrieval
, pp. 1-36
-
-
Croft, B.1
-
19
-
-
33745649335
-
Is CORI effective for collection selection? an exploration of parameters, queries, and data
-
Melbourne, Australia
-
D'Souza, D., Zobel, J. & Thom, J. (2004), Is CORI effective for collection selection? an exploration of parameters, queries, and data, in 'Proceedings of Australian Document Computing Symposium', Melbourne, Australia, pp. 41-46.
-
(2004)
Proceedings of Australian Document Computing Symposium
, pp. 41-46
-
-
D'Souza, D.1
Zobel, J.2
Thom, J.3
-
20
-
-
84945137687
-
On the evolution of clusters of near-duplicate web pages
-
IEEE
-
Fetterly, D., Manasse, M. & Najork, M. (2003), On the evolution of clusters of near-duplicate web pages, in 'Proceedings of first Latin American Web Congress', IEEE, pp. 37-45.
-
(2003)
Proceedings of first Latin American Web Congress
, pp. 37-45
-
-
Fetterly, D.1
Manasse, M.2
Najork, M.3
-
21
-
-
0000824946
-
Combination of multiple searches
-
NIST Special Publication, Gaithersburg, Maryland
-
Fox, E. & Shaw, J. (1994), Combination of multiple searches, in 'Proceedings of TREC-1994', NIST Special Publication, Gaithersburg, Maryland, pp. 105-108.
-
(1994)
Proceedings of TREC-1994
, pp. 105-108
-
-
Fox, E.1
Shaw, J.2
-
22
-
-
33947178503
-
ProFusion: Intelligent fusion from multiple, distributed search engines
-
Gauch, S., Wang, G. & Gomez, M. (1996), 'ProFusion: Intelligent fusion from multiple, distributed search engines', Journal Universal Computer Science 2(9), 637-649.
-
(1996)
Journal Universal Computer Science
, vol.2
, Issue.9
, pp. 637-649
-
-
Gauch, S.1
Wang, G.2
Gomez, M.3
-
23
-
-
0031165871
-
STARTS: Stanford proposal for Internet metasearching
-
Tucson, Arizona
-
Gravano, L., Chang, C. K., Garcia-Molina, H. & Paepcke, A. (1997), STARTS: Stanford proposal for Internet metasearching, in 'Proceedings of ACM SIGMOD International Conference on Management of Data', Tucson, Arizona, pp. 207-218.
-
(1997)
Proceedings of ACM SIGMOD International Conference on Management of Data
, pp. 207-218
-
-
Gravano, L.1
Chang, C.K.2
Garcia-Molina, H.3
Paepcke, A.4
-
24
-
-
0001511080
-
GlOSS: text-source discovery over the Internet
-
Gravano, L., Garcia-Molina, H. & Tomasic, A. (1999), 'GlOSS: text-source discovery over the Internet', ACM Transactions on Database Systems 24(2), 229-264.
-
(1999)
ACM Transactions on Database Systems
, vol.24
, Issue.2
, pp. 229-264
-
-
Gravano, L.1
Garcia-Molina, H.2
Tomasic, A.3
-
25
-
-
84885572144
-
Server selection methods in hybrid portal search
-
Salvador, Brazil
-
Hawking, D. & Thomas, P. (2005), Server selection methods in hybrid portal search, in 'Proceedings of 28th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval', Salvador, Brazil, pp. 75-82.
-
(2005)
Proceedings of 28th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval
, pp. 75-82
-
-
Hawking, D.1
Thomas, P.2
-
27
-
-
0037319544
-
Methods for identifying versioned and plagiarised documents
-
Hoad, T. C. & Zobel, J. (2003), 'Methods for identifying versioned and plagiarised documents', Journal of the American Society for Information Science and Technology 54(3), 203-215.
-
(2003)
Journal of the American Society for Information Science and Technology
, vol.54
, Issue.3
, pp. 203-215
-
-
Hoad, T.C.1
Zobel, J.2
-
28
-
-
12244302670
-
An efficient method to detect duplicates of web documents with the use of inverted index
-
Honolulu, Hawaii.
-
Ilyinski, S., Kuzmin, M., Melkov, A. & Segalovich, I. (2002), An efficient method to detect duplicates of web documents with the use of inverted index, in 'Proceedings of 11th International Conference on World Wide Web', Honolulu, Hawaii.
-
(2002)
Proceedings of 11th International Conference on World Wide Web
-
-
Ilyinski, S.1
Kuzmin, M.2
Melkov, A.3
Segalovich, I.4
-
29
-
-
2442626107
-
Distributed search over the hidden Web: Hierarchical database sampling and selection
-
Hong Kong, China
-
Ipeirotis, P. G. & Gravano, L. (2002), Distributed search over the hidden Web: Hierarchical database sampling and selection, in 'Proceedings of 28th International Conference on Very Large Data Bases', Hong Kong, China, pp. 394-405.
-
(2002)
Proceedings of 28th International Conference on Very Large Data Bases
, pp. 394-405
-
-
Ipeirotis, P.G.1
Gravano, L.2
-
30
-
-
3142691239
-
When one sample is not enough: improving text database selection using shrinkage
-
Paris, France
-
Ipeirotis, P. G. & Gravano, L. (2004), When one sample is not enough: improving text database selection using shrinkage, in 'Proceedings of ACM SIGMOD International Conference on Management of Data', Paris, France, pp. 767-778.
-
(2004)
Proceedings of ACM SIGMOD International Conference on Management of Data
, pp. 767-778
-
-
Ipeirotis, P.G.1
Gravano, L.2
-
32
-
-
12244261882
-
Improved robustness of signature-based near-replica detection via lexicon randomization
-
Seattle, WA
-
Kolcz, A., Chowdhury, A. & Alspector, J. (2004), Improved robustness of signature-based near-replica detection via lexicon randomization, in 'Proceedings of ACM SIGKDD International Conference on Knowledge Discovery and Data Mining', Seattle, WA, pp. 605-610.
-
(2004)
Proceedings of ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
, pp. 605-610
-
-
Kolcz, A.1
Chowdhury, A.2
Alspector, J.3
-
33
-
-
0030657238
-
Analyses of multiple evidence combination
-
Philadelphia, Pennsylvania, United States
-
Lee, J. (1997), Analyses of multiple evidence combination, in 'Proceedings of the 20th annual international ACM SIGIR conference on Research and development in information retrieval', Philadelphia, Pennsylvania, United States, pp. 267-276.
-
(1997)
Proceedings of the 20th annual international ACM SIGIR conference on Research and development in information retrieval
, pp. 267-276
-
-
Lee, J.1
-
34
-
-
85126922087
-
Detecting short passages of similar text in large document collections
-
Philadelphia, Pennsylvania.
-
Lyon, C., Malcolm, J. & Dickerson, B. (2001), Detecting short passages of similar text in large document collections, in 'Proceedings of Conference on Empirical Methods in Natural Language Processing', Philadelphia, Pennsylvania.
-
(2001)
Proceedings of Conference on Empirical Methods in Natural Language Processing
-
-
Lyon, C.1
Malcolm, J.2
Dickerson, B.3
-
35
-
-
85043988965
-
Finding similar files in a large file system
-
San Francisco, CA
-
Manber, U. (1994), Finding similar files in a large file system, in 'Proceedings of USENIX Winter Technical Conference', San Francisco, CA, pp. 1-10.
-
(1994)
Proceedings of USENIX Winter Technical Conference
, pp. 1-10
-
-
Manber, U.1
-
36
-
-
0038544393
-
Building efficient and effective metasearch engines
-
Meng, W., Yu, C. & Liu, K. (2002), 'Building efficient and effective metasearch engines', ACM Computing Surveys 34(1), 48-89.
-
(2002)
ACM Computing Surveys
, vol.34
, Issue.1
, pp. 48-89
-
-
Meng, W.1
Yu, C.2
Liu, K.3
-
37
-
-
1542317683
-
Evaluating different methods of estimating retrieval quality for resource selection
-
Toronto, Canada
-
Nottelmann, H. & Fuhr, N. (2003), Evaluating different methods of estimating retrieval quality for resource selection, in 'Proceedings of 26th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval', Toronto, Canada, pp. 290-297.
-
(2003)
Proceedings of 26th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval
, pp. 290-297
-
-
Nottelmann, H.1
Fuhr, N.2
-
38
-
-
2442500342
-
Comparing the performance of collection selection algorithms
-
Powell, A. L. & French, J. (2003), 'Comparing the performance of collection selection algorithms', ACM Transactions on Information Systems 21(4), 412-456.
-
(2003)
ACM Transactions on Information Systems
, vol.21
, Issue.4
, pp. 412-456
-
-
Powell, A.L.1
French, J.2
-
40
-
-
0035751081
-
Approaches to collection selection and results merging for distributed information retrieval
-
Atlanta, Georgia
-
Rasolofo, Y., Abbaci, F. & Savoy, J. (2001), Approaches to collection selection and results merging for distributed information retrieval, in 'Proceedings of 10th ACM CIKM International Conference on Information and knowledge management', Atlanta, Georgia, pp. 191-198.
-
(2001)
Proceedings of 10th ACM CIKM International Conference on Information and knowledge management
, pp. 191-198
-
-
Rasolofo, Y.1
Abbaci, F.2
Savoy, J.3
-
41
-
-
33748731480
-
The MetaCrawler architecture for resource aggregation on the web
-
Selberg, E. & Etzioni, O. (1997), 'The MetaCrawler architecture for resource aggregation on the web', IEEE Expert 12(1), 8-14.
-
(1997)
IEEE Expert
, vol.12
, Issue.1
, pp. 8-14
-
-
Selberg, E.1
Etzioni, O.2
-
42
-
-
33745637498
-
Sample sizes for query probing in uncooperative distributed information retrieval
-
Harbin, China
-
Shokouhi, M., Scholer, F. & Zobel, J. (2006), Sample sizes for query probing in uncooperative distributed information retrieval, in 'Proceedings of Eighth Asia Pacific Web Conference', Harbin, China, pp. 63-75.
-
(2006)
Proceedings of Eighth Asia Pacific Web Conference
, pp. 63-75
-
-
Shokouhi, M.1
Scholer, F.2
Zobel, J.3
-
43
-
-
1542347745
-
Relevant document distribution estimation method for resource selection
-
Toronto, Canada
-
Si, L. & Callan, J. (2003a), Relevant document distribution estimation method for resource selection, in 'Proceedings of 26th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval', Toronto, Canada, pp. 298-305.
-
(2003)
Proceedings of 26th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval
, pp. 298-305
-
-
Si, L.1
Callan, J.2
-
44
-
-
2442515614
-
A semisupervised learning method to merge search engine results
-
Si, L. & Callan, J. (2003b), 'A semisupervised learning method to merge search engine results', ACM Transactions on Information Systems 21(4), 457-491.
-
(2003)
ACM Transactions on Information Systems
, vol.21
, Issue.4
, pp. 457-491
-
-
Si, L.1
Callan, J.2
-
45
-
-
18744392825
-
Unified utility maximization framework for resource selection
-
Washington, D.C.
-
Si, L. & Callan, J. (2004), Unified utility maximization framework for resource selection, in 'Proceedings of 13th ACM CIKM Conference on Information and Knowledge Management', Washington, D.C., pp. 32-41.
-
(2004)
Proceedings of 13th ACM CIKM Conference on Information and Knowledge Management
, pp. 32-41
-
-
Si, L.1
Callan, J.2
-
47
-
-
0037480958
-
A language modeling framework for resource selection and results merging
-
New York, NY
-
Si, L., Jin, R., Callan, J. & Ogilvie, P. (2002), A language modeling framework for resource selection and results merging, in 'Proceedings of 11th ACM CIKM International Conference on Information and Knowledge Management', New York, NY, pp. 391-397.
-
(2002)
Proceedings of 11th ACM CIKM International Conference on Information and Knowledge Management
, pp. 391-397
-
-
Si, L.1
Jin, R.2
Callan, J.3
Ogilvie, P.4
-
48
-
-
0033705619
-
Query routing for web search engines: architectures and experiments
-
North-Holland Publishing Co., Amsterdam, The Netherlands
-
Sugiura, A. & Etzioni, O. (2000), Query routing for web search engines: architectures and experiments, in 'Proceedings of the 9th international World Wide Web conference on Computer networks', North-Holland Publishing Co., Amsterdam, The Netherlands, pp. 417-429.
-
(2000)
Proceedings of the 9th international World Wide Web conference on Computer networks
, pp. 417-429
-
-
Sugiura, A.1
Etzioni, O.2
-
49
-
-
84872854707
-
Result merging methods in distributed information retrieval with overlapping databases
-
(In press).
-
Wu, S. & McClean, S. (2006), 'Result merging methods in distributed information retrieval with overlapping databases', Journal of Information Retrieval (In press).
-
(2006)
Journal of Information Retrieval
-
-
Wu, S.1
McClean, S.2
-
50
-
-
0032275565
-
Effective retrieval with distributed collections
-
Melbourne, Australia
-
Xu, J. & Callan, J. (1998), Effective retrieval with distributed collections, in 'Proceedings of 21st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval', Melbourne, Australia, pp. 112-120.
-
(1998)
Proceedings of 21st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval
, pp. 112-120
-
-
Xu, J.1
Callan, J.2
-
51
-
-
0002326831
-
Server ranking for distributed text retrieval systems on the internet
-
World Scientific Press, Melbourne, Australia
-
Yuwono, B. & Lee, D. L. (1997), Server ranking for distributed text retrieval systems on the internet, in 'Proceedings of the Fifth International Conference on Database Systems for Advanced Applications (DASFAA)', World Scientific Press, Melbourne, Australia, pp. 41-50.
-
(1997)
Proceedings of the Fifth International Conference on Database Systems for Advanced Applications (DASFAA)
, pp. 41-50
-
-
Yuwono, B.1
Lee, D.L.2
-
52
-
-
0033294891
-
Grouper: a dynamic clustering interface to web search results
-
Toronto, Canada
-
Zamir, O. & Etzioni, O. (1999), Grouper: a dynamic clustering interface to web search results, in 'Proceedings of 8th International Conference on World Wide Web', Toronto, Canada, pp. 1361-1374.
-
(1999)
Proceedings of 8th International Conference on World Wide Web
, pp. 1361-1374
-
-
Zamir, O.1
Etzioni, O.2
-
53
-
-
1842841628
-
Collection selection via lexicon inspection
-
in P. Bruza, ed., 'Proceedings of the Australian Document Computing Symposium'
-
Zobel, J. (1997), Collection selection via lexicon inspection, in P. Bruza, ed., 'Proceedings of the Australian Document Computing Symposium', pp. 74-80.
-
(1997)
Proceedings of the Australian Document Computing Symposium
, pp. 74-80
-
-
Zobel, J.1
-
54
-
-
33745661252
-
The case of the duplicate documents: Measurement, search, and science
-
Harbin, China
-
Zobel, J. & Bernstein, Y. (2006), The case of the duplicate documents: Measurement, search, and science, in 'Proceedings of Eighth Asia Pacific Web Conference', Harbin, China, pp. 26-39.
-
(2006)
Proceedings of Eighth Asia Pacific Web Conference
, pp. 26-39
-
-
Zobel, J.1
Bernstein, Y.2