-
1
-
-
56349136928
-
Random sampling from a search engine's index
-
Z. Bar-Yossef and M. Gurevich. Random sampling from a search engine's index. J. ACM, 55(5):1-74, 2008.
-
(2008)
J. ACM
, vol.55
, Issue.5
, pp. 1-74
-
-
Bar-Yossef, Z.1
Gurevich, M.2
-
2
-
-
0010012348
-
Machine-made index for technical literature - an experiment
-
P. Baxendale. Machine-made index for technical literature - an experiment. IBM J. Research and Development, 2:354-361, October 1958.
-
(1958)
IBM J. Research and Development
, vol.2
, pp. 354-361
-
-
Baxendale, P.1
-
3
-
-
20444387298
-
A technique for measuring the relative size and overlap of public web search engines
-
K. Bharat and A. Broder. A technique for measuring the relative size and overlap of public web search engines. Comput. Netw. ISDN Syst., 30(1-7):379-388, 1998.
-
(1998)
Comput. Netw. ISDN Syst
, vol.30
, Issue.1-7
, pp. 379-388
-
-
Bharat, K.1
Broder, A.2
-
4
-
-
80053375619
-
Large language models in machine translation
-
ACL
-
T. Brants, A. C. Popat, P. Xu, F. J. Och, and J. Dean. Large language models in machine translation. In Proc. Joint Conf. Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL), pp. 858-867. ACL, 2007.
-
(2007)
Proc. Joint Conf. Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL)
, pp. 858-867
-
-
Brants, T.1
Popat, A.C.2
Xu, P.3
Och, F.J.4
Dean, J.5
-
7
-
-
34547629212
-
Estimating corpus size via queries
-
ACM
-
A. Broder, M. Fontoura, V. Josifovski, R. Kumar, R. Motwani, S. Nabar, R. Panigrahy, A. Tomkins, and Y. Xu. Estimating corpus size via queries. In Proc. Int. Conf. Info. and Knowledge Management (CIKM), pp. 594-603. ACM, 2006.
-
(2006)
Proc. Int. Conf. Info. and Knowledge Management (CIKM)
, pp. 594-603
-
-
Broder, A.1
Fontoura, M.2
Josifovski, V.3
Kumar, R.4
Motwani, R.5
Nabar, S.6
Panigrahy, R.7
Tomkins, A.8
Xu, Y.9
-
9
-
-
79956075292
-
Identifying and filtering near-duplicate documents
-
Springer-Verlag
-
A. Z. Broder. Identifying and filtering near-duplicate documents. In Proc. Symp. Combinatorial Pattern Matching (CPM), pp. 1-10. Springer-Verlag, 2000.
-
(2000)
Proc. Symp. Combinatorial Pattern Matching (CPM)
, pp. 1-10
-
-
Broder, A.Z.1
-
10
-
-
0031620041
-
Min-wise independent permutations (extended abstract)
-
ACM
-
A. Z. Broder, M. Charikar, A. M. Frieze, and M. Mitzenmacher. Min-wise independent permutations (extended abstract). In Proc. Symp. Theory of Computing (STOC), pp. 327-336. ACM, 1998.
-
(1998)
Proc. Symp. Theory of Computing (STOC)
, pp. 327-336
-
-
Broder, A.Z.1
Charikar, M.2
Frieze, A.M.3
Mitzenmacher, M.4
-
11
-
-
0010362121
-
Syntactic clustering of the web
-
A. Z. Broder, S. C. Glassman, M. S. Manasse, and G. Zweig. Syntactic clustering of the web. Comput. Netw. ISDN Syst., 29(8-13):1157-1166, 1997.
-
(1997)
Comput. Netw. ISDN Syst
, vol.29
, Issue.8-13
, pp. 1157-1166
-
-
Broder, A.Z.1
Glassman, S.C.2
Manasse, M.S.3
Zweig, G.4
-
13
-
-
0036040277
-
Similarity estimation techniques from rounding algorithms
-
ACM
-
M. S. Charikar. Similarity estimation techniques from rounding algorithms. In Proc. Symp. Theory of Computing (STOC), pp. 380-388. ACM, 2002.
-
(2002)
Proc. Symp. Theory of Computing (STOC)
, pp. 380-388
-
-
Charikar, M.S.1
-
18
-
-
1842637192
-
Cumulated gain-based evaluation of IR techniques
-
K. Järvelin and J. Kekäläinen. Cumulated gain-based evaluation of IR techniques. ACM Trans. Inf. Syst., 20(4):422-446, 2002.
-
(2002)
ACM Trans. Inf. Syst
, vol.20
, Issue.4
, pp. 422-446
-
-
Järvelin, K.1
Kekäläinen, J.2
-
19
-
-
0000159640
-
A statistical approach to mechanized encoding and searching of literary information
-
H. Luhn. A statistical approach to mechanized encoding and searching of literary information. IBM J. Research and Development, 1(4):309-317, 1957.
-
(1957)
IBM J. Research and Development
, vol.1
, Issue.4
, pp. 309-317
-
-
Luhn, H.1
-
20
-
-
0000880768
-
The automatic creation of literature abstracts
-
H. Luhn. The automatic creation of literature abstracts. IBM J. Research and Development, 2(2), 1958.
-
(1958)
IBM J. Research and Development
, vol.2
, Issue.2
-
-
Luhn, H.1
-
22
-
-
74549217709
-
Consistent weighted sampling of multisets and distributions
-
U.S. Patent Appl, Sep
-
F. D. McSherry, K. Talwar, and M. D. Manasse. Consistent weighted sampling of multisets and distributions. U.S. Patent Appl., Sep 2008.
-
(2008)
-
-
McSherry, F.D.1
Talwar, K.2
Manasse, M.D.3
-
24
-
-
0019695565
-
A performance evaluation of similarity measures, document term weighting schemes and representations in a boolean environment
-
ACM
-
T. Noreault, M. McGill, and M. Koll. A performance evaluation of similarity measures, document term weighting schemes and representations in a boolean environment. In Proc. of Conf. on Research and Dev. in Info. Retrieval (SIGIR), pp. 57-76. ACM, 1981.
-
(1981)
Proc. of Conf. on Research and Dev. in Info. Retrieval (SIGIR)
, pp. 57-76
-
-
Noreault, T.1
McGill, M.2
Koll, M.3
-
25
-
-
33845305256
-
Retrieving similar documents from the Web
-
A. Pereira Jr. and N. Ziviani. Retrieving similar documents from the Web. J. Web Engineering, 2(4):247-261, 2004.
-
(2004)
J. Web Engineering
, vol.2
, Issue.4
, pp. 247-261
-
-
Pereira Jr., A.1
Ziviani, N.2
-
28
-
-
0023981493
-
Historical note: Information retrieval and the future of an illusion
-
D. R. Swanson. Historical note: Information retrieval and the future of an illusion. J. American Society for Information Science, 39(2):92-98, 1988.
-
(1988)
J. American Society for Information Science
, vol.39
, Issue.2
, pp. 92-98
-
-
Swanson, D.R.1
-
29
-
-
35348878593
-
Designing efficient sampling techniques to detect webpage updates
-
ACM
-
Q. Tan, Z. Zhuang, P. Mitra, and C. L. Giles. Designing efficient sampling techniques to detect webpage updates. In Proc. Int. Conf. World Wide Web (WWW), pp. 1147-1148. ACM, 2007.
-
(2007)
Proc. Int. Conf. World Wide Web (WWW)
, pp. 1147-1148
-
-
Tan, Q.1
Zhuang, Z.2
Mitra, P.3
Giles, C.L.4
-
30
-
-
70349111073
-
Query by document
-
ACM
-
Y. Yang, N. Bansal, W. Dakka, P. Ipeirotis, N. Koudas, and D. Papadias. Query by document. In Proc. Int. Conf. Web Search and Data Mining, pp. 34-43. ACM, 2009.
-
(2009)
Proc. Int. Conf. Web Search and Data Mining
, pp. 34-43
-
-
Yang, Y.1
Bansal, N.2
Dakka, W.3
Ipeirotis, P.4
Koudas, N.5
Papadias, D.6