-
1
-
-
1042273235
-
Zipf's Law and the internet
-
L. A. Adamic and B. A. Huberman. Zipf's Law and the Internet. Glottometrics, 3:143-150, 2002.
-
(2002)
Glottometrics
, vol.3
, pp. 143-150
-
-
Adamic, L.A.1
Huberman, B.A.2
-
2
-
-
33646341987
-
Methods for comparing rankings of search engine results
-
J. Bar-Ilan, M. Mat-Hassan, and M. Levene. Methods for Comparing Rankings of Search Engine Results. Computer Networks, 50(10):1448-1463, 2006.
-
(2006)
Computer Networks
, vol.50
, Issue.10
, pp. 1448-1463
-
-
Bar-Ilan, J.1
Mat-Hassan, M.2
Levene, M.3
-
5
-
-
77951111535
-
-
A. Franz and T. Brants. All Our N-Gram are Belong to You. http://googleresearch.blogspot.com/2006/08/all-our-n-gram-are-belong-to-you.html.
-
-
-
Franz, A.1
Brants, T.2
-
7
-
-
0003700089
-
Overview of the TREC-9 Web Track
-
D. Hawking. Overview of the TREC-9 Web Track. In NIST Special Publication 500-249: TREC-9, pages 87-102, 2001.
-
(2001)
NIST Special Publication 500-249: TREC-9
, pp. 87-102
-
-
Hawking, D.1
-
8
-
-
2442626107
-
Distributed search over the hidden web: Hierarchical database sampling and selection
-
VLDB Endowment
-
P. G. Ipeirotis and L. Gravano. Distributed search over the hidden web: Hierarchical database sampling and selection. In Proceedings of VLDB '02, pages 394-405. VLDB Endowment, 2002.
-
(2002)
Proceedings of VLDB '02
, pp. 394-405
-
-
Ipeirotis, P.G.1
Gravano, L.2
-
9
-
-
0344154400
-
Using the web to obtain frequencies for unseen bigrams
-
F. Keller and M. Lapata. Using the Web to Obtain Frequencies for Unseen Bigrams. Computational Linguistics, 29(3):459-484, 2003.
-
(2003)
Computational Linguistics
, vol.29
, Issue.3
, pp. 459-484
-
-
Keller, F.1
Lapata, M.2
-
10
-
-
77951145238
-
Approximating document frequency with term count values
-
Old Dominion University
-
M. Klein and M. L. Nelson. Approximating Document Frequency with Term Count Values. Technical Report arXiv:0807.3755, Old Dominion University, 2008.
-
(2008)
Technical Report arXiv:0807.3755
-
-
Klein, M.1
Nelson, M.L.2
-
11
-
-
70450246189
-
Revisiting lexical signatures to (Re-)discover web pages
-
M. Klein and M. L. Nelson. Revisiting Lexical Signatures to (Re-)Discover Web Pages. In Proceedings of ECDL '08, 2008.
-
(2008)
Proceedings of ECDL '08
-
-
Klein, M.1
Nelson, M.L.2
-
12
-
-
12244261882
-
Improved robustness of signature-based near-replica detection via lexicon randomization
-
A. Kolcz, A. Chowdhury, and J. Alspector. Improved Robustness of Signature-Based Near-Replica Detection via Lexicon Randomization. In Proceedings of KDD '04, pages 605-610, 2004.
-
(2004)
Proceedings of KDD '04
, pp. 605-610
-
-
Kolcz, A.1
Chowdhury, A.2
Alspector, J.3
-
14
-
-
33746035286
-
Automated extraction of hit numbers from search result pages
-
Y. Ling, X. Meng, and W. Meng. Automated extraction of hit numbers from search result pages. In Proceedings of WAIM '06, pages 73-84, 2006.
-
(2006)
Proceedings of WAIM '06
, pp. 73-84
-
-
Ling, Y.1
Meng, X.2
Meng, W.3
-
15
-
-
36349016704
-
Agreeing to disagree: Search engines and their public interfaces
-
F. McCown and M. L. Nelson. Agreeing to Disagree: Search Engines and their Public Interfaces. In Proceedings of JCDL '07, pages 309-318, 2007.
-
(2007)
Proceedings of JCDL '07
, pp. 309-318
-
-
McCown, F.1
Nelson, M.L.2
-
16
-
-
34547317670
-
Lazy preservation: Reconstructing websites by crawling the crawlers
-
F. McCown, J. A. Smith, and M. L. Nelson. Lazy Preservation: Reconstructing Websites by Crawling the Crawlers. In Proceedings of WIDM '06, pages 67-74, 2006.
-
(2006)
Proceedings of WIDM '06
, pp. 67-74
-
-
McCown, F.1
Smith, J.A.2
Nelson, M.L.3
-
17
-
-
84962711699
-
A study of using search engine page hits as a proxy for n-gram frequencies
-
P. Nakov and M. Hearst. A Study of Using Search Engine Page Hits as a Proxy for n-gram Frequencies. In Proceedings of RANLP '05, 2005.
-
(2005)
Proceedings of RANLP '05
-
-
Nakov, P.1
Hearst, M.2
-
18
-
-
9144269133
-
Analysis of lexical signatures for improving information persistence on the world wide web
-
S.-T. Park, D. M. Pennock, C. L. Giles, and R. Krovetz. Analysis of Lexical Signatures for Improving Information Persistence on the World Wide Web. ACM Transactions on Information Systems, 22(4):540-572, 2004.
-
(2004)
ACM Transactions on Information Systems
, vol.22
, Issue.4
, pp. 540-572
-
-
Park, S.-T.1
Pennock, D.M.2
Giles, C.L.3
Krovetz, R.4
-
23
-
-
33745644751
-
Wordrank-based lexical signatures for finding lost or related web pages
-
X. Wan and J. Yang. Wordrank-based Lexical Signatures for Finding Lost or Related Web Pages. In APWeb, pages 843-849, 2006.
-
(2006)
APWeb
, pp. 843-849
-
-
Wan, X.1
Yang, J.2
-
24
-
-
0034852836
-
Improving trigram language modeling with the world wide web
-
X. Zhu and R. Rosenfeld. Improving Trigram Language Modeling with the World Wide Web. In Proceedings of ICASSP '01, pages 533-536, 2001.
-
(2001)
Proceedings of ICASSP '01
, pp. 533-536
-
-
Zhu, X.1
Rosenfeld, R.2