-
1
-
-
1042273235
-
Zipf's law and the Internet
-
L. A. Adamic and B. A. Huberman. Zipf's law and the Internet. Glottometrics, 3:143-150, 2002.
-
(2002)
Glottometrics
, vol.3
, pp. 143-150
-
-
Adamic, L.A.1
Huberman, B.A.2
-
2
-
-
36348939065
-
-
Alexa toolbar, http://download.alexa.com/.
-
Alexa toolbar
-
-
-
3
-
-
19944375505
-
Sic transit gloria telae: Towards an understanding of the web's decay
-
Z. Bar-Yossef, A. Z. Broder, R. Kumar, and A. Tomkins. Sic transit gloria telae: towards an understanding of the web's decay. In WWW '04: Proceedings of the 13th international conference on World Wide Web, pages 328-337, 2004.
-
(2004)
WWW '04: Proceedings of the 13th international conference on World Wide Web
, pp. 328-337
-
-
Bar-Yossef, Z.1
Broder, A.Z.2
Kumar, R.3
Tomkins, A.4
-
6
-
-
0010362121
-
Syntactic clustering of the Web
-
A. Z. Broder, S. C. Glassman, M. S. Manasse, and G. Zweig. Syntactic clustering of the Web. Computer Networks & ISDN Systems, 29(8-13):1157-1166, 1997.
-
(1997)
Computer Networks & ISDN Systems
, vol.29
, Issue.8-13
, pp. 1157-1166
-
-
Broder, A.Z.1
Glassman, S.C.2
Manasse, M.S.3
Zweig, G.4
-
8
-
-
36348975596
-
-
D. Clinton. Beyond the SOAP search API, Dec. 2006. http://google-code-updates.blogspot.com/2006/12/beyond-soap-search-api.html. [9] M. Cutts. GoogleGuy's posts, June 2005. http://www.webmasterworld.com/forum30/29720.htm.
-
-
-
-
9
-
-
4944222857
-
Managing distributed collections: Evaluating web page changes, movement, and replacement
-
Z. Dalal, S. Dash, P. Dave, L. Francisco-Revilla, R. Furuta, U. Karadkar, and F. Shipman. Managing distributed collections: Evaluating web page changes, movement, and replacement. In JCDL '04: Proceedings of the 4th ACM/IEEE-CS joint conference on Digital libraries, pages 160-168, 2004.
-
(2004)
JCDL '04: Proceedings of the 4th ACM/IEEE-CS joint conference on Digital libraries
, pp. 160-168
-
-
Dalal, Z.1
Dash, S.2
Dave, P.3
Francisco-Revilla, L.4
Furuta, R.5
Karadkar, U.6
Shipman, F.7
-
11
-
-
84880492977
-
A large-scale study of the evolution of web pages
-
D. Fetterly, M. Manasse, M. Najork, and J. Wiener. A large-scale study of the evolution of web pages. In WWW '03: Proceedings of the 12th international conference on World Wide Web, pages 669-678, 2003.
-
(2003)
WWW '03: Proceedings of the 12th international conference on World Wide Web
, pp. 669-678
-
-
Fetterly, D.1
Manasse, M.2
Najork, M.3
Wiener, J.4
-
12
-
-
36348963995
-
-
J. Gait. Google says: Toolbar PageRank is for entertainment purposes only, 2004. http://forums.searchenginewatch.com/showthread.php?t=3054.
-
-
-
-
13
-
-
36349032101
-
-
Google Sitemap Protocol, https://www.google.com/webmasters/tools/docs/en/protocol.html.
-
Google Sitemap Protocol
-
-
-
14
-
-
36348971503
-
-
Google webmaster help center
-
Google webmaster help center: Webmaster guidelines, 2007. http://www.google.com/support/webmasters/bin/answer.py?answer=35769.
-
(2007)
Webmaster guidelines
-
-
-
20
-
-
0035251434
-
Persistence of web references in scientific research
-
S. Lawrence, D. M. Pennock, G. W. Flake, R. Krovetz, F. M. Coetzee, E. Glover, F. A. Nielsen, A. Kruger, and C. L. Giles. Persistence of web references in scientific research. Computer, 34(2):26-31, 2001.
-
(2001)
Computer
, vol.34
, Issue.2
, pp. 26-31
-
-
Lawrence, S.1
Pennock, D.M.2
Flake, G.W.3
Krovetz, R.4
Coetzee, F.M.5
Glover, E.6
Nielsen, F.A.7
Kruger, A.8
Giles, C.L.9
-
23
-
-
33645113275
-
Search engine coverage of the OAI-PMH corpus
-
Mar/Apr
-
F. McCown, X. Liu, M. L. Nelson, and M. Zubair. Search engine coverage of the OAI-PMH corpus. IEEE Internet Computing, 10(2):66-73, Mar/Apr 2006.
-
(2006)
IEEE Internet Computing
, vol.10
, Issue.2
, pp. 66-73
-
-
McCown, F.1
Liu, X.2
Nelson, M.L.3
Zubair, M.4
-
27
-
-
34547317670
-
Lazy preservation: Reconstructing websites by crawling the crawlers
-
F. McCown, J. A. Smith, M. L. Nelson, and J. Bollen. Lazy preservation: Reconstructing websites by crawling the crawlers. In Proceedings from the 8th ACM International Workshop on Web Information and Data Management (WIDM '06), pages 67-74, 2006.
-
(2006)
Proceedings from the 8th ACM International Workshop on Web Information and Data Management (WIDM '06)
, pp. 67-74
-
-
McCown, F.1
Smith, J.A.2
Nelson, M.L.3
Bollen, J.4
-
28
-
-
34247365334
-
An introduction to Heritrix, an archival quality web crawler
-
Sept
-
G. Mohr, M. Kimpton, M. Stack, and I. Ranitovic. An introduction to Heritrix, an archival quality web crawler. In Proceedings of the 4th International Web Archiving Workshop (IWAW '04), Sept. 2004.
-
(2004)
Proceedings of the 4th International Web Archiving Workshop (IWAW '04)
-
-
Mohr, G.1
Kimpton, M.2
Stack, M.3
Ranitovic, I.4
-
29
-
-
3042589238
-
Object persistence and availability in digital libraries
-
M. L. Nelson and B. D. Allen. Object persistence and availability in digital libraries. D-Lib Magazine, 8(1), 2002.
-
(2002)
D-Lib Magazine
, vol.8
, Issue.1
-
-
Nelson, M.L.1
Allen, B.D.2
-
30
-
-
34547614446
-
Efficient, automatic web resource harvesting
-
M. L. Nelson, J. A. Smith, I. Garcia del Campo, H. Van de Sompel, and X. Liu. Efficient, automatic web resource harvesting. In Proceedings from the 8th ACM International Workshop on Web Information and Data Management (WIDM '06), pages 43-50, 2006.
-
(2006)
Proceedings from the 8th ACM International Workshop on Web Information and Data Management (WIDM '06)
, pp. 43-50
-
-
Nelson, M.L.1
Smith, J.A.2
Garcia del Campo, I.3
Van de Sompel, H.4
Liu, X.5
-
32
-
-
36348995209
-
Court backs thumbnail image linking
-
July
-
S. Olsen. Court backs thumbnail image linking. CNET News.com, July 2003. http://news.com.com/2100-1025_3-1023629.html.
-
(2003)
CNET News.com
-
-
Olsen, S.1
-
33
-
-
36348965129
-
Google cache raises copyright concerns
-
July
-
S. Olsen. Google cache raises copyright concerns. CNET News.com, July 2003. http://news.com.com/2100-1038_3-1024234.html.
-
(2003)
CNET News.com
-
-
Olsen, S.1
-
34
-
-
0036339310
-
Methodologies for crawler based web surveys
-
M. Thelwall. Methodologies for crawler based web surveys. Internet Research, 12(2):124-138, 2002.
-
(2002)
Internet Research
, vol.12
, Issue.2
, pp. 124-138
-
-
Thelwall, M.1
-
35
-
-
33750897224
-
Web crawling ethics revisited: Cost, privacy, and denial of service
-
M. Thelwall and D. Stuart. Web crawling ethics revisited: Cost, privacy, and denial of service. Journal of the American Society for Information Science and Technology, 57(13):1771-1779, 2006.
-
(2006)
Journal of the American Society for Information Science and Technology
, vol.57
, Issue.13
, pp. 1771-1779
-
-
Thelwall, M.1
Stuart, D.2
-
36
-
-
0242450486
-
A fair history of the Web? Examining country balance in the Internet Archive
-
M. Thelwall and L. Vaughan. A fair history of the Web? Examining country balance in the Internet Archive. Library & Information Science Research, 26(2):162-176, 2004.
-
(2004)
Library & Information Science Research
, vol.26
, Issue.2
, pp. 162-176
-
-
Thelwall, M.1
Vaughan, L.2
-
37
-
-
2942610876
-
Search engine coverage bias: Evidence and possible causes
-
L. Vaughan and M. Thelwall. Search engine coverage bias: Evidence and possible causes. Information Processing & Management, 40(4):693-707, 2004.
-
(2004)
Information Processing & Management
, vol.40
, Issue.4
, pp. 693-707
-
-
Vaughan, L.1
Thelwall, M.2