Volume 3, Issue 3, 2009, Pages

IRLbot: Scaling to 6 billion pages and beyond

Author keywords

Crawling; IRLbot; Large scale

Indexed keywords

CRAWLING; HTML PAGES; IRLBOT; LARGE SCALE; SINGLE SERVER; WEB CRAWLERS; WEB GRAPHS

EID: 68549090722     PISSN: 15591131     EISSN: 1559114X     Source Type: Journal    
DOI: 10.1145/1541822.1541823     Document Type: Article
Times cited: 25

References (40)
  • 2
    • 84880240041
    • ARASU, A., CHO, J., GARCIA-MOLINA, H., PAEPCKE, A., AND RAGHAVAN, S. 2001. Searching the Web. ACM Trans. Internet Technol. 1, 1, 2-43.
  • 4
    • 3042680184
    • Ubicrawler: A scalable fully distributed Web crawler
    • BOLDI, P., CODENOTTI, B., SANTINI, M., AND VIGNA, S. 2004a. Ubicrawler: A scalable fully distributed Web crawler. Softw. Pract. Exper. 34, 8, 711-726.
    • (2004) Softw. Pract. Exper , vol.34 , Issue.8 , pp. 711-726
    • BOLDI, P.1    CODENOTTI, B.2    SANTINI, M.3    VIGNA, S.4
  • 5
    • 35048887031
    • Do your worst to make the best: Paradoxical effects in pagerank incremental computations
    • Algorithms and Models for the Web-Graph, Springer
    • BOLDI, P., SANTINI, M., AND VIGNA, S. 2004b. Do your worst to make the best: Paradoxical effects in pagerank incremental computations. In Algorithms and Models for the Web-Graph. Lecture Notes in Computer Science, vol. 3243. Springer, 168-180.
    • (2004) Lecture Notes in Computer Science , vol.3243 , pp. 168-180
    • BOLDI, P.1    SANTINI, M.2    VIGNA, S.3
  • 9
    • 0342652248
    • Crawling towards eternity: Building an archive of the World Wide Web
    • BURNER, M. 1997. Crawling towards eternity: Building an archive of the World Wide Web. Web Techn. Mag. 2, 5.
    • (1997) Web Techn. Mag , vol.2 , Issue.5
    • BURNER, M.1
  • 12
    • 33747096982
    • CHO, J., GARCIA-MOLINA, H., HAVELIWALA, T., LAM, W., PAEPCKE, A., AND WESLEY, S. R. G. 2006. Stanford Web base components and applications. ACM Trans. Internet Technol. 6, 2, 153-186.
  • 14
    • 9944234613
    • The rbse spider - Balancing effective search against Web load
    • EICHMANN, D. 1994. The rbse spider - Balancing effective search against Web load. In World Wide Web Conference.
    • (1994) World Wide Web Conference
    • EICHMANN, D.1
  • 16
    • 57349190615
    • Scalable computing for power law graphs: Experience with parallel pagerank
    • GLEICH, D. AND ZHUKOV, L. 2005. Scalable computing for power law graphs: Experience with parallel pagerank. In Proceedings of SuperComputing.
    • (2005) Proceedings of SuperComputing
    • GLEICH, D.1    ZHUKOV, L.2
  • 20
    • 79951675059
    • Mercator: A scalable, extensible Web crawler
    • HEYDON, A. AND NAJORK, M. 1999. Mercator: A scalable, extensible Web crawler. World Wide Web 2, 4, 219-229.
    • (1999) World Wide Web , vol.2 , Issue.4 , pp. 219-229
    • HEYDON, A.1    NAJORK, M.2
  • 22
    • 84870439191
    • Internet archive homepage
    • INTERNET ARCHIVE. Internet archive homepage. http://www.archive.org/.
    • Internet archive
  • 24
    • 17444376867
    • Exploiting the block structure of the Web for computing pagerank
    • Tech. rep, Stanford University
    • KAMVAR, S. D., HAVELIWALA, T. H., MANNING, C. D., AND GOLUB, G. H. 2003a. Exploiting the block structure of the Web for computing pagerank. Tech. rep., Stanford University.
    • (2003)
    • KAMVAR, S.D.1    HAVELIWALA, T.H.2    MANNING, C.D.3    GOLUB, G.H.4
  • 29
    • 0344079714
    • Lycos: Design choices in an Internet search service
    • MAULDIN, M. 1997. Lycos: Design choices in an Internet search service. IEEE Expert Mag. 12, 1, 8-11.
    • (1997) IEEE Expert Mag , vol.12 , Issue.1 , pp. 8-11
    • MAULDIN, M.1
  • 31
    • 0013238179
    • High-performance Web crawling
    • Compaq Systems Research Center
    • NAJORK, M. AND HEYDON, A. 2001. High-performance Web crawling. Tech. rep. 173, Compaq Systems Research Center.
    • (2001) Tech. Rep , vol.173
    • NAJORK, M.1    HEYDON, A.2
  • 33
    • 71449127664
    • We knew the Web was big
    • OFFICIAL GOOGLE BLOG. 2008. We knew the Web was big... http://googleblog.blogspot.com/2008/07/we-knew-web-was-big.html.
    • (2008)
    • OFFICIAL GOOGLE BLOG1
  • 34
    • 0343374008
    • Finding what people want: Experiences with the Web crawler
    • PINKERTON, B. 1994. Finding what people want: Experiences with the Web crawler. In World Wide Web Conference (WWW'94).
    • (1994) World Wide Web Conference (WWW'94)
    • PINKERTON, B.1
  • 39
    • 0001321490
    • External memory algorithms and data structures: Dealing with massive data
    • VITTER, J. 2001. External memory algorithms and data structures: Dealing with massive data. ACM Comput. Surv. 33, 2, 209-271.
    • (2001) ACM Comput. Surv , vol.33 , Issue.2 , pp. 209-271
    • VITTER, J.1


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.