2005, Pages 864-872

Crawling a country: Better strategies than breadth-first for web page ordering

Author keywords

Scheduling policy; Web crawler; Web page importance

Indexed keywords

BREADTH-FIRST; BREADTH-FIRST SEARCH; ORDERING STRATEGY; PAGERANK; SCHEDULING POLICIES; WEB CRAWLERS; WEB CRAWLING; WEB GRAPHS; WEB PAGE

EID: 77953053635     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1145/1062745.1062768     Document Type: Conference Paper
Times cited: 100

References (59)
  • 1
    • 77953036485
    • Robotcop. www.robotcop.org, 2002.
    • (2002)
  • 2
    • 15844418414
    • HT://Dig. http://www.htdig.org/, 2004. GPL software.
    • (2004) GPL Software
  • 4
    • 15844418414
    • S. Ailleret. Larbin. http://larbin.sourceforge.net/index-eng.html, 2004. GPL software.
    • (2004) GPL Software
    • Ailleret, S.1
  • 7
    • 35048834240
    • Crawling the infinite Web: Five levels are enough
    • Proceedings of the third Workshop on Web Graphs (WAW), Rome, Italy, October Springer
    • R. Baeza-Yates and C. Castillo. Crawling the infinite Web: five levels are enough. In Proceedings of the third Workshop on Web Graphs (WAW), volume 3243 of Lecture Notes in Computer Science, pages 156-167, Rome, Italy, October 2004. Springer.
    • (2004) Lecture Notes in Computer Science , vol.3243 , pp. 156-167
    • Baeza-Yates, R.1    Castillo, C.2
  • 9
    • 84958778546
    • Web structure, dynamics and page quality
    • Proceedings of String Processing and Information Retrieval (SPIRE), Lisbon, Portugal, Springer
    • R. Baeza-Yates, F. Saint-Jean, and C. Castillo. Web structure, dynamics and page quality. In Proceedings of String Processing and Information Retrieval (SPIRE), volume 2476 of Lecture Notes in Computer Science, pages 117-132, Lisbon, Portugal, 2002. Springer.
    • (2002) Lecture Notes in Computer Science , vol.2476 , pp. 117-132
    • Baeza-Yates, R.1    Saint-Jean, F.2    Castillo, C.3
  • 11
    • 35048887031
    • Do your worst to make the best: Paradoxical effects in pagerank incremental computations
    • Proceedings of the third Workshop on Web Graphs (WAW), Rome, Italy, October Springer
    • P. Boldi, M. Santini, and S. Vigna. Do your worst to make the best: Paradoxical effects in pagerank incremental computations. In Proceedings of the third Workshop on Web Graphs (WAW), volume 3243 of Lecture Notes in Computer Science, pages 168-180, Rome, Italy, October 2004. Springer.
    • (2004) Lecture Notes in Computer Science , vol.3243 , pp. 168-180
    • Boldi, P.1    Santini, M.2    Vigna, S.3
  • 13
    • 0038589165
    • The anatomy of a large-scale hypertextual Web search engine
    • April
    • S. Brin and L. Page. The anatomy of a large-scale hypertextual Web search engine. Computer Networks and ISDN Systems, 30(1-7):107-117, April 1998.
    • (1998) Computer Networks and ISDN Systems , vol.30 , Issue.1-7 , pp. 107-117
    • Brin, S.1    Page, L.2
  • 15
    • 0342652248
    • Crawling towards eternity - Building an archive of the world wide web
    • May
    • M. Burner. Crawling towards eternity - building an archive of the world wide web. Web Techniques, 2(5), May 1997.
    • (1997) Web Techniques , vol.2 , Issue.5
    • Burner, M.1
  • 18
    • 0033294474
    • Focused crawling: A new approach to topic-specific web resource discovery
    • S. Chakrabarti, M. van den Berg, and B. Dom. Focused crawling: a new approach to topic-specific web resource discovery. Computer Networks, 31(11-16):1623-1640, 1999.
    • (1999) Computer Networks , vol.31 , Issue.11-16 , pp. 1623-1640
    • Chakrabarti, S.1    Van Den Berg, M.2    Dom, B.3
  • 27
    • 77953035448
    • WebBase
    • L. Dachary. WebBase. http://freesoftware.fsf.org/webbase/, 2002. GPL Software.
    • (2002) GPL Software
    • Dachary, L.1
  • 30
    • 0002371171
    • Optimal robot scheduling for web search engines
    • E. G. Coffman Jr., Z. Liu, and R. R. Weber. Optimal robot scheduling for web search engines. Journal of Scheduling, 1(1):15-29, 1998.
    • (1998) Journal of Scheduling , vol.1 , Issue.1 , pp. 15-29
    • Coffman, E.G.1    Liu, Z.2    Weber, R.R.3
  • 31
    • 84874252492
    • An adaptive model for optimizing performance of an incremental web crawler
    • Hong Kong, May Elsevier Science
    • J. Edwards, K. S. McCurley, and J. A. Tomlin. An adaptive model for optimizing performance of an incremental web crawler. In Proceedings of the Tenth Conference on World Wide Web, pages 106-113, Hong Kong, May 2001. Elsevier Science.
    • (2001) Proceedings of the Tenth Conference on World Wide Web , pp. 106-113
    • Edwards, J.1    McCurley, K.S.2    Tomlin, J.A.3
  • 38
    • 79951675059
    • Mercator: A scalable, extensible web crawler
    • April
    • A. Heydon and M. Najork. Mercator: A scalable, extensible web crawler. World Wide Web Conference, 2(4):219-229, April 1999.
    • (1999) World Wide Web Conference , vol.2 , Issue.4 , pp. 219-229
    • Heydon, A.1    Najork, M.2
  • 40
    • 0040511952
    • Robots in the web: Threat or treat?
    • April
    • M. Koster. Robots in the web: threat or treat? ConneXions, 9(4), April 1995.
    • (1995) ConneXions , vol.9 , Issue.4
    • Koster, M.1
  • 42
    • 0000806922
    • Automating the construction of internet portals with machine learning
    • A. K. McCallum, K. Nigam, J. Rennie, and K. Seymore. Automating the construction of internet portals with machine learning. Information Retrieval, 3(2):127-163, 2000.
    • (2000) Information Retrieval , vol.3 , Issue.2 , pp. 127-163
    • McCallum, A.K.1    Nigam, K.2    Rennie, J.3    Seymore, K.4
  • 45
    • 85006710010
    • Breadth-first crawling yields high-quality pages
    • Hong Kong, May Elsevier Science
    • M. Najork and J. L. Wiener. Breadth-first crawling yields high-quality pages. In Proceedings of the Tenth Conference on World Wide Web, pages 114-118, Hong Kong, May 2001. Elsevier Science.
    • (2001) Proceedings of the Tenth Conference on World Wide Web , pp. 114-118
    • Najork, M.1    Wiener, J.L.2
  • 46
    • 15844394231
    • What's new on the web?: The evolution of the web from a search engine perspective
    • New York, NY, USA, May ACM Press
    • A. Ntoulas, J. Cho, and C. Olston. What's new on the web?: the evolution of the web from a search engine perspective. In Proceedings of the 13th conference on World Wide Web, pages 1-12, New York, NY, USA, May 2004. ACM Press.
    • (2004) Proceedings of the 13th Conference on World Wide Web , pp. 1-12
    • Ntoulas, A.1    Cho, J.2    Olston, C.3
  • 48
    • 35048813582
    • Search engine-crawler symbiosis
    • Proceedings of the European Conference on Digital Libraries (ECDL), Springer, August
    • G. Pant, S. Bradshaw, and F. Menczer. Search engine-crawler symbiosis. In Proceedings of the European Conference on Digital Libraries (ECDL), volume 2769 of Lecture Notes in Computer Science, pages 221-232. Springer, August 2003.
    • (2003) Lecture Notes in Computer Science , vol.2769 , pp. 221-232
    • Pant, G.1    Bradshaw, S.2    Menczer, F.3
  • 51
    • 0037150740
    • Search engines and web dynamics
    • June
    • K. M. Risvik and R. Michelsen. Search engines and web dynamics. Computer Networks, 39(3), June 2002.
    • (2002) Computer Networks , vol.39 , Issue.3
    • Risvik, K.M.1    Michelsen, R.2
  • 52
    • 0036204395
    • Design and implementation of a high-performance distributed web crawler
    • San Jose, California, February IEEE CS Press
    • V. Shkapenyuk and T. Suel. Design and implementation of a high-performance distributed web crawler. In Proceedings of the 18th International Conference on Data Engineering (ICDE), pages 357-368, San Jose, California, February 2002. IEEE CS Press.
    • (2002) Proceedings of the 18th International Conference on Data Engineering (ICDE) , pp. 357-368
    • Shkapenyuk, V.1    Suel, T.2
  • 54
    • 0036109905
    • Discovery of web robot sessions based on their navigational patterns
    • DOI 10.1023/A:1013228602957
    • P.-N. Tan and V. Kumar. Discovery of web robot sessions based on their navigational patterns. Data Mining and Knowledge Discovery, 6(1):9-35, 2002. (Pubitemid 37113874)
    • (2002) Data Mining and Knowledge Discovery , vol.6 , Issue.1 , pp. 9-35
    • Tan, P.-N.1    Kumar, V.2
  • 55
    • 0003962632
    • Country Profiles
    • The Economist. Country Profiles, 2002.
    • (2002) The Economist
  • 56
    • 77953073641
    • United Nations. Population Division, 2002.
  • 58
    • 84937389622
    • Design and implementation of a distributed crawler and filtering processor
    • Proceedings of the fifth Next Generation Information Technologies and Systems (NGITS), Caesarea, Israel, June Springer
    • D. Zeinalipour-Yazti and M. D. Dikaiakos. Design and implementation of a distributed crawler and filtering processor. In Proceedings of the fifth Next Generation Information Technologies and Systems (NGITS), volume 2382 of Lecture Notes in Computer Science, pages 58-74, Caesarea, Israel, June 2002. Springer.
    • (2002) Lecture Notes in Computer Science , vol.2382 , pp. 58-74
    • Zeinalipour-Yazti, D.1    Dikaiakos, M.D.2
  • 59
    • 35048868826
    • Making eigenvector-based reputation systems robust to collusion
    • Proceedings of the third Workshop on Web Graphs (WAW), Rome, Italy, October Springer
    • H. Zhang, A. Goel, R. Govindan, K. Mason, and B. V. Roy. Making eigenvector-based reputation systems robust to collusion. In Proceedings of the third Workshop on Web Graphs (WAW), volume 3243 of Lecture Notes in Computer Science, pages 92-104, Rome, Italy, October 2004. Springer.
    • (2004) Lecture Notes in Computer Science , vol.3243 , pp. 92-104
    • Zhang, H.1    Goel, A.2    Govindan, R.3    Mason, K.4    Roy, B.V.5


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.