



Volume , Issue , 2009, Pages 1987-1990

URL normalization for de-duplication of web pages

Author keywords

Decision tree; Page importance; Search engines; URL de-duplication

Indexed keywords

BUILDING BLOCKS; EXPERIMENTAL EVALUATION; MACHINE LEARNING TECHNIQUES; MINE RULES; PAGE IMPORTANCE; RULE EXTRACTION; SET OF RULES; URL NORMALIZATION; WEB PAGE; WEB SEARCHES

EID: 74549172900     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1145/1645953.1646283     Document Type: Conference Paper
Times cited : (24)

References (12)
  • 8
    • 33750296887
    • M. Henzinger. Finding near-duplicate web pages: a large-scale evaluation of algorithms. In SIGIR '06: Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval, pages 284-291, August 2006.
  • 9
    • 17444376867
    • S. Kamvar, T. Haveliwala, C. Manning, and G. Golub. Exploiting the block structure of the web for computing pagerank. Technical report, Stanford University, 2003.
  • 10
    • 35348911985
    • G. S. Manku, A. Jain, and A. D. Sarma. Detecting near-duplicates for web crawling. In WWW '07: Proceedings of the 16th international conference on World Wide Web, pages 141-150, May 2007.
  • 11
    • 0003780986
    • L. Page, S. Brin, R. Motwani, and T. Winograd. The pagerank citation ranking: Bringing order to the web. Technical report, Stanford InfoLab, November 1999.
  • 12
    • 33744584654
    • J. R. Quinlan. Induction of decision trees. Mach. Learn., 1(1):81-106, March 1986.


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.