Volume , Issue , 2010, Pages 59-68

Evaluating methods to rediscover missing web pages from the web infrastructure

Author keywords

Digital preservation; Search engines; Web page discovery

Indexed keywords

AUTOMATED METHODS; BOOKMARKING; DIGITAL PRESERVATION; EVALUATING METHOD; INTERNET SEARCH ENGINE; RETRIEVAL PERFORMANCE; WEB INFRASTRUCTURE; WEB PAGE;

EID: 77955099622     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1145/1816123.1816133     Document Type: Conference Paper
Times cited : (13)

References (40)
  • 2
    • 77955119800
    • The size of the World Wide Web. http://www.worldwidewebsize.com/.
  • 3
    • 0345134365
    • Electronic document addressing: Dealing with change
    • H. Ashman. Electronic document addressing: Dealing with change. ACM Computing Surveys, 32(3):201-212, 2000.
    • (2000) ACM Computing Surveys , vol.32 , Issue.3 , pp. 201-212
    • Ashman, H.1
  • 5
    • 57349156145
    • Genealogical trees on the web: A search engine user perspective
    • R. Baeza-Yates, Álvaro Pereira, and N. Ziviani. Genealogical trees on the web: a search engine user perspective. In Proceeding of WWW '08, pages 367-376, 2008.
    • (2008) Proceeding of WWW '08 , pp. 367-376
    • Baeza-Yates, R.1    Pereira, A.2    Ziviani, N.3
  • 9
    • 0031623133
    • Referential integrity of links in open hypermedia systems
    • H. C. Davis. Referential integrity of links in open hypermedia systems. In Proceedings of HYPERTEXT '98, pages 207-216, 1998.
    • (1998) Proceedings of HYPERTEXT '98 , pp. 207-216
    • Davis, H.C.1
  • 10
    • 0033650817
    • Topical locality in the web
    • B. D. Davison. Topical locality in the web. In Proceedings of SIGIR '00, pages 272-279, 2000.
    • (2000) Proceedings of SIGIR '00 , pp. 272-279
    • Davison, B.D.1
  • 11
    • 0033293618
    • Finding related pages in the world wide web
    • J. Dean and M. R. Henzinger. Finding Related Pages in the World Wide Web. Computer Networks, 31(11-16):1467-1479, 1999.
    • (1999) Computer Networks , vol.31 , Issue.11-16 , pp. 1467-1479
    • Dean, J.1    Henzinger, M.R.2
  • 14
    • 84871039639
    • Inferring query performance using pre-retrieval predictors
    • B. He and I. Ounis. Inferring Query Performance Using Pre-retrieval Predictors. In Proceedings of SPIRE '04, pages 43-54, 2004.
    • (2004) Proceedings of SPIRE '04 , pp. 43-54
    • He, B.1    Ounis, I.2
  • 18
    • 0002488516
    • Preserving the internet
    • March
    • B. Kahle. Preserving the Internet. Scientific American, 276:82-83, March 1997.
    • (1997) Scientific American , vol.276 , pp. 82-83
    • Kahle, B.1
  • 19
    • 0344154400
    • Using the web to obtain frequencies for unseen bigrams
    • F. Keller and M. Lapata. Using the Web to Obtain Frequencies for Unseen Bigrams. Computational Linguistics, 29(3):459-484, 2003.
    • (2003) Computational Linguistics , vol.29 , Issue.3 , pp. 459-484
    • Keller, F.1    Lapata, M.2
  • 20
    • 77951106747
    • A comparison of techniques for estimating IDF values to generate lexical signatures for the web
    • M. Klein and M. L. Nelson. A Comparison of Techniques for Estimating IDF Values to Generate Lexical Signatures for the Web. In Proceedings of WIDM '08, pages 39-46, 2008.
    • (2008) Proceedings of WIDM '08 , pp. 39-46
    • Klein, M.1    Nelson, M.L.2
  • 21
    • 55249110614
    • Revisiting lexical signatures to (Re-)Discover web pages
    • M. Klein and M. L. Nelson. Revisiting Lexical Signatures to (Re-)Discover Web Pages. In Proceedings of ECDL '08, pages 371-382, 2008.
    • (2008) Proceedings of ECDL '08 , pp. 371-382
    • Klein, M.1    Nelson, M.L.2
  • 22
    • 70450230376
    • Inter-search engine lexical signature performance
    • M. Klein and M. L. Nelson. Inter-Search Engine Lexical Signature Performance. In Proceedings of JCDL '09, pages 413-414, 2009.
    • (2009) Proceedings of JCDL '09 , pp. 413-414
    • Klein, M.1    Nelson, M.L.2
  • 25
    • 42149169631
    • Evaluating personal archiving strategies for internet-based information
    • C. C. Marshall, F. McCown, and M. L. Nelson. Evaluating Personal Archiving Strategies for Internet-based Information. In Proceedings of IS&T Archiving '07, pages 48-52, 2007.
    • (2007) Proceedings of IS&T Archiving '07 , pp. 48-52
    • Marshall, C.C.1    McCown, F.2    Nelson, M.L.3
  • 27
    • 36049051024
    • Characterization of search engine caches
    • (Also available as arXiv:cs/0703083v2)
    • F. McCown and M. L. Nelson. Characterization of Search Engine Caches. In Proceedings of IS&T Archiving '07, pages 48-52, 2007. (Also available as arXiv:cs/0703083v2).
    • (2007) Proceedings of IS&T Archiving '07 , pp. 48-52
    • McCown, F.1    Nelson, M.L.2
  • 28
    • 34547317670
    • Lazy preservation: Reconstructing websites by crawling the crawlers
    • F. McCown, J. A. Smith, and M. L. Nelson. Lazy Preservation: Reconstructing Websites by Crawling the Crawlers. In Proceedings of WIDM '06, pages 67-74, 2006.
    • (2006) Proceedings of WIDM '06 , pp. 67-74
    • McCown, F.1    Smith, J.A.2    Nelson, M.L.3
  • 29
    • 34547233284
    • Using the web infrastructure to preserve web pages
    • M. L. Nelson, F. McCown, J. A. Smith, and M. Klein. Using the Web Infrastructure to Preserve Web Pages. IJDL, 6(4):327-349, 2007.
    • (2007) IJDL , vol.6 , Issue.4 , pp. 327-349
    • Nelson, M.L.1    McCown, F.2    Smith, J.A.3    Klein, M.4
  • 30
    • 9144269133
    • Analysis of lexical signatures for improving information persistence on the world wide web
    • S.-T. Park, D. M. Pennock, C. L. Giles, and R. Krovetz. Analysis of Lexical Signatures for Improving Information Persistence on the World Wide Web. ACM Transactions on Information Systems, 22(4):540-572, 2004.
    • (2004) ACM Transactions on Information Systems , vol.22 , Issue.4 , pp. 540-572
    • Park, S.-T.1    Pennock, D.M.2    Giles, C.L.3    Krovetz, R.4
  • 31
    • 0004013458
    • Robust hyperlinks cost just five words each
    • University of California at Berkeley, Berkeley, CA, USA
    • T. A. Phelps and R. Wilensky. Robust Hyperlinks Cost Just Five Words Each. Technical Report UCB//CSD-00-1091, University of California at Berkeley, Berkeley, CA, USA, 2000.
    • (2000) Technical Report UCB//CSD-00-1091
    • Phelps, T.A.1    Wilensky, R.2
  • 33
    • 0016572913
    • A vector space model for automatic indexing
    • G. Salton, A. Wong, and C. S. Yang. A Vector Space Model for Automatic Indexing. Communications of the ACM, 18(11):613-620, 1975.
    • (1975) Communications of the ACM , vol.18 , Issue.11 , pp. 613-620
    • Salton, G.1    Wong, A.2    Yang, C.S.3
  • 35
    • 1342315836
    • The decay and failures of web references
    • D. Spinellis. The decay and failures of web references. Communications of the ACM, 46(1):71-77, 2003.
    • (2003) Communications of the ACM , vol.46 , Issue.1 , pp. 71-77
    • Spinellis, D.1
  • 36
    • 1142293071
    • Refinement of TF-IDF schemes for web pages using their hyperlinked neighboring pages
    • K. Sugiyama, K. Hatano, M. Yoshikawa, and S. Uemura. Refinement of TF-IDF Schemes for Web Pages using their Hyperlinked Neighboring Pages. In Proceedings of HYPERTEXT '03, pages 198-207, 2003.
    • (2003) Proceedings of HYPERTEXT '03 , pp. 198-207
    • Sugiyama, K.1    Hatano, K.2    Yoshikawa, M.3    Uemura, S.4
  • 38
    • 33745644751
    • Wordrank-based lexical signatures for finding lost or related web pages
    • X. Wan and J. Yang. Wordrank-based Lexical Signatures for Finding Lost or Related Web Pages. In APWeb, pages 843-849, 2006.
    • (2006) APWeb , pp. 843-849
    • Wan, X.1    Yang, J.2
  • 39
    • 0034852836
    • Improving trigram language modeling with the world wide web
    • X. Zhu and R. Rosenfeld. Improving Trigram Language Modeling with the World Wide Web. In Proceedings of ICASSP '01, pages 533-536, 2001.
    • (2001) Proceedings of ICASSP '01 , pp. 533-536
    • Zhu, X.1    Rosenfeld, R.2
  • 40
    • 23844539573
    • What's there and what's not?: Focused crawling for missing documents in digital libraries
    • Z. Zhuang, R. Wagle, and C. L. Giles. What's There and What's Not?: Focused Crawling for Missing Documents in Digital Libraries. In Proceedings of JCDL '05, pages 301-310, 2005.
    • (2005) Proceedings of JCDL '05 , pp. 301-310
    • Zhuang, Z.1    Wagle, R.2    Giles, C.L.3


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.