



Volume 8, Issue 12, 2002, Pages

Towards continuous web archiving: First results and an agenda for the future

Author keywords

[No Author keywords available]

Indexed keywords


EID: 1642345851     PISSN: 10829873     EISSN: None     Source Type: Journal    
DOI: 10.1045/december2002-masanes     Document Type: Article
Times cited: (11)

References (16)
  • 3
    • 0039665995
    • The Kulturarw3 project - The Royal Swedish Web Archiw3e: An example of 'complete' collection of web pages
    • Jerusalem, Israel, 13-18 August 2000
    • Arvidson, A., Persson, K. & Mannerheim, J. (2000). "The Kulturarw3 Project - the Royal Swedish Web Archiw3e: an example of 'complete' collection of web pages." 66th IFLA Council and General Conference, Jerusalem, Israel, 13-18 August 2000. Available at: .
    • (2000) 66th IFLA Council and General Conference
    • Arvidson, A.1    Persson, K.2    Mannerheim, J.3
  • 4
    • 3042515596
    • Editors' interview: The internet archive
    • 15 June 2002
    • Kahle, B. (2002). "Editors' Interview: The Internet Archive." RLG DigiNews, 6 (3), 15 June 2002. Available at: .
    • (2002) RLG DigiNews , vol.6 , Issue.3
    • Kahle, B.1
  • 5
    • 3042604860
    • See for example political sites archiving in The Netherlands, Available at: . Or the Digital Archive for Chinese Studies (DACHS), Available at: . We are trying to make an inventory of ongoing web archiving projects, so you are welcome to send information about the ones of which you are aware.
  • 6
    • 3042613736
    • Collecting and preserving the web: Developing and testing the NEDLIB harvester
    • 15 April 2001
    • Hakala, J. (2001). "Collecting and Preserving the Web: Developing and Testing the NEDLIB Harvester." RLG DigiNews, 5 (2), 15 April 2001. Available at: .
    • (2001) RLG DigiNews , vol.5 , Issue.2
    • Hakala, J.1
  • 7
    • 3042511970
    • The main discussion list on this topic is web-archive@cru.fr. Information available at: .
  • 9
    • 25944459687
    • Archiving the deep web
    • Rome, Italy, 19 September 2002
    • Masanès, J. (2002). "Archiving the deep Web." 2nd ECDL Workshop on Web Archiving, Rome, Italy, 19 September 2002. Available at: .
    • (2002) 2nd ECDL Workshop on Web Archiving
    • Masanès, J.1
  • 10
    • 84923200850
    • A first experience in archiving the French web
    • Research and advanced technology for digital libraries: 6th European conference, ECDL 2002, Agosti, M. & Thanos, C., eds., Rome, Italy, September 16-18, 2002. Berlin: Springer, 1-15
    • Abiteboul, S., Cobéna, G., Masanès, J. & Sedrati, G. (2002). "A first experience in archiving the French Web." In: Research and advanced technology for digital libraries: 6th European conference, ECDL 2002, Agosti, M. & Thanos, C., eds., Rome, Italy, September 16-18, 2002. Lecture Notes in Computer Science, 2458. Berlin: Springer, 1-15. Also available at: .
    • (2002) Lecture Notes in Computer Science , vol.2458
    • Abiteboul, S.1    Cobéna, G.2    Masanès, J.3    Sedrati, G.4
  • 11
    • 3042563719
    • To illustrate the possible gains, here are figures extracted from our 'Elections 2002' collection. This collection encompasses 2,200 sites or parts of sites related to the presidential and parliamentary elections held in France in 2002. On a sample of these sites, the 43 most frequently captured ones, we have for April 2,103,360 files over 6 captures, which represent 108 GB of data. Among these files only 45.7% are unique, and they represent 56.3% of the total amount of data. This means that more than half of the crawling capacity and 43.7% of the storage capacity are 'wasted' in this case. It is really beneficial to have a crawler able to manage site changes in this kind of 'continuous' crawl. The small crawler we have used, HTTRACK (see ), is able to do incremental crawls, and with a few scripts and a database it can be used to handle automatic crawls of hundreds of sites.
  • 12
    • 25944468063
    • "The BnF's project for web archiving." What's next for digital deposit libraries?
    • Darmstadt, Germany, 8 September 2001
    • Masanès, J. (2001). "The BnF's project for Web archiving." What's next for digital deposit libraries? ECDL Workshop, Darmstadt, Germany, 8 September 2001. Available at: .
    • (2001) ECDL Workshop
    • Masanès, J.1
  • 13
    • 0038589165
    • The anatomy of a large-scale hypertextual web search engine
    • Full version published in the proceedings of the 7th International World Wide Web Conference, Brisbane, Australia, 14-18 April 1998
    • Brin, S. & Page, L. (1998). "The Anatomy of a Large-scale Hypertextual Web Search Engine." Computer Networks and ISDN Systems, 30 (1-7), 107-117. Full version published in the proceedings of the 7th International World Wide Web Conference, Brisbane, Australia, 14-18 April 1998. Available at: .
    • (1998) Computer Networks and ISDN Systems , vol.30 , Issue.1-7 , pp. 107-117
    • Brin, S.1    Page, L.2
  • 14
    • 3042660537
    • Véronique Berton, Virginie Breton, Dominique Chrishmann, Christine Genin, Loïc Le Bail, Soraya Salah, Jean-Yves Sarazin and Julian Masanès.
  • 16
    • 3042560065
    • Thanks to Gregory Cobéna from INRIA for his help on this part.
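
The redundancy figures quoted in note 11 (the 'Elections 2002' sample) can be checked with a few lines of arithmetic. The sketch below is only an illustration: the input numbers come from the note, while the function and variable names are hypothetical, and the absolute counts it derives are rough estimates rather than figures reported by the authors.

```python
# Figures quoted in note 11: April sample, 43 sites, 6 captures.
TOTAL_FILES = 2_103_360      # files captured
TOTAL_GB = 108.0             # total data volume in GB
UNIQUE_FILES_PCT = 45.7      # share of files that are unique
UNIQUE_DATA_PCT = 56.3       # share of data held by the unique files


def redundant_pct(unique_pct: float) -> float:
    """Share (in percent) that is redundant, given the unique share."""
    return round(100.0 - unique_pct, 1)


def redundancy_summary(total_files: int, total_gb: float,
                       unique_files_pct: float, unique_data_pct: float) -> dict:
    """Derive redundant shares and rough absolute amounts (hypothetical helper)."""
    dup_files_pct = redundant_pct(unique_files_pct)
    dup_data_pct = redundant_pct(unique_data_pct)
    return {
        "duplicate_files_pct": dup_files_pct,
        "duplicate_data_pct": dup_data_pct,
        # Estimates only: the note reports percentages, not these counts.
        "approx_duplicate_files": round(total_files * dup_files_pct / 100),
        "approx_duplicate_gb": round(total_gb * dup_data_pct / 100, 1),
    }


summary = redundancy_summary(TOTAL_FILES, TOTAL_GB,
                             UNIQUE_FILES_PCT, UNIQUE_DATA_PCT)

if __name__ == "__main__":
    for key, value in summary.items():
        print(f"{key}: {value}")
```

The duplicate-data share agrees with the 43.7% of storage capacity that the note describes as 'wasted', and the duplicate-file share is consistent with its statement that more than half of the crawling capacity is spent on files already held.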


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.