Volume 31, Issue 7, 2006, Pages 595-609

Accurate discovery of co-derivative documents via duplicate text detection

Author keywords

Duplicate detection; Fingerprinting; Hashing

Indexed keywords

ALGORITHMS; DATA ACQUISITION; FEATURE EXTRACTION; INDEXING (OF INFORMATION); PATTERN RECOGNITION

EID: 33747171448     PISSN: 03064379     EISSN: None     Source Type: Journal    
DOI: 10.1016/j.is.2005.11.006     Document Type: Article
Times cited: 14

References (24)
  • 1
    • 0031346696
    • A.Z. Broder, On the resemblance and containment of documents, in: Compression and Complexity of Sequences (SEQUENCES'97), 1997, pp. 21-29.
  • 2
    • 33747183229
    • M. Sanderson, Duplicate detection in the Reuters collection, Technical Report TR-1997-5, University of Glasgow, 1997.
  • 3
    • 33747160317
    • N. Shivakumar, H. García-Molina, SCAM: a copy detection mechanism for digital documents, in: Proceedings of the Second Annual Conference on the Theory and Practice of Digital Libraries, 1995.
  • 5
    • 85043988965
    • U. Manber, Finding similar files in a large file system, in: Proceedings of the USENIX Winter 1994 Technical Conference, 1994, pp. 1-10.
  • 6
    • 85198675139
    • S. Brin, J. Davis, H. García-Molina, Copy detection mechanisms for digital documents, in: Proceedings of the ACM SIGMOD Annual Conference, 1995, pp. 398-409.
  • 7
    • 33747175257
    • N. Heintze, Scalable document fingerprinting, in: 1996 USENIX Workshop on Electronic Commerce, 1996.
  • 12
    • 33747202054
    • N. Shivakumar, H. García-Molina, Finding near-replicas of documents on the web, in: WEBDB, International Workshop on the World Wide Web and Databases, WebDB, Springer, Berlin, 1999.
  • 13
    • 0000523223
    • C.G. Nevill-Manning, I.H. Witten, Compression and explanation using hierarchical grammars, The Computer Journal 40 (2/3) (1997) 103-116.
  • 14
    • 19944392360
    • N.J. Larsson, A. Moffat, Offline dictionary-based compression, Proc. IEEE 88 (11) (2000) 1722-1732.
  • 18
  • 20
    • 33747150903
    • R. Rivest, The MD5 Message-Digest Algorithm, RFC 1321, 1992.
  • 21
    • 0000526256
    • D. Harman, Overview of the second text retrieval conference (TREC-2), Information Processing and Management 31 (3) (1995) 271-289.


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.