Volume 14, Issue 2, 2008, Pages 217-232

A systematic study on parameter correlations in large-scale duplicate document detection

Author keywords

Clustering; Duplicate document detection; Sampling; Shingling

Indexed keywords

SAMPLING; SCALABILITY

EID: 38649124934     PISSN: 02191377     EISSN: 02193116     Source Type: Journal    
DOI: 10.1007/s10115-007-0071-9     Document Type: Conference Paper
Times cited: 10

References (26)
  • 17
    • 28044431810
    • Web data extraction based on structural similarity
    • Li Z, Ng WK, Sun A (2005) Web data extraction based on structural similarity. Knowl Inf Syst 8(4):438-461
    • (2005) Knowl Inf Syst , vol.8 , Issue.4 , pp. 438-461
    • Li, Z.1    Ng, W.K.2    Sun, A.3
  • 18
    • 38649093187
    • Discovering and analyzing World Wide Web collections
    • Mukherjea S (2004) Discovering and analyzing World Wide Web collections. Knowl Inf Syst 6(2):230-241
    • (2004) Knowl Inf Syst , vol.6 , Issue.2 , pp. 230-241
    • Mukherjea, S.1
  • 19
    • 0003676885
    • Technical Report tr-15-81, Center for Research in Computing Technology, Harvard University
    • Rabin M (1981) Fingerprinting by random polynomials, Technical report tr-15-81, Center for Research in Computing Technology, Harvard University
    • (1981) Fingerprinting By Random Polynomials
    • Rabin, M.1
  • 21
    • 0344892160
    • Do TREC Web collections look like the Web?
    • Soboroff I (2002) Do TREC Web collections look like the Web? SIGIR Forum 36(2):23-31
    • (2002) SIGIR Forum , vol.36 , Issue.2 , pp. 23-31
    • Soboroff, I.1
  • 26
    • 24944501423
    • Generative model-based document clustering: A comparative study
    • Zhong S, Ghosh J (2005) Generative model-based document clustering: A comparative study. Knowl Inf Syst 8(3):374-384
    • (2005) Knowl Inf Syst , vol.8 , Issue.3 , pp. 374-384
    • Zhong, S.1    Ghosh, J.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.