



Volume 4209 LNCS, 2006, Pages 110-121

Compact features for detection of near-duplicates in distributed retrieval

Author keywords

[No Author keywords available]

Indexed keywords

BANDWIDTH; COMPUTATIONAL METHODS; DISTRIBUTED COMPUTER SYSTEMS; INDEXING (OF INFORMATION); INFORMATION RETRIEVAL; QUERY LANGUAGES

EID: 33750314850     PISSN: 03029743     EISSN: 16113349     Source Type: Book Series    
DOI: 10.1007/11880561_10     Document Type: Conference Paper
Times cited: 14

References (31)
  • 1
    • 10044242279
    • Challenges in information retrieval and language modeling: Report of a workshop held at the center for intelligent information retrieval
    • University of Massachusetts Amherst, September 2002
    • J. Allan et al. Challenges in information retrieval and language modeling: report of a workshop held at the center for intelligent information retrieval, University of Massachusetts Amherst, September 2002. SIGIR Forum, 37(1):31-47, 2003.
    • (2003) SIGIR Forum , vol.37 , Issue.1 , pp. 31-47
    • Allan, J.1
  • 3
    • 33745784229
    • Redundant documents and search effectiveness
    • Bremen, Germany
    • Y. Bernstein and J. Zobel. Redundant documents and search effectiveness. In Proc. ACM CIKM Conf., pages 736-743, Bremen, Germany, 2005.
    • (2005) Proc. ACM CIKM Conf. , pp. 736-743
    • Bernstein, Y.1    Zobel, J.2
  • 8
    • 0029193309
    • Searching distributed collections with inference networks
    • Seattle, Washington
    • J. Callan, Z. Lu, and W. B. Croft. Searching distributed collections with inference networks. In Proc. Int. ACM-SIGIR Conf., pages 21-28, Seattle, Washington, 1995.
    • (1995) Proc. Int. ACM-SIGIR Conf. , pp. 21-28
    • Callan, J.1    Lu, Z.2    Croft, W.B.3
  • 10
    • 12244271239
    • Online duplicate document detection: Signature reliability in a dynamic retrieval environment
    • New Orleans, Louisiana
    • J. G. Conrad, X. S. Guo, and C. P. Schriber. Online duplicate document detection: Signature reliability in a dynamic retrieval environment. In Proc. ACM-CIKM Conf., pages 443-452, New Orleans, Louisiana, 2003.
    • (2003) Proc. ACM-CIKM Conf. , pp. 443-452
    • Conrad, J.G.1    Guo, X.S.2    Schriber, C.P.3
  • 11
    • 0037481029
    • Detecting similar documents using salient terms
    • McLean, Virginia
    • J. W. Cooper, A. R. Coden, and E. W. Brown. Detecting similar documents using salient terms. In Proc. ACM-CIKM Conf., pages 245-251, McLean, Virginia, 2002.
    • (2002) Proc. ACM-CIKM Conf. , pp. 245-251
    • Cooper, J.W.1    Coden, A.R.2    Brown, E.W.3
  • 13
    • 33947178503
    • ProFusion: Intelligent fusion from multiple, distributed search engines
    • S. Gauch, G. Wang, and M. Gomez. ProFusion: Intelligent fusion from multiple, distributed search engines. J. Universal Computer Science, 2(9):637-649, 1996.
    • (1996) J. Universal Computer Science , vol.2 , Issue.9 , pp. 637-649
    • Gauch, S.1    Wang, G.2    Gomez, M.3
  • 15
    • 0027727494
    • Overview of the first TREC conference
    • Pittsburgh, Pennsylvania
    • D. Harman. Overview of the first TREC conference. In Proc. ACM-SIGIR Conf., pages 36-47, Pittsburgh, Pennsylvania, 1993.
    • (1993) Proc. ACM-SIGIR Conf. , pp. 36-47
    • Harman, D.1
  • 16
    • 77953053895
    • Improving text collection selection with coverage and overlap statistics
    • Chiba, Japan
    • T. Hernandez and S. Kambhampati. Improving text collection selection with coverage and overlap statistics. In Proc. Int. Conf. on World Wide Web, pages 1128-1129, Chiba, Japan, 2005.
    • (2005) Proc. Int. Conf. on World Wide Web , pp. 1128-1129
    • Hernandez, T.1    Kambhampati, S.2
  • 18
    • 12244302670
    • An efficient method to detect duplicates of web documents with the use of inverted index
    • Honolulu, Hawaii
    • S. Ilyinski, M. Kuzmin, A. Melkov, and I. Segalovich. An efficient method to detect duplicates of web documents with the use of inverted index. In Proc. Int. Conf. on World Wide Web, Honolulu, Hawaii, 2002.
    • (2002) Proc. Int. Conf. on World Wide Web
    • Ilyinski, S.1    Kuzmin, M.2    Melkov, A.3    Segalovich, I.4
  • 21
    • 85043988965
    • Finding similar files in a large file system
    • San Francisco, CA, 17-21
    • U. Manber. Finding similar files in a large file system. In Proc. USENIX Winter Technical Conf., pages 1-10, San Francisco, CA, 17-21 1994.
    • (1994) Proc. USENIX Winter Technical Conf. , pp. 1-10
    • Manber, U.1
  • 22
    • 0038544393
    • Building efficient and effective metasearch engines
    • W. Meng, C. Yu, and K. Liu. Building efficient and effective metasearch engines. ACM Computing Surveys, 34(1):48-89, 2002.
    • (2002) ACM Computing Surveys , vol.34 , Issue.1 , pp. 48-89
    • Meng, W.1    Yu, C.2    Liu, K.3
  • 23
    • 1542317683
    • Evaluating different methods of estimating retrieval quality for resource selection
    • Toronto, Canada
    • H. Nottelmann and N. Fuhr. Evaluating different methods of estimating retrieval quality for resource selection. In Proc. Int. ACM-SIGIR Conf., pages 290-297, Toronto, Canada, 2003.
    • (2003) Proc. Int. ACM-SIGIR Conf. , pp. 290-297
    • Nottelmann, H.1    Fuhr, N.2
  • 24
    • 2442500342
    • Comparing the performance of collection selection algorithms
    • A. L. Powell and J. French. Comparing the performance of collection selection algorithms. ACM Transactions on Information Systems, 21(4):412-456, 2003.
    • (2003) ACM Transactions on Information Systems , vol.21 , Issue.4 , pp. 412-456
    • Powell, A.L.1    French, J.2
  • 25
    • 33750297403
    • Detecting duplicate and near-duplicate files
    • United States Patent 6,658,423
    • W. Pugh and M. H. Henzinger. Detecting duplicate and near-duplicate files (United States Patent 6,658,423), 2003.
    • (2003)
    • Pugh, W.1    Henzinger, M.H.2
  • 26
    • 33748731480
    • The MetaCrawler architecture for resource aggregation on the Web
    • E. Selberg and O. Etzioni. The MetaCrawler architecture for resource aggregation on the Web. IEEE Expert, (January-February): 11-14, 1997.
    • (1997) IEEE Expert , Issue.JANUARY-FEBRUARY , pp. 11-14
    • Selberg, E.1    Etzioni, O.2
  • 27
    • 18744392825
    • Unified utility maximization framework for resource selection
    • Washington, D.C.
    • L. Si and J. Callan. Unified utility maximization framework for resource selection. In Proc. ACM-CIKM Conf., pages 32-41, Washington, D.C., 2004.
    • (2004) Proc. ACM-CIKM Conf. , pp. 32-41
    • Si, L.1    Callan, J.2
  • 28
    • 1542347745
    • Relevant document distribution estimation method for resource selection
    • Toronto, Canada
    • L. Si and J. Callan. Relevant document distribution estimation method for resource selection. In Proc. ACM-SIGIR Conf., pages 298-305, Toronto, Canada, 2003.
    • (2003) Proc. ACM-SIGIR Conf. , pp. 298-305
    • Si, L.1    Callan, J.2
  • 30
    • 0033294891
    • Grouper: A dynamic clustering interface to web search results
    • Toronto, Canada
    • O. Zamir and O. Etzioni. Grouper: a dynamic clustering interface to web search results. In Proc. Int. Conf. on World Wide Web, pages 1361-1374, Toronto, Canada, 1999.
    • (1999) Proc. Int. Conf. on World Wide Web , pp. 1361-1374
    • Zamir, O.1    Etzioni, O.2
  • 31
    • 33745661252
    • The case of the duplicate documents: Measurement, search, and science
    • Harbin, China
    • J. Zobel and Y. Bernstein. The case of the duplicate documents: Measurement, search, and science. In Proc. Asia-Pacific Web Conf., pages 26-39, Harbin, China, 2006.
    • (2006) Proc. Asia-Pacific Web Conf. , pp. 26-39
    • Zobel, J.1    Bernstein, Y.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.