



Volume 2, Issue 4, 2000, Pages 186-199

String techniques for detecting duplicates in document databases

Author keywords

Approximate string matching; Document analysis; Duplicate detection; Information retrieval; Optical character recognition

Indexed keywords

CHARACTER RECOGNITION; INFORMATION RETRIEVAL;

EID: 0042257703     PISSN: 14332833     EISSN: 14332825     Source Type: Journal    
DOI: 10.1007/s100320050005     Document Type: Article
Times cited: 8

References (34)
  • 2
    • 25844528603
    • A fast algorithm for finding the nearest neighbor of a word in a dictionary
    • H. Bunke. A fast algorithm for finding the nearest neighbor of a word in a dictionary. In: Proc. 2nd Int. Conf. on Doc. Anal. and Recognition, pp. 632-637, Tsukuba Science City, Japan, October 1993
    • (1993) Proc. 2nd Int. Conf. on Doc. Anal. and Recognition , pp. 632-637
    • Bunke, H.1
  • 4
    • 0028485378
    • An approach to designing very fast approximate string matching algorithms
    • M.-W. Du, S. C. Chang. An approach to designing very fast approximate string matching algorithms. IEEE Trans. on Knowl. and Data Eng. 6(4): 620-633 (1994)
    • (1994) IEEE Trans. on Knowl. and Data Eng , vol.6 , Issue.4 , pp. 620-633
    • Du, M.-W.1    Chang, S.C.2
  • 6
    • 0021760002
    • Fast optimal alignment
    • J. W. Fickett. Fast optimal alignment. Nucleic Acids Research 12(1): 175-179 (1984)
    • (1984) Nucleic Acids Research , vol.12 , Issue.1 , pp. 175-179
    • Fickett, J.W.1
  • 9
    • 84892743207
    • GulfLink
    • GulfLink. http://www.gulflink.osd.mil/
  • 11
    • 0042258316
    • Document image similarity and equivalence detection
    • J. J. Hull. Document image similarity and equivalence detection. Int. J. Doc. Anal. and Recognition 1(1): 37-42 (1998)
    • (1998) Int. J. Doc. Anal. and Recognition , vol.1 , Issue.1 , pp. 37-42
    • Hull, J.J.1
  • 13
    • 0006513656
    • Duplicate detection for symbolically compressed documents
    • D.-S. Lee, J. J. Hull. Duplicate detection for symbolically compressed documents. In: Proc. 5th Int. Conf. on Doc. Anal. and Recognition, pp. 305-308, Bangalore, India, September 1999
    • (1999) Proc. 5th Int. Conf. on Doc. Anal. and Recognition , pp. 305-308
    • Lee, D.-S.1    Hull, J.J.2
  • 14
    • 0001116877
    • Binary codes capable of correcting deletions, insertions, and reversals
    • V. I. Levenshtein. Binary codes capable of correcting deletions, insertions, and reversals. Cybernetics and Control Theory 10(8): 707-710 (1966)
    • (1966) Cybernetics and Control Theory , vol.10 , Issue.8 , pp. 707-710
    • Levenshtein, V.I.1
  • 16
    • 0041757621
    • Models and algorithms for duplicate document detection
    • D. P. Lopresti. Models and algorithms for duplicate document detection. In: Proc. 5th Int. Conf. on Doc. Anal. and Recognition, pp. 297-300, Bangalore, India, September 1999
    • (1999) Proc. 5th Int. Conf. on Doc. Anal. and Recognition , pp. 297-300
    • Lopresti, D.P.1
  • 17
    • 0043260683
    • String techniques for duplicate document detection
    • D. P. Lopresti. String techniques for duplicate document detection. In: Proc. Symp. on Doc. Image Understanding Technol., pp. 101-112, Annapolis, MD, April 1999
    • (1999) Proc. Symp. on Doc. Image Understanding Technol , pp. 101-112
    • Lopresti, D.P.1
  • 18
    • 0033908952
    • A comparison of text-based methods for detecting duplication in document image databases
    • D. P. Lopresti. A comparison of text-based methods for detecting duplication in document image databases. In: Proc. Doc. Recognition and Retrieval VII (IS&T/SPIE Electronic Imaging), 3967: 210-221, San Jose, CA, January 2000
    • (2000) Proc. Doc. Recognition and Retrieval VII (IS&T/SPIE Electronic Imaging) , vol.3967 , pp. 210-221
    • Lopresti, D.P.1
  • 19
    • 0031187745
    • Block edit models for approximate string matching
    • D. P. Lopresti, A. Tomkins. Block edit models for approximate string matching. Theoretical Computer Science (181): 159-179 (1997)
    • (1997) Theoretical Computer Science , Issue.181 , pp. 159-179
    • Lopresti, D.P.1    Tomkins, A.2
  • 20
    • 85043988965
    • Finding similar files in a large file system
    • U. Manber. Finding similar files in a large file system. In: Proc. USENIX, pp. 1-10, San Francisco, CA, January 1994
    • (1994) Proc. USENIX , pp. 1-10
    • Manber, U.1
  • 21
    • 0014757386
    • A general method applicable to the search for similarities in the amino-acid sequences of two proteins
    • S. B. Needleman, C. D. Wunsch. A general method applicable to the search for similarities in the amino-acid sequences of two proteins. J. Mol. Biol. 48:443-453 (1970)
    • (1970) J. Mol. Biol , vol.48 , pp. 443-453
    • Needleman, S.B.1    Wunsch, C.D.2
  • 22
    • 0042758967
    • Database partitioning and duplicate document detection based on optical correlation
    • F. Prokoski. Database partitioning and duplicate document detection based on optical correlation. In: Proc. Symp. on Doc. Image Understanding Technol., pp. 86-97, Annapolis, MD, April 1999
    • (1999) Proc. Symp. on Doc. Image Understanding Technol , pp. 86-97
    • Prokoski, F.1
  • 26
    • 49149141669
    • The theory and computation of evolutionary distances: Pattern recognition
    • P. H. Sellers. The theory and computation of evolutionary distances: pattern recognition. J. Algorithms 1: 359-373 (1980)
    • (1980) J. Algorithms , vol.1 , pp. 359-373
    • Sellers, P.H.1
  • 29
    • 0019887799
    • Identification of common molecular subsequences
    • T. F. Smith, M. S. Waterman. Identification of common molecular subsequences. J. Mol. Biol. 147: 195-197 (1981)
    • (1981) J. Mol. Biol , vol.147 , pp. 195-197
    • Smith, T.F.1    Waterman, M.S.2
  • 33
    • 84983986619
    • On approximate string matching
    • E. Ukkonen. On approximate string matching. In: Proc. Int. Conf. on Foundations of Comput. Theory, LNCS 158, pp. 487-493. Springer, Berlin Heidelberg New York, 1983
    • (1983) Proc. Int. Conf. on Foundations of Comput. Theory , pp. 487-493
    • Ukkonen, E.1
  • 34
    • 0015960104
    • The string-to-string correction problem
    • R. A. Wagner, M. J. Fischer. The string-to-string correction problem. J. ACM 21: 168-173 (1974)
    • (1974) J. ACM , vol.21 , pp. 168-173
    • Wagner, R.A.1    Fischer, M.J.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.