



2009, Pages 679-684

Duplicate detection in documents and webpages using improved longest common subsequence and documents syntactical structures

Author keywords

Component: part of speech; Duplication filtering; Longest common subsequence; Syntactical structure

Indexed keywords

DUPLICATE DETECTION; LONGEST COMMON SUBSEQUENCES; PART OF SPEECH; SEARCH ENGINE RESULTS; SYNTACTICAL STRUCTURES; WEB PAGE

EID: 77749301855     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1109/ICCIT.2009.235     Document Type: Conference Paper
Times cited : (23)

References (38)
  • 3
    • 77749264172
    • Combined Syntactical Structures and Sequence Alignment Approach to Document Similarity Calculation for Copy Detection
    • M.Sc. thesis, Department of Computer Science, College of Science, Sultan Qaboos University, Muscat, Oman
    • A. Al-Tobi, "Combined Syntactical Structures and Sequence Alignment Approach to Document Similarity Calculation for Copy Detection", M.Sc. thesis, Department of Computer Science, College of Science, Sultan Qaboos University, Muscat, Oman, 2008.
    • (2008)
    • Al-Tobi, A.1
  • 5
    • 84892758238
    • CHECK: A document plagiarism detection system
    • San Jose, California, USA, February 28, March 1
    • A. Si, H.V. Leong and R.W.H. Lau, "CHECK: A document plagiarism detection system", in Proceedings of ACM Symposium for Applied Computing, ACM (San Jose, California, USA, February 28 - March 1 1997), 1997, pp. 70-77.
    • (1997) Proceedings of ACM Symposium for Applied Computing, ACM , pp. 70-77
    • Si, A.1    Leong, H.V.2    Lau, R.W.H.3
  • 7
    • 77749237100
    • REUTERS, Reuters Corpus (Volume 1: English Language, 1996-08-20 to 1997-08-19), Released date: November 2000, NIST, 2000.
  • 10
    • 62949125921
    • Use of Text Syntactical Structures in Detection of Document Duplicates
    • University of East London, London, UK, November 13-16
    • M. Elhadi and A. Al-Tobi, "Use of Text Syntactical Structures in Detection of Document Duplicates", in Third IEEE International Conference on Digital Information Management (University of East London, London, UK, November 13-16 2008), 2008.
    • (2008) Third IEEE International Conference on Digital Information Management
    • Elhadi, M.1    Al-Tobi, A.2
  • 11
    • 0344756842
    • Modern Information Retrieval: A Brief Overview
    • March
    • A. Singhal, "Modern Information Retrieval: A Brief Overview", IEEE Data Engin. Bulletin, Vol. 24, No. 4, pp. 35-43, March 2001.
    • (2001) IEEE Data Engin. Bulletin , vol.24 , Issue.4 , pp. 35-43
    • Singhal, A.1
  • 13
    • 77749273600
    • class notes for CSE 591: Computational Molecular Biology, Dept. of Computer Science & Engineering, Arizona State University, Spring
    • "Local Alignment: Smith-Waterman algorithm", class notes for CSE 591: Computational Molecular Biology, Dept. of Computer Science & Engineering, Arizona State University, Spring 2003.
    • (2003) Local Alignment: Smith-Waterman algorithm
  • 14
    • 0010649742
    • Second Edition, Oxford University Press Inc, New York, USA
    • A. M. Lesk, Introduction to Bioinformatics, Second Edition, Oxford University Press Inc., New York, USA, 2005.
    • (2005) Introduction to Bioinformatics
    • Lesk, A.M.1
  • 16
    • 77749273607
    • Institut für Maschinelle Sprachverarbeitung, Universität Stuttgart, Germany
    • Lexicon and Textcorpora Group
    • Lexicon and Textcorpora Group, Institut für Maschinelle Sprachverarbeitung, Universität Stuttgart, Germany, "TreeTagger - a language independent part-of-speech tagger", 2003, http://www.ims.uni-stuttgart.de/projekte/corplex/TreeTagger.
    • (2003) TreeTagger - a language independent part-of-speech tagger
  • 17
    • 76649139963
    • A survey of machine learning approaches to analysis of large corpora
    • SProLaC, Lancaster University, UK, March 28, 31
    • X.R. Hu and E. Atwell, "A survey of machine learning approaches to analysis of large corpora", in Proceedings of the Workshop on Shallow Processing of Large Corpora (SProLaC), (Lancaster University, UK, March 28 - 31 2003), 2003, pp. 45-52.
    • (2003) Proceedings of the Workshop on Shallow Processing of Large Corpora , pp. 45-52
    • Hu, X.R.1    Atwell, E.2
  • 18
    • 0000329809
    • General Methods of Sequence Comparison
    • M.S. Waterman, "General Methods of Sequence Comparison", Bulletin of Math. Biology, Vol. 46, No. 4, pp. 473-500, 1984.
    • (1984) Bulletin of Math. Biology , vol.46 , Issue.4 , pp. 473-500
    • Waterman, M.S.1
  • 19
    • 77749237095
    • Parts of Speech
    • ELC Courses
    • ELC Courses, English Language Centre, University of Victoria, "Parts of Speech", 1997, http://web2.uvcs.uvic.ca/elc/StudyZone/330/grammar/parts.htm.
    • (1997)
  • 28
    • 76649099372
    • University of Ottawa, Accessed: 25th Sep 2008
    • H. MacFadyen, University of Ottawa, "The Parts of Speech", 2007, http://www.arts.uottawa.ca/writcent/hypergrammar/partsp.html, [Accessed: 25th Sep 2008].
    • (2007) The Parts of Speech
    • MacFadyen, H.1
  • 29
    • 77749264173
    • Viterbi algorithm
    • Wikipedia®, Wikimedia Foundation, Inc
    • Wikipedia®, Wikimedia Foundation, Inc., "Viterbi algorithm", 8th Sept 2008, http://en.wikipedia.org/wiki/Viterbi-algorithm.
    • (2008) 8th Sept
  • 31
    • 77749237087
    • C. J. van RIJSBERGEN, INFORMATION RETRIEVAL, Department of Computing Science, University of Glasgow, London: Butterworths, 1979.
  • 35
    • 77749237092
    • diff
    • Wikipedia®, Wikimedia Foundation, Inc
    • Wikipedia® , Wikimedia Foundation, Inc., "diff", 25th Sep 2007, http://en.wikipedia.org/wiki/Diff.
    • (2007) 25th Sep
  • 36
    • 77749237084
    • Algorithmist, GNU Free Documentation License, "Longest Common Subsequence", 23rd Oct 2006, http://www.algorithmist.com/index.php/Longest-Common-Subsequence.
  • 38
    • 77749237088
    • Part-of-speech tagging
    • Wikipedia®, Wikimedia Foundation, Inc
    • Wikipedia®, Wikimedia Foundation, Inc., "Part-of-speech tagging", 8th Sept 2008, http://en.wikipedia.org/wiki/Part-of-speech-tagging.
    • (2008) 8th Sept


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.