Volume 16, Issue 10, 2004, Pages 1279-1296

Efficient phrase-based document indexing for web document clustering

Author keywords

Document clustering; Document index graph; Document similarity; Document structure; Phrase matching; Phrase based indexing; Web mining

Indexed keywords

ALGORITHMS; CHARACTER RECOGNITION; DATA MINING; DATABASE SYSTEMS; HTML; INFORMATION RETRIEVAL SYSTEMS; MATHEMATICAL MODELS; NATURAL LANGUAGE PROCESSING SYSTEMS; SEARCH ENGINES; TREES (MATHEMATICS); WORLD WIDE WEB

EID: 13844267502     PISSN: 10414347     EISSN: None     Source Type: Journal    
DOI: 10.1109/TKDE.2004.58     Document Type: Article
Times cited: 267

References (45)
  • 4
    • 0033294891
    • Grouper: A dynamic clustering interface to web search results
    • O. Zamir and O. Etzioni, "Grouper: A Dynamic Clustering Interface to Web Search Results," Computer Networks, vol. 31, nos. 11-16, pp. 1361-1374, 1999.
    • (1999) Computer Networks , vol.31 , Issue.11-16 , pp. 1361-1374
    • Zamir, O.1    Etzioni, O.2
  • 10
    • 84880663504
    • The cluster-abstraction model: Unsupervised learning of topic hierarchies from text data
    • T. Hofmann, "The Cluster-Abstraction Model: Unsupervised Learning of Topic Hierarchies from Text Data," Proc. 16th Int'l Joint Conf. Artificial Intelligence (IJCAI-99), pp. 682-687, 1999.
    • (1999) Proc. 16th Int'l Joint Conf. Artificial Intelligence (IJCAI-99) , pp. 682-687
    • Hofmann, T.1
  • 13
    • 9444257747
    • Learning for text categorization and information extraction with ILP
    • J. Cussens, ed.
    • M. Junker, M. Sintek, and M. Rinck, "Learning for Text Categorization and Information Extraction with ILP," Proc. First Workshop Learning Language in Logic, J. Cussens, ed., pp. 84-93, 1999.
    • (1999) Proc. First Workshop Learning Language in Logic , pp. 84-93
    • Junker, M.1    Sintek, M.2    Rinck, M.3
  • 15
    • 0032624184
    • Learning information extraction rules for semi-structured and free text
    • S. Soderland, "Learning Information Extraction Rules for Semi-Structured and Free Text," Machine Learning, vol. 34, nos. 1-3, pp. 233-272, 1999.
    • (1999) Machine Learning , vol.34 , Issue.1-3 , pp. 233-272
    • Soderland, S.1
  • 16
    • 0009878181
    • Text categorisation: A survey
    • Norwegian Computing Center, June
    • K. Aas and L. Eikvil, "Text Categorisation: A Survey," Technical Report 941, Norwegian Computing Center, June 1999.
    • (1999) Technical Report , vol.941
    • Aas, K.1    Eikvil, L.2
  • 17
    • 0016572913
    • A vector space model for automatic indexing
    • Nov.
    • G. Salton, A. Wong, and C. Yang, "A Vector Space Model for Automatic Indexing," Comm. ACM, vol. 18, no. 11, pp. 613-620, Nov. 1975.
    • (1975) Comm. ACM , vol.18 , Issue.11 , pp. 613-620
    • Salton, G.1    Wong, A.2    Yang, C.3
  • 22
    • 84948481845
    • An algorithm for suffix stripping
    • July
    • M.F. Porter, "An Algorithm for Suffix Stripping," Program, vol. 14, no. 3, pp. 130-137, July 1980.
    • (1980) Program , vol.14 , Issue.3 , pp. 130-137
    • Porter, M.F.1
  • 23
    • 0033227559
    • Reducing the space requirement of suffix trees
    • S. Kurtz, "Reducing the Space Requirement of Suffix Trees," Software - Practice and Experience, vol. 29, no. 13, pp. 1149-1171, 1999.
    • (1999) Software - Practice and Experience , vol.29 , Issue.13 , pp. 1149-1171
    • Kurtz, S.1
  • 24
    • 0002139526
    • The myriad virtues of subword trees
    • A. Apostolico and Z. Galil, eds., (NATO ASI Series)
    • A. Apostolico, "The Myriad Virtues of Subword Trees," Combinatorial Algorithms on Words, A. Apostolico and Z. Galil, eds., pp. 85-96, (NATO ASI Series), 1985.
    • (1985) Combinatorial Algorithms on Words , pp. 85-96
    • Apostolico, A.1
  • 25
    • 0027681165
    • Suffix arrays: A new method for on-line string searches
    • U. Manber and G. Myers, "Suffix Arrays: A New Method for On-Line String Searches," SIAM J. Computing, vol. 22, no. 5, pp. 935-948, 1993.
    • (1993) SIAM J. Computing , vol.22 , Issue.5 , pp. 935-948
    • Manber, U.1    Myers, G.2
  • 27
    • 4243730261
    • Statistical phrases in automated text categorization
    • Pisa, Italy
    • M.F. Caropreso, S. Matwin, and F. Sebastiani, "Statistical Phrases in Automated Text Categorization," Technical Report IEI-B4-07-2000, Pisa, Italy, 2000.
    • (2000) Technical Report , vol.IEI-B4-07-2000
    • Caropreso, M.F.1    Matwin, S.2    Sebastiani, F.3
  • 28
    • 13844286802
    • Investigating measures for pairwise document similarity
    • Dartmouth College, Computer Science, Hanover, N.H., June
    • J.D. Isaacs and J.A. Aslam, "Investigating Measures for Pairwise Document Similarity," Technical Report PCS-TR99-357, Dartmouth College, Computer Science, Hanover, N.H., June 1999.
    • (1999) Technical Report , vol.PCS-TR99-357
    • Isaacs, J.D.1    Aslam, J.A.2
  • 29
    • 0005180705
    • An information-theoretic definition of similarity
    • D. Lin, "An Information-Theoretic Definition of Similarity," Proc. 15th Int'l Conf. Machine Learning, pp. 296-304, 1998.
    • (1998) Proc. 15th Int'l Conf. Machine Learning , pp. 296-304
    • Lin, D.1
  • 40
    • 0017969553
    • A sentence-to-sentence clustering procedure for pattern analysis
    • S.Y. Lu and K.S. Fu, "A Sentence-to-Sentence Clustering Procedure for Pattern Analysis," IEEE Trans. Systems, Man, and Cybernetics, vol. 8, pp. 381-389, 1978.
    • (1978) IEEE Trans. Systems, Man, and Cybernetics , vol.8 , pp. 381-389
    • Lu, S.Y.1    Fu, K.S.2
  • 43
    • 22644451496
    • Principal direction divisive partitioning
    • D. Boley, "Principal Direction Divisive Partitioning," Data Mining and Knowledge Discovery, vol. 2, no. 4, pp. 325-344, 1998.
    • (1998) Data Mining and Knowledge Discovery , vol.2 , Issue.4 , pp. 325-344
    • Boley, D.1
  • 45
    • 0033315817
    • Document categorization and query generation on the world wide web using WebACE
    • D. Boley, M. Gini, R. Gross, S. Han, K. Hastings, G. Karypis, V. Kumar, B. Mobasher, and J. Moore, "Document Categorization and Query Generation on the World Wide Web Using WebACE," AI Rev., vol. 13, nos. 5-6, pp. 365-391, 1999.
    • (1999) AI Rev. , vol.13 , Issue.5-6 , pp. 365-391
    • Boley, D.1    Gini, M.2    Gross, R.3    Han, S.4    Hastings, K.5    Karypis, G.6    Kumar, V.7    Mobasher, B.8    Moore, J.9


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.