Volume 179, Issue 13, 2009, Pages 2249-2262

Exploiting noun phrases and semantic relationships for text document clustering

Author keywords

Holonymy; Hypernymy; Hyponymy; Meronymy; Noun phrase; Ontology; Text document clustering; WordNet

Indexed keywords

HOLONYMY; HYPERNYMY; HYPONYMY; MERONYMY; NOUN PHRASE; TEXT DOCUMENT CLUSTERING; WORDNET;

EID: 64549105240     PISSN: 00200255     EISSN: None     Source Type: Journal    
DOI: 10.1016/j.ins.2009.02.019     Document Type: Article
Times cited: 83

References (71)
  • 1
    • 0009878181
    • Text categorisation: A survey
    • Technical Report, Norwegian Computing Center, June
    • K. Aas, A. Eikvil, Text categorisation: a survey, Technical Report, Norwegian Computing Center, June 1999.
    • (1999)
    • Aas, K.1    Eikvil, A.2
  • 2
    • 0030361955
    • Partial parsing via finite-state cascades
    • Abney S. Partial parsing via finite-state cascades. Nat. Lang. Eng. 2 4 (1996) 337-344
    • (1996) Nat. Lang. Eng. , vol.2 , Issue.4 , pp. 337-344
    • Abney, S.1
  • 3
    • 8844285933
    • Gene-ontology-based clustering of gene expression data
    • Adryan B., and Schuh R. Gene-ontology-based clustering of gene expression data. Bioinformatics 20 16 (2004) 2851-2852
    • (2004) Bioinformatics , vol.20 , Issue.16 , pp. 2851-2852
    • Adryan, B.1    Schuh, R.2
  • 5
    • 0029546874
    • Using linear algebra for intelligent information retrieval
    • Berry M.W., Dumais S.T., and O'Brien G.W. Using linear algebra for intelligent information retrieval. SIAM Rev. 37 4 (1995) 573-595
    • (1995) SIAM Rev. , vol.37 , Issue.4 , pp. 573-595
    • Berry, M.W.1    Dumais, S.T.2    O'Brien, G.W.3
  • 6
    • 70349120410
    • Accelerating fuzzy clustering
    • 10.1016/j.ins.2008.09.017
    • Borgelt C. Accelerating fuzzy clustering. Inform. Sci. (2008) 10.1016/j.ins.2008.09.017
    • (2008) Inform. Sci.
    • Borgelt, C.1
  • 10
    • 0004449398
    • Three models for the description of language
    • Chomsky N. Three models for the description of language. IEEE Trans. Inform. Theory 2 3 (1956)
    • (1956) IEEE Trans. Inform. Theory , vol.2 , Issue.3
    • Chomsky, N.1
  • 11
    • 34248138823
    • The BioPrompt-box: an ontology-based clustering tool for searching in biological databases
    • Mar.
    • Corsi C., Ferragina P., and Marangoni R. The BioPrompt-box: an ontology-based clustering tool for searching in biological databases. BMC Bioinform. S8 Suppl 1 (2007) 8 Mar.
    • (2007) BMC Bioinform. , vol.S8 , Issue.SUPPL. 1 , pp. 8
    • Corsi, C.1    Ferragina, P.2    Marangoni, R.3
  • 13
    • 2942634800
    • Using WordNet to complement training information in text categorization
    • Recent Advances in Natural Language Processing II: Selected Papers from RANLP'97, John Benjamins
    • De Buenaga M., Gómez J.M., and Díaz B. Using WordNet to complement training information in text categorization. Recent Advances in Natural Language Processing II: Selected Papers from RANLP'97. Current Issues in Linguistic Theory (CILT) vol. 189 (2000), John Benjamins 353-364
    • (2000) Current Issues in Linguistic Theory (CILT) , vol.189 , pp. 353-364
    • De Buenaga, M.1    Gómez, J.M.2    Díaz, B.3
  • 15
    • 64549137142
    • Using latent semantic indexing for information filtering
    • Foltz P.W. Using latent semantic indexing for information filtering. SIGOIS Bull. 11 (1990) 2-3
    • (1990) SIGOIS Bull. , vol.11 , pp. 2-3
    • Foltz, P.W.1
  • 17
    • 0033321440
    • Building hypertext links by computing semantic similarity
    • Green S.J. Building hypertext links by computing semantic similarity. IEEE Trans. Knowl. Data Eng. 11 5 (1999) 713-730
    • (1999) IEEE Trans. Knowl. Data Eng. , vol.11 , Issue.5 , pp. 713-730
    • Green, S.J.1
  • 18
    • 84947729204
    • Information extraction: techniques and challenges
    • Proceedings of the Summer School on Information Extraction (SCIE-97). Pazienza M.T. (Ed), Springer-Verlag
    • Grishman R. Information extraction: techniques and challenges. In: Pazienza M.T. (Ed). Proceedings of the Summer School on Information Extraction (SCIE-97). LNCS/LNAI (1997), Springer-Verlag
    • (1997) LNCS/LNAI
    • Grishman, R.1
  • 19
    • 85132115014
    • Document similarity using a phrase indexing graph model
    • Hammouda K.M., and Kamel M.S. Document similarity using a phrase indexing graph model. Knowl. Inform. Syst. 6 6 (2004) 710-727
    • (2004) Knowl. Inform. Syst. , vol.6 , Issue.6 , pp. 710-727
    • Hammouda, K.M.1    Kamel, M.S.2
  • 20
    • 0012992939
    • Lexical chains as representations of context for the detection and correction of malapropisms
    • Fellbaum C. (Ed), The MIT Press, Cambridge, MA
    • Hirst G., and St-Onge D. Lexical chains as representations of context for the detection and correction of malapropisms. In: Fellbaum C. (Ed). WordNet: An Electronic Lexical Database (1998), The MIT Press, Cambridge, MA
    • (1998) WordNet: An Electronic Lexical Database
    • Hirst, G.1    St-Onge, D.2
  • 25
    • 2142808782
    • Hybrid neural document clustering using guided self-organization and WordNet
    • Hung C., Wermter S., and Smith P. Hybrid neural document clustering using guided self-organization and WordNet. IEEE Intell. Syst. 19 2 (2004) 68-77
    • (2004) IEEE Intell. Syst. , vol.19 , Issue.2 , pp. 68-77
    • Hung, C.1    Wermter, S.2    Smith, P.3
  • 26
    • 34547854956
    • Neural network based document clustering using WordNet ontologies
    • Hung C., and Wermter S. Neural network based document clustering using WordNet ontologies. Int. J. Hybrid Intell. Syst. 1 3-4 (2004) 127-142
    • (2004) Int. J. Hybrid Intell. Syst. , vol.1 , Issue.3-4 , pp. 127-142
    • Hung, C.1    Wermter, S.2
  • 29
    • 13244298454
    • Exploiting concept clusters for content-based information retrieval
    • Kang B., Kim D., and Lee S. Exploiting concept clusters for content-based information retrieval. Inform. Sci. 170 2-4 (2005) 443-462
    • (2005) Inform. Sci. , vol.170 , Issue.2-4 , pp. 443-462
    • Kang, B.1    Kim, D.2    Lee, S.3
  • 32
    • 84898970714
    • Fast exact inference with a factored model for natural language parsing
    • Klein D., and Manning C.D. Fast exact inference with a factored model for natural language parsing. Adv. Neural Inform. Process. Syst. 15 (2003) 3-10
    • (2003) Adv. Neural Inform. Process. Syst. , vol.15 , pp. 3-10
    • Klein, D.1    Manning, C.D.2
  • 36
    • 52949106103
    • Clustering high dimensional data: a graph-based relaxed optimization approach
    • Lee C., Zaïane O.R., Park H., Huang J., and Greiner R. Clustering high dimensional data: a graph-based relaxed optimization approach. Inform. Sci. 178 23 (2008) 4501-4511
    • (2008) Inform. Sci. , vol.178 , Issue.23 , pp. 4501-4511
    • Lee, C.1    Zaïane, O.R.2    Park, H.3    Huang, J.4    Greiner, R.5
  • 37
    • 64549103037
    • D.D. Lewis, Reuters-21578 Text Categorization Test Collection, Distribution 1.0, 1997.
  • 38
    • 64549084615
    • D.D. Lewis, Readme File of Reuters-21578 Text Categorization Test Collection, Distribution 1.0, 1997.
  • 39
    • 84976702763
    • WordNet: a lexical database for English
    • Miller G.A. WordNet: a lexical database for English. Commun. ACM 38 11 (1995) 39-41
    • (1995) Commun. ACM , vol.38 , Issue.11 , pp. 39-41
    • Miller, G.A.1
  • 40
    • 3142711025
    • Word sense disambiguation of WordNet glosses
    • Moldovan D., and Novischi A. Word sense disambiguation of WordNet glosses. Computer Speech and Language 18 3 (2004) 301-317
    • (2004) Computer Speech and Language , vol.18 , Issue.3 , pp. 301-317
    • Moldovan, D.1    Novischi, A.2
  • 42
    • 0001893260
    • Lexical cohesion computed by thesaural relations as an indicator of the structure of text
    • Morris J., and Hirst G. Lexical cohesion computed by thesaural relations as an indicator of the structure of text. Comput. Linguist. 17 1 (1991) 21C-43C
    • (1991) Comput. Linguist. , vol.17 , Issue.1
    • Morris, J.1    Hirst, G.2
  • 43
    • 51349113267
    • GAKREM: a novel hybrid clustering algorithm
    • Nguyen C.D., and Cios K.J. GAKREM: a novel hybrid clustering algorithm. Inform. Sci. 178 22 (2008) 4205-4227
    • (2008) Inform. Sci. , vol.178 , Issue.22 , pp. 4205-4227
    • Nguyen, C.D.1    Cios, K.J.2
  • 44
    • 20844452619
    • A concept-driven algorithm for clustering search results
    • Osinski S., and Weiss D. A concept-driven algorithm for clustering search results. IEEE Intell. Syst. 20 3 (2005) 48-54
    • (2005) IEEE Intell. Syst. , vol.20 , Issue.3 , pp. 48-54
    • Osinski, S.1    Weiss, D.2
  • 46
    • 0000582788
    • An algorithm for suffix stripping
    • Readings in Information Retrieval. Sparck Jones K., and Willett P. (Eds), Morgan Kaufmann Publishers, San Francisco, CA
    • Porter M.F. An algorithm for suffix stripping. In: Sparck Jones K., and Willett P. (Eds). Readings in Information Retrieval. Morgan Kaufmann Multimedia Information and Systems Series (1997), Morgan Kaufmann Publishers, San Francisco, CA 313-316
    • (1997) Morgan Kaufmann Multimedia Information and Systems Series , pp. 313-316
    • Porter, M.F.1
  • 49
    • 0842283371
    • A reference ontology for biomedical informatics: the foundational model of anatomy
    • Rosse C., and Mejino J.J. A reference ontology for biomedical informatics: the foundational model of anatomy. J. Biomed. Inform. 36 (2003) 478-500
    • (2003) J. Biomed. Inform. , vol.36 , pp. 478-500
    • Rosse, C.1    Mejino, J.J.2
  • 51
    • 0016572913
    • A vector space model for automatic indexing
    • Salton G., Wong A., and Yang C.S. A vector space model for automatic indexing. Commun. ACM 18 11 (1975) 613-620
    • (1975) Commun. ACM , vol.18 , Issue.11 , pp. 613-620
    • Salton, G.1    Wong, A.2    Yang, C.S.3
  • 55
    • 3142656769
    • Unsupervised word sense disambiguation using WordNet relatives
    • Seo H., Chung H., Rim H., Myaeng S., and Kim S. Unsupervised word sense disambiguation using WordNet relatives. Computer Speech and Language 18 3 (2004) 253-273
    • (2004) Computer Speech and Language , vol.18 , Issue.3 , pp. 253-273
    • Seo, H.1    Chung, H.2    Rim, H.3    Myaeng, S.4    Kim, S.5
  • 56
    • 38149141821
    • M. Silberztein, An alternative approach to tagging, in: Proceedings of the 12th International Conference on Applications of Natural Language to Information Systems, NLDB 2007, pp. 1-11.
  • 62
    • 10644269818
    • Exploration of textual document archives using a fuzzy hierarchical clustering algorithm in the GAMBAL system
    • Torra V., Miyamoto S., and Lanau S. Exploration of textual document archives using a fuzzy hierarchical clustering algorithm in the GAMBAL system. Inform. Process. Manag. 41 3 (2005) 587-598
    • (2005) Inform. Process. Manag. , vol.41 , Issue.3 , pp. 587-598
    • Torra, V.1    Miyamoto, S.2    Lanau, S.3
  • 65
    • 0142046683
    • Integrating linguistic resources in TC through WSD
    • Urena-Lopez L.A., Buenaga M., and Gomez J.M. Integrating linguistic resources in TC through WSD. Comput. Humanities 35 2 (2001) 215-230
    • (2001) Comput. Humanities , vol.35 , Issue.2 , pp. 215-230
    • Urena-Lopez, L.A.1    Buenaga, M.2    Gomez, J.M.3
  • 67
    • 34250782737
    • A novel document similarity measure based on earth mover's distance
    • Wan X. A novel document similarity measure based on earth mover's distance. Inform. Sci. 177 18 (2007) 3718-3730
    • (2007) Inform. Sci. , vol.177 , Issue.18 , pp. 3718-3730
    • Wan, X.1
  • 69
    • 3543147086
    • Recent trends in hierarchical document clustering: a critical review
    • Willett P. Recent trends in hierarchical document clustering: a critical review. Inform. Process. Manag. 24 5 (1988) 577-597
    • (1988) Inform. Process. Manag. , vol.24 , Issue.5 , pp. 577-597
    • Willett, P.1
  • 70
    • 0032268443
    • O. Zamir, O. Etzioni, Web document clustering: a feasibility demonstration, in: Proceedings of the 21st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR'98), August 1998, pp. 46-54.


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.