Volume 62, Issue 3, 2006, Pages 350-371

Automated subject classification of textual web documents

Author keywords

Automation; Classification; Controlled languages; Document management; Internet

EID: 33646369180     PISSN: 00220418     EISSN: None     Source Type: Journal    
DOI: 10.1108/00220410610666501     Document Type: Article
Times cited: 40

References (118)
  • 1
    • 63349100334
    • 20 Newsgroups DataSet (accessed 22 December 2004)
    • 20 Newsgroups DataSet (1998), The 4 Universities Data Set, available at: www-2.cs.cmu.edu/afs/cs.cmu.edu/project/theo-20/www/data/news20.html (accessed 22 December 2004).
    • (1998) The 4 Universities Data Set
  • 2
    • 33646340641
    • Dewey Services, available at: www.oclc.org/dewey/about/research/ (accessed 8 August 2005)
    • DDC (2005), "About DDC: research: a vital part of ongoing development", Dewey Services, available at: www.oclc.org/dewey/about/research/ (accessed 8 August 2005).
    • (2005) About DDC: Research: A Vital Part of Ongoing Development
  • 3
    • 33646351243
    • Improving resource discovery and retrieval on the internet: The Nordic WAIS/world wide web project summary report
    • Ardö, A. et al., (1994), "Improving resource discovery and retrieval on the internet: the Nordic WAIS/world wide web project summary report", NORDINFO Nytt, Vol. 17 No. 4, pp. 13-28.
    • (1994) NORDINFO Nytt , vol.17 , Issue.4 , pp. 13-28
    • Ardö, A.1
  • 5
    • 77951430107
    • Distributional word clusters vs words for text categorization
    • Bekkerman, R. et al., (2003), "Distributional word clusters vs words for text categorization", Journal of Machine Learning Research, Vol. 3, pp. 1183-208.
    • (2003) Journal of Machine Learning Research , vol.3 , pp. 1183-208
    • Bekkerman, R.1
  • 6
    • 33646367618
    • HLTCentral, available at: www.hltcentral.org/projects/print.php?acronym=BINDEX (accessed 22 December 2004)
    • BINDEX (2001), "HLT Project Factsheet: BINDEX", HLTCentral, available at: www.hltcentral.org/projects/print.php?acronym=BINDEX (accessed 22 December 2004).
    • (2001) HLT Project Factsheet: BINDEX
  • 9
    • 33646357863
    • CERES thesaurus effort
    • available at: http://ceres.ca.gov/thesaurus/ (accessed 22 December 2004)
    • CERES (2003), "CERES thesaurus effort", CERES The California Environmental Resources Evaluation System, available at: http://ceres.ca.gov/thesaurus/ (accessed 22 December 2004).
    • (2003) CERES The California Environmental Resources Evaluation System
  • 11
    • 0000776545
    • Scalable feature selection, classification and signature generation for organizing large text databases into hierarchical topic taxonomies
    • Chakrabarti, S., Dom, B. and Indyk, P. (1998b), "Scalable feature selection, classification and signature generation for organizing large text databases into hierarchical topic taxonomies", Journal of Very Large Data Bases, Vol. 7 No. 3, pp. 163-78.
    • (1998) Journal of Very Large Data Bases , vol.7 , Issue.3 , pp. 163-78
    • Chakrabarti, S.1    Dom, B.2    Indyk, P.3
  • 15
    • 33646351005
    • Vivísimo, available at: www.clusty.com (accessed 22 December 2004)
    • Clusty (2004), "Clusty the clustering engine", Vivísimo, available at: www.clusty.com (accessed 22 December 2004).
    • (2004) Clusty the Clustering Engine
  • 18
    • 33646361534
    • Lunds Universitets Bibliotek, available at: www.lub.lu.se/desire (accessed 22 December 2004)
    • DESIRE Project (1999), Lunds Universitets Bibliotek, available at: www.lub.lu.se/desire (accessed 22 December 2004).
    • (1999)
    • DESIRE Project
  • 21
    • 33646381839
    • Report on the workshop on operational text classification systems (OTC-02)
    • Dumais, S.T., Lewis, D.D. and Sebastiani, F. (2002), "Report on the workshop on operational text classification systems (OTC-02)", ACM SIGIR Forum, Vol. 35 No. 2, pp. 8-11.
    • (2002) ACM SIGIR Forum , vol.35 , Issue.2 , pp. 8-11
    • Dumais, S.T.1    Lewis, D.D.2    Sebastiani, F.3
  • 22
    • 33645980702
    • EELS, Engineering E-Library, Sweden, available at: http://eels.lub.lu.se/ae/ (accessed 22 December 2004)
    • EELS (2003), "'All' Engineering resources on the internet: a companion service to EELS", EELS, Engineering E-Library, Sweden, available at: http://eels.lub.lu.se/ae/ (accessed 22 December 2004).
    • (2003) 'All' Engineering Resources on the Internet: A Companion Service to EELS
  • 23
    • 33646344924
    • Lund University Libraries, available at: http://engine-e.lub.lu.se/ (accessed 22 December)
    • Engine-e (2004), Lund University Libraries, available at: http://engine-e.lub.lu.se/ (accessed 22 December).
    • (2004)
  • 24
    • 33646366363
    • Lund University Libraries, available at: http://eels.lub.lu.se/ (accessed 22 December 2004)
    • Engineering Electronic Library (2003), Lund University Libraries, available at: http://eels.lub.lu.se/ (accessed 22 December 2004).
    • (2003)
  • 25
    • 33646350475
    • OCLC projects, available at: www.oclc.org/research/projects/fastac/ (accessed 7 August 2005)
    • FAST (2003), "FAST as a knowledge base for automated classification", OCLC projects, available at: www.oclc.org/research/projects/fastac/ (accessed 7 August 2005).
    • (2003) FAST as a Knowledge Base for Automated Classification
  • 26
    • 33646384111
    • OCLC projects, available at: www.oclc.org/research/projects/fast/ (accessed 22 December 2004)
    • FAST (2004), "FAST: faceted application of subject terminology", OCLC projects, available at: www.oclc.org/research/projects/fast/ (accessed 22 December 2004).
    • (2004) FAST: Faceted Application of Subject Terminology
  • 27
    • 0004140078
    • University of Washington, available at: http://citeseer.nj.nec.com/fasulo99analysi.html (accessed 22 December 2004)
    • Fasulo, D. (1999), "An analysis of recent work on clustering algorithms: technical report", University of Washington, available at: http://citeseer.nj.nec.com/fasulo99analysi.html (accessed 22 December 2004).
    • (1999) An Analysis of Recent Work on Clustering Algorithms: Technical Report
    • Fasulo, D.1
  • 31
    • 0036895475
    • Hyperlink ensembles: A case study in hypertext classification
    • Fürnkranz, J. (2002), "Hyperlink ensembles: a case study in hypertext classification", Information Fusion, Vol. 3 No. 4, pp. 299-312.
    • (2002) Information Fusion , vol.3 , Issue.4 , pp. 299-312
    • Fürnkranz, J.1
  • 32
    • 33645019702
    • A system for automatic classification of scientific literature
    • (Reprinted in: Essays of an Information Scientist, Vol. 2, pp. 356-65)
    • Garfield, E., Malin, M.V. and Small, H. (1975), "A system for automatic classification of scientific literature", Journal of the Indian Institute of Science, Vol. 57 No. 2, pp. 61-74, (Reprinted in: Essays of an Information Scientist, Vol. 2, pp. 356-65).
    • (1975) Journal of the Indian Institute of Science , vol.57 , Issue.2 , pp. 61-74
    • Garfield, E.1    Malin, M.V.2    Small, H.3
  • 33
    • 33646362995
    • GERHARD, available at: www.gerhard.de/ (accessed 22 December 2004)
    • GERHARD (1998), "GERHARD: German harvest automated retrieval and directory", GERHARD, available at: www.gerhard.de/ (accessed 22 December 2004).
    • (1998) GERHARD: German Harvest Automated Retrieval and Directory
  • 34
    • 33646340116
    • GERHARD, available at: www.gerhard.de/info/dokumente/vortraege/ecdl99/html/index.htm (accessed 22 December 2004)
    • GERHARD (1999), "GERHARD - navigating the web with the universal decimal classification system", GERHARD, available at: www.gerhard.de/info/dokumente/vortraege/ecdl99/html/index.htm (accessed 22 December 2004).
    • (1999) GERHARD - Navigating the Web with the Universal Decimal Classification System
  • 38
    • 33646358183
    • OCLC Digital Archive, available at: http://digitalarchive.oclc.org/da/ViewObject.jsp?fileid=0000003487:000000090408&reqid=33836 (accessed 22 December 2004)
    • Godby, J. and Reighart, R. (1998), "The WordSmith indexing system", OCLC Digital Archive, available at: http://digitalarchive.oclc.org/da/ViewObject.jsp?fileid=0000003487:000000090408&reqid=33836 (accessed 22 December 2004).
    • (1998) The WordSmith Indexing System
    • Godby, J.1    Reighart, R.2
  • 42
    • 33646381311
    • Introduction
    • Arabie, P. Hubert, L. De Soete, G. World Scientific Singapore
    • Hartigan, J.A. (1996), "Introduction", in Arabie, P., Hubert, L. and De Soete, G. (Eds), Clustering and Classification, World Scientific, Singapore.
    • (1996) Clustering and Classification
    • Hartigan, J.A.1
  • 47
    • 84860961515
    • accessed 22 December 2004
    • INitiative for the Evaluation of XML Retrieval (2004), DELOS Network of Excellence for Digital Libraries, available at: http://inex.is.informatik.uni-duisburg.de/ (accessed 22 December 2004).
    • (2004) DELOS Network of Excellence for Digital Libraries
  • 49
    • 0000645505
    • Automatic classification of web resources using Java and Dewey decimal classification
    • Jenkins, C. et al., (1998), "Automatic classification of web resources using Java and Dewey decimal classification", Computer Networks & ISDN Systems, Vol. 30, pp. 646-8.
    • (1998) Computer Networks & ISDN Systems , vol.30 , pp. 646-8
    • Jenkins, C.1
  • 52
    • 33646345312
    • DESIRE II D3.6a, Overview of Results, available at: www.lub.lu.se/desire/DESIRE36a-overview.html (accessed 22 December 2004)
    • Koch, T. and Ardö, A. (2000), "Automatic classification", DESIRE II D3.6a, Overview of Results, available at: www.lub.lu.se/desire/DESIRE36a-overview.html (accessed 22 December 2004).
    • (2000) Automatic Classification
    • Koch, T.1    Ardö, A.2
  • 58
    • 84989528918
    • Experiments in automatic library of congress classification
    • Larson, R.R. (1992), "Experiments in automatic library of congress classification", Journal of the American Society for Information Science, Vol. 43 No. 2, pp. 130-48.
    • (1992) Journal of the American Society for Information Science , vol.43 , Issue.2 , pp. 130-48
    • Larson, R.R.1
  • 59
    • 26944485923
    • Classification of text documents
    • Li, Y.H. and Jain, A.K. (1998), "Classification of text documents", The Computer Journal, Vol. 41 No. 8, pp. 537-46.
    • (1998) The Computer Journal , vol.41 , Issue.8 , pp. 537-46
    • Li, Y.H.1    Jain, A.K.2
  • 61
    • 33646353892
    • Experiences of harvesting web resources in engineering using automatic classification
    • available at: www.ariadne.ac.uk/issue37/lindholm/
    • Lindholm, J., Schönthal, T. and Jansson, K. (2003), "Experiences of harvesting web resources in engineering using automatic classification", Ariadne, No. 37, available at: www.ariadne.ac.uk/issue37/lindholm/.
    • (2003) Ariadne , Issue.37
    • Lindholm, J.1    Schönthal, T.2    Jansson, K.3
  • 63
    • 0002332781
    • Improving text classification by shrinkage in a hierarchy of classes
    • paper presented at
    • McCallum, A. et al. (1998), "Improving text classification by shrinkage in a hierarchy of classes", paper presented at ICML-98, 15th International Conference on Machine Learning, pp. 359-67.
    • (1998) ICML-98, 15th International Conference on Machine Learning , pp. 359-67
    • McCallum, A.1
  • 64
    • 0038548015
    • Building domain-specific search engines with machine learning techniques
    • paper presented at
    • McCallum, A. et al. (1999), "Building domain-specific search engines with machine learning techniques", paper presented at AAAI-99 Spring Symposium on Intelligent Agents in Cyberspace.
    • (1999) AAAI-99 Spring Symposium on Intelligent Agents in Cyberspace
    • McCallum, A.1
  • 65
    • 0000806922
    • Automating the construction of internet portals with machine learning
    • McCallum, A. et al., (2000), "Automating the construction of internet portals with machine learning", Information Retrieval Journal, Vol. 3, pp. 127-63.
    • (2000) Information Retrieval Journal , vol.3 , pp. 127-63
    • McCallum, A.1
  • 69
    • 33646369112
    • available at: http://metacrawler.com (accessed 5 August 2005)
    • MetaCrawler Web Search (2005), available at: http://metacrawler.com (accessed 5 August 2005).
    • (2005)
  • 72
    • 0037375142
    • Feature selection on hierarchy of web documents
    • Mladenic, D. and Grobelnik, M. (2003), "Feature selection on hierarchy of web documents", Decision Support Systems, Vol. 35 No. 1, pp. 45-87.
    • (2003) Decision Support Systems , vol.35 , Issue.1 , pp. 45-87
    • Mladenic, D.1    Grobelnik, M.2
  • 74
    • 33646378111
    • Lund University Libraries accessed 22 December 2004
    • Nordic WAIS/World Wide Web Project (1995), Lund University Libraries, available at: www.lub.lu.se/W4/ (accessed 22 December 2004).
    • (1995)
  • 75
    • 33646365312
    • Bilingual indexing for information retrieval with AUTINDEX
    • Nübel, R. et al. (2002), "Bilingual indexing for information retrieval with AUTINDEX", LREC Proceedings, Las Palmas.
    • (2002) LREC Proceedings, Las Palmas
    • Nübel, R.1
  • 79
    • 0000699588
    • A spatial user interface to the astronomical literature
    • 2 May
    • Poincot, P., Lesteven, P.S. and Murtagh, F. (1998), "A spatial user interface to the astronomical literature", Astronomy & Astrophysics, 2 May, pp. 183-91.
    • (1998) Astronomy & Astrophysics , pp. 183-91
    • Poincot, P.1    Lesteven, P.S.2    Murtagh, F.3
  • 81
    • 33646352675
    • Clustering algorithms
    • Frakes, W.B. Baeza-Yates, R. Prentice-Hall Englewood Cliffs, NJ
    • Rasmussen, E. (1992), "Clustering algorithms", in Frakes, W.B. and Baeza-Yates, R. (Eds), Information Retrieval: Data Structures and Algorithms, Prentice-Hall, Englewood Cliffs, NJ.
    • (1992) Information Retrieval: Data Structures and Algorithms
    • Rasmussen, E.1
  • 83
    • 33646351004
    • Reuters-21578, available at: www.daviddlewis.com/resources/testcollections/reuters21578/ (accessed 3 August 2005)
    • Reuters-21578 (2004), available at: www.daviddlewis.com/resources/testcollections/reuters21578/ (accessed 3 August 2005).
    • (2004)
  • 87
    • 0000417994
    • Developments in automatic text retrieval
    • Salton, G. (1991), "Developments in automatic text retrieval", Science, Vol. 253, pp. 974-9.
    • (1991) Science , vol.253 , pp. 974-9
    • Salton, G.1
  • 90
    • 33749351398
    • Automatic text representation, classification and labeling in European law
    • Schweighofer, E., Rauber, A. and Dittenbach, M. (2001), "Automatic text representation, classification and labeling in European law", ICAIL 2001, pp. 78-87.
    • (2001) ICAIL 2001 , pp. 78-87
    • Schweighofer, E.1    Rauber, A.2    Dittenbach, M.3
  • 91
    • 33646376172
    • OCLC software, available at: www.oclc.org/research/software/scorpion/default.htm (accessed 22 December)
    • Scorpion (2004), OCLC software, available at: www.oclc.org/research/software/scorpion/default.htm (accessed 22 December).
    • (2004)
  • 92
    • 0002442796
    • Machine learning in automated text categorization
    • Sebastiani, F. (2002), "Machine learning in automated text categorization", ACM Computing Surveys, Vol. 34 No. 1, pp. 1-47.
    • (2002) ACM Computing Surveys , vol.34 , Issue.1 , pp. 1-47
    • Sebastiani, F.1
  • 95
    • 2942755807
    • Reengineering thesauri for new applications: The AGROVOC example
    • Article No. 257, available at: http://jodi.ecs.soton.ac.uk/Articles/v04/i04/Soergel/
    • Soergel, D. et al., (2004), "Reengineering thesauri for new applications: the AGROVOC example", Journal of Digital Information, Vol. 4 No. 4, Article No. 257, available at: http://jodi.ecs.soton.ac.uk/Articles/v04/i04/Soergel/.
    • (2004) Journal of Digital Information , vol.4 , Issue.4
    • Soergel, D.1
  • 98
    • 33746857882
    • OCLC Publications, available at: http://digitalarchive.oclc.org/da/ViewObject.jsp?objid=0000003409 (accessed 22 December 2004)
    • Subramanian, S. and Shafer, K.E. (1998), "Clustering", OCLC Publications, available at: http://digitalarchive.oclc.org/da/ViewObject.jsp?objid=0000003409 (accessed 22 December 2004).
    • (1998) Clustering
    • Subramanian, S.1    Shafer, K.E.2
  • 101
    • 33646368097
    • available at: http://search.thunderstone.com/texis/websearch (accessed 4 August 2005)
    • Thunderstone (2005), Thunderstone's Web Site Catalog, available at: http://search.thunderstone.com/texis/websearch (accessed 4 August 2005).
    • (2005) Thunderstone's Web Site Catalog
  • 103
    • 2442471873
    • Innovative solutions in automatic classification: A brief summary
    • Toth, E. (2002), "Innovative solutions in automatic classification: a brief summary", Libri, Vol. 25 No. 1, pp. 48-53.
    • (2002) Libri , vol.25 , Issue.1 , pp. 48-53
    • Toth, E.1
  • 104
    • 33646341179
    • National Institute of Standards and Technology, available at: http://trec.nist.gov/ (accessed 22 December 2004)
    • TREC (2004), "TREC: Text REtrieval Conference", National Institute of Standards and Technology, available at: http://trec.nist.gov/ (accessed 22 December 2004).
    • (2004) TREC: Text REtrieval Conference
  • 105
    • 33646356174
    • Using library classification schemes for internet resources
    • available at: http://webdoc.sub.gwdg.de/ebook/aw/oclc/man/colloq/v-g.htm, (accessed 4 April 2006)
    • Vizine-Goetz, D. (1996), "Using library classification schemes for internet resources", OCLC Internet Cataloging Project Colloquium, available at: http://webdoc.sub.gwdg.de/ebook/aw/oclc/man/colloq/v-g.htm, (accessed 4 April 2006).
    • (1996) OCLC Internet Cataloging Project Colloquium
    • Vizine-Goetz, D.1
  • 107
    • 20444365424
    • University of Wolverhampton, Wolverhampton, available at: www.scit.wlv.ac.uk/wwlib/position.html (accessed 22 December 2004)
    • Wallis, J. and Burden, P. (1995), "Towards a classification-based approach to resource discovery on the web", University of Wolverhampton, Wolverhampton, available at: www.scit.wlv.ac.uk/wwlib/position.html (accessed 22 December 2004).
    • (1995) Towards a classification-based approach to resource discovery on the web
    • Wallis, J.1    Burden, P.2
  • 109
    • 33646351505
    • CMU World Wide Knowledge Base, available at: www-2.cs.cmu.edu/~webkb/ (accessed 22 December 2004)
    • WebKB (2001), CMU World Wide Knowledge Base, available at: www-2.cs.cmu.edu/~webkb/ (accessed 22 December 2004).
    • (2001)
  • 111
    • 3543147086
    • Recent trends in hierarchic document clustering: A critical review
    • Willett, P. (1988), "Recent trends in hierarchic document clustering: a critical review", Information Processing and Management, Vol. 24 No. 5, pp. 577-97.
    • (1988) Information Processing and Management , vol.24 , Issue.5 , pp. 577-97
    • Willett, P.1
  • 112
    • 33646369345
    • available at: http://dir.yahoo.com/ (accessed 8 August 2005)
    • Yahoo (2005), Yahoo Directory, available at: http://dir.yahoo.com/ (accessed 8 August 2005).
    • (2005) Yahoo Directory
  • 113
    • 27144441097
    • An evaluation of statistical approaches to text categorization
    • Yang, Y. (1999), "An evaluation of statistical approaches to text categorization", Journal of Information Retrieval, Vol. 1 Nos 1/2, pp. 67-88.
    • (1999) Journal of Information Retrieval , vol.1 , Issue.1-2 , pp. 67-88
    • Yang, Y.1
  • 114
    • 0037375043
    • Visualization of large category map for internet browsing
    • Yang, C., Chen, H. and Hong, K. (2003), "Visualization of large category map for internet browsing", Decision Support Systems (DSS), Vol. 35 No. 1, pp. 89-102.
    • (2003) Decision Support Systems (DSS) , vol.35 , Issue.1 , pp. 89-102
    • Yang, C.1    Chen, H.2    Hong, K.3
  • 116
    • 0032268443
    • Web document clustering: A feasibility demonstration
    • Zamir, O. and Etzioni, O. (1998), "Web document clustering: a feasibility demonstration", ACM SIGIR'98, Australia, pp. 46-54.
    • (1998) ACM SIGIR'98, Australia , pp. 46-54
    • Zamir, O.1    Etzioni, O.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.