Volume 158, Issue 1-4, 2004, Pages 89-115

Effect of term distributions on centroid-based text categorization

Author keywords

Centroid based classifier; Term distribution; Text categorization

Indexed keywords

CLASSIFICATION (OF INFORMATION); COMPUTATIONAL METHODS; PROBABILITY DISTRIBUTIONS; REGRESSION ANALYSIS; TEXT PROCESSING;

EID: 0242692521     PISSN: 00200255     EISSN: None     Source Type: Journal    
DOI: 10.1016/j.ins.2003.07.007     Document Type: Article
Times cited: 67

References (35)
  • 1
    • 0033886806
    • Text classification from labeled and unlabeled documents using EM
    • Nigam K., McCallum A.K., Thrun S., Mitchell T.M. Text classification from labeled and unlabeled documents using EM. Machine Learning. 39(2/3):2000;103-134 http://www.cs.cmu.edu/knigam/papers/emcat-mlj99.ps.
    • (2000) Machine Learning , vol.39 , Issue.2-3 , pp. 103-134
    • Nigam, K.1    McCallum, A.K.2    Thrun, S.3    Mitchell, T.M.4
  • 2
    • 27144441097
    • An evaluation of statistical approaches to text categorization
    • Yang Y. An evaluation of statistical approaches to text categorization. Information Retrieval. 1(1/2):1999;69-90 http://www.cs.cmu.edu/yiming/papers.yy/irj99.ps.
    • (1999) Information Retrieval , vol.1 , Issue.1-2 , pp. 69-90
    • Yang, Y.1
  • 3
    • 0028461417
    • Automated learning of decision rules for text categorization
    • Apté C.D., Damerau F.J., Weiss S.M. Automated learning of decision rules for text categorization. ACM Transactions on Information Systems. 12(3):1994;233-251 http://www.acm.org/pubs/articles/journals/tois/1994-12-3/p233-apte/p233-apte.pdf.
    • (1994) ACM Transactions on Information Systems , vol.12 , Issue.3 , pp. 233-251
    • Apté, C.D.1    Damerau, F.J.2    Weiss, S.M.3
  • 4
    • 0028461554
    • An example-based mapping method for text categorization and retrieval
    • Yang Y., Chute C.G. An example-based mapping method for text categorization and retrieval. ACM Transactions on Information Systems. 12(3):1994;252-277 http://www.acm.org/pubs/articles/journals/tois/1994-12-3/p252-yang/p252-yang.pdf.
    • (1994) ACM Transactions on Information Systems , vol.12 , Issue.3 , pp. 252-277
    • Yang, Y.1    Chute, C.G.2
  • 6
    • 0032281304
    • Automatic essay grading using text categorization techniques
    • W.B. Croft, A. Moffat, C.J. van Rijsbergen, R. Wilkinson, & J. Zobel. New York, US, Melbourne, Australia: ACM Press
    • Larkey L.S. Automatic essay grading using text categorization techniques. Croft W.B., Moffat A., van Rijsbergen C.J., Wilkinson R., Zobel J. Proceedings of SIGIR-98, 21st ACM International Conference on Research and Development in Information Retrieval. 1998;90-95 ACM Press, New York, US, Melbourne, Australia, http://cobar.cs.umass.edu/pubfiles/ir-121.ps.
    • (1998) Proceedings of SIGIR-98, 21st ACM International Conference on Research and Development in Information Retrieval , pp. 90-95
    • Larkey, L.S.1
  • 7
    • 0012657799
    • Prototype and feature selection by sampling and random mutation hill climbing algorithms
    • Available from 〈citeseer.nj.nec.com/skalak94prototype.html〉
    • D.B. Skalak, Prototype and feature selection by sampling and random mutation hill climbing algorithms, in: International Conference on Machine Learning, 1994, pp. 293-301. Available from 〈citeseer.nj.nec.com/skalak94prototype.html〉.
    • (1994) International Conference on Machine Learning , pp. 293-301
    • Skalak, D.B.1
  • 8
    • 84949950595
    • Text categorization using weight-adjusted k-nearest neighbor classification
    • D. Cheung, Q. Li, & G. Williams. Heidelberg, Germany, Hong Kong, China: Springer Verlag. published in the "Lecture Notes in Computer Science" series, number 2035
    • Han E.-H., Karypis G., Kumar V. Text categorization using weight-adjusted k-nearest neighbor classification. Cheung D., Li Q., Williams G. Proceedings of PAKDD-01, 5th Pacific-Asia Conference on Knowledge Discovery and Data Mining. 2001;53-65 Springer Verlag, Heidelberg, Germany, Hong Kong, China. published in the "Lecture Notes in Computer Science" series, number 2035 http://link.springer.de/link/service/series/0558/papers/2035/20350053.pdf.
    • (2001) Proceedings of PAKDD-01, 5th Pacific-Asia Conference on Knowledge Discovery and Data Mining , pp. 53-65
    • Han, E.-H.1    Karypis, G.2    Kumar, V.3
  • 9
    • 34447620746
    • Improving text retrieval for the routing problem using latent semantic indexing
    • W.B. Croft, & C.J. van Rijsbergen. Heidelberg, Germany, Dublin, Ireland: Springer Verlag
    • Hull D.A. Improving text retrieval for the routing problem using latent semantic indexing. Croft W.B., van Rijsbergen C.J. Proceedings of SIGIR-94, 17th ACM International Conference on Research and Development in Information Retrieval. 1994;282-289 Springer Verlag, Heidelberg, Germany, Dublin, Ireland, http://www.acm.org/pubs/articles/proceedings/ir/188490/p282-hull/p282-hull.pdf.
    • (1994) Proceedings of SIGIR-94, 17th ACM International Conference on Research and Development in Information Retrieval , pp. 282-289
    • Hull, D.A.1
  • 11
    • 0002409860
    • A probabilistic analysis of the Rocchio algorithm with TFIDF for text categorization
    • D.H. Fisher. San Francisco, US, Nashville, US: Morgan Kaufmann Publishers, Available from 〈citeseer.nj.nec.com/joachims96probabilistic.html〉
    • Joachims T. A probabilistic analysis of the Rocchio algorithm with TFIDF for text categorization. Fisher D.H. Proceedings of ICML-97, 14th International Conference on Machine Learning. 1997;143-151 Morgan Kaufmann Publishers, San Francisco, US, Nashville, US, citeseer.nj.nec.com/joachims96probabilistic.html.
    • (1997) Proceedings of ICML-97, 14th International Conference on Machine Learning , pp. 143-151
    • Joachims, T.1
  • 12
    • 84947608587
    • A fast algorithm for hierarchical text classification
    • Y. Kambayashi, M. Mohania, & A. Tjoa. Heidelberg, Germany, London, UK: Springer Verlag. published in the "Lecture Notes in Computer Science" series, number 1874
    • Chuang W.T., Tiyyagura A., Yang J., Giuffrida G. A fast algorithm for hierarchical text classification. Kambayashi Y., Mohania M., Tjoa A. Proceedings of DaWaK-00, 2nd International Conference on Data Warehousing and Knowledge Discovery. 2000;409-418 Springer Verlag, Heidelberg, Germany, London, UK. published in the "Lecture Notes in Computer Science" series, number 1874 http://www.cs.iastate.edu/yang/Papers/dawak00.ps.
    • (2000) Proceedings of DaWaK-00, 2nd International Conference on Data Warehousing and Knowledge Discovery , pp. 409-418
    • Chuang, W.T.1    Tiyyagura, A.2    Yang, J.3    Giuffrida, G.4
  • 14
    • 84957069814
    • Text categorization with support vector machines: Learning with many relevant features
    • C. Nédellec, & C. Rouveirol. Heidelberg, Germany, Chemnitz, Germany: Springer Verlag, Available from 〈citeseer.nj.nec.com/joachims98text.html〉
    • Joachims T. Text categorization with support vector machines: learning with many relevant features. Nédellec C., Rouveirol C. Proceedings of ECML-98, 10th European Conference on Machine Learning, no. 1398. 1998;137-142 Springer Verlag, Heidelberg, Germany, Chemnitz, Germany, citeseer.nj.nec.com/joachims98text.html.
    • (1998) Proceedings of ECML-98, 10th European Conference on Machine Learning , vol.1398 , pp. 137-142
    • Joachims, T.1
  • 15
    • 0030651099
    • Feature selection, perceptron learning, and a usability case study for text categorization
    • N.J. Belkin, A.D. Narasimhalu, & P. Willett. New York, US, Philadelphia, US: ACM Press
    • Ng H.T., Goh W.B., Low K.L. Feature selection, perceptron learning, and a usability case study for text categorization. Belkin N.J., Narasimhalu A.D., Willett P. Proceedings of SIGIR-97, 20th ACM International Conference on Research and Development in Information Retrieval. 1997;67-73 ACM Press, New York, US, Philadelphia, US, http://www.acm.org/pubs/articles/proceedings/ir/258525/p67-ng/p67-ng.pdf.
    • (1997) Proceedings of SIGIR-97, 20th ACM International Conference on Research and Development in Information Retrieval , pp. 67-73
    • Ng, H.T.1    Goh, W.B.2    Low, K.L.3
  • 16
    • 84962671851
    • Centroid-based document classification: Analysis and experimental results
    • Available from 〈citeseer.nj.nec.com/han00centroidbased.html〉
    • E.-H. Han, G. Karypis, Centroid-based document classification: analysis and experimental results, in: Principles of Data Mining and Knowledge Discovery, 2000, pp. 424-431. Available from 〈citeseer.nj.nec.com/han00centroidbased.html〉.
    • (2000) Principles of Data Mining and Knowledge Discovery , pp. 424-431
    • Han, E.-H.1    Karypis, G.2
  • 19
    • 0032258599
    • Boosting and Rocchio applied to text filtering
    • W.B. Croft, A. Moffat, C.J. van Rijsbergen, R. Wilkinson, & J. Zobel. New York, US, Melbourne, Australia: ACM Press
    • Schapire R.E., Singer Y., Singhal A. Boosting and Rocchio applied to text filtering. Croft W.B., Moffat A., van Rijsbergen C.J., Wilkinson R., Zobel J. Proceedings of SIGIR-98, 21st ACM International Conference on Research and Development in Information Retrieval. 1998;215-223 ACM Press, New York, US, Melbourne, Australia, http://www.research.att.com/schapire/cgi-bin/uncompress-papers/SchapireSiSi98.ps.
    • (1998) Proceedings of SIGIR-98, 21st ACM International Conference on Research and Development in Information Retrieval , pp. 215-223
    • Schapire, R.E.1    Singer, Y.2    Singhal, A.3
  • 20
    • 0002428766
    • Learning to classify English text with ILP methods
    • L. De Raedt. Amsterdam, The Netherlands: IOS Press
    • Cohen W.W. Learning to classify English text with ILP methods. De Raedt L. Advances in Inductive Logic Programming. 1995;124-143 IOS Press, Amsterdam, The Netherlands, http://www.research.whizbang.com/wcohen/postscript/ilp.ps.
    • (1995) Advances in Inductive Logic Programming , pp. 124-143
    • Cohen, W.W.1
  • 22
    • 45549117987
    • Term-weighting approaches in automatic text retrieval
    • Salton G., Buckley C. Term-weighting approaches in automatic text retrieval. Information Processing and Management. 24(5):1988;513-523.
    • (1988) Information Processing and Management , vol.24 , Issue.5 , pp. 513-523
    • Salton, G.1    Buckley, C.2
  • 23
    • 0003881588
    • Length normalization in degraded text collections
    • Available from 〈citeseer.nj.nec.com/singhal95length.html〉
    • A. Singhal, G. Salton, C. Buckley, Length normalization in degraded text collections, Tech. Rep. TR95-1507, 1995. Available from 〈citeseer.nj.nec.com/singhal95length.html〉.
    • (1995) Tech. Rep. , vol.TR95-1507
    • Singhal, A.1    Salton, G.2    Buckley, C.3
  • 24
    • 0030402534
    • Pivoted document length normalization
    • Available from 〈citeseer.nj.nec.com/singhal96pivoted.html〉
    • A. Singhal, C. Buckley, M. Mitra, Pivoted document length normalization, in: Research and Development in Information Retrieval, 1996, pp. 21-29. Available from 〈citeseer.nj.nec.com/singhal96pivoted.html〉.
    • (1996) Research and Development in Information Retrieval , pp. 21-29
    • Singhal, A.1    Buckley, C.2    Mitra, M.3
  • 26
    • 0033699174
    • Support vector machines based on a semantic kernel for text categorization
    • S.-I. Amari, C.L. Giles, M. Gori, & V. Piuri. Los Alamitos, US, Como, Italy: IEEE Computer Society Press
    • Siolas G., d'Alche Buc F. Support vector machines based on a semantic kernel for text categorization. Amari S.-I., Giles C.L., Gori M., Piuri V. Proceedings of IJCNN-00, 11th International Joint Conference on Neural Networks. vol. 5:2000;205-209 IEEE Computer Society Press, Los Alamitos, US, Como, Italy, http://dlib.computer.org/conferen/ijcnn/0619/pdf/06193581.pdf.
    • (2000) Proceedings of IJCNN-00, 11th International Joint Conference on Neural Networks , vol.5 , pp. 205-209
    • Siolas, G.1    D'Alche Buc, F.2
  • 27
    • 84947929366
    • A comparative study on statistical machine learning algorithms and thresholding strategies for automatic text categorization
    • M. Ishizuka, & A. Sattar. Heidelberg, Germany, Tokyo, Japan: Springer Verlag. published in the "Lecture Notes in Computer Science" series, number 2417
    • Lee K.H., Kay J., Kang B.H., Rosebrock U. A comparative study on statistical machine learning algorithms and thresholding strategies for automatic text categorization. Ishizuka M., Sattar A. Proceedings of PRICAI-02, 7th Pacific Rim International Conference on Artificial Intelligence. 2002;444-453 Springer Verlag, Heidelberg, Germany, Tokyo, Japan. published in the "Lecture Notes in Computer Science" series, number 2417 http://link.springer.de/link/service/series/0558/papers/2417/24170444.pdf.
    • (2002) Proceedings of PRICAI-02, 7th Pacific Rim International Conference on Artificial Intelligence , pp. 444-453
    • Lee, K.H.1    Kay, J.2    Kang, B.H.3    Rosebrock, U.4
  • 28
    • 0002332781
    • Improving text classification by shrinkage in a hierarchy of classes
    • J.W. Shavlik. San Francisco, US, Madison, US: Morgan Kaufmann Publishers
    • McCallum A.K., Rosenfeld R., Mitchell T.M., Ng A.Y. Improving text classification by shrinkage in a hierarchy of classes. Shavlik J.W. Proceedings of ICML-98, 15th International Conference on Machine Learning. 1998;359-367 Morgan Kaufmann Publishers, San Francisco, US, Madison, US, http://www.cs.cmu.edu/mccallum/papers/hier-icml98.ps.gz.
    • (1998) Proceedings of ICML-98, 15th International Conference on Machine Learning , pp. 359-367
    • McCallum, A.K.1    Rosenfeld, R.2    Mitchell, T.M.3    Ng, A.Y.4
  • 30
    • 0002442796
    • Machine learning in automated text categorization
    • Sebastiani F. Machine learning in automated text categorization. ACM Computing Surveys. 34(1):2002;1-47 http://faure.iei.pi.cnr.it/fabrizio/Publications/ACMCS02.pdf.
    • (2002) ACM Computing Surveys , vol.34 , Issue.1 , pp. 1-47
    • Sebastiani, F.1
  • 34
    • 0024868803
    • Models for retrieval with probabilistic indexing
    • Fuhr N. Models for retrieval with probabilistic indexing. Information Processing and Management. 25(1):1989;55-72.
    • (1989) Information Processing and Management , vol.25 , Issue.1 , pp. 55-72
    • Fuhr, N.1
  • 35
    • 33646122183
    • Supervised term weighting for automated text categorization
    • ACM Press, New York, US, Melbourne, US, forthcoming
    • F. Debole, F. Sebastiani, Supervised term weighting for automated text categorization, in: Proceedings of SAC-03, 18th ACM Symposium on Applied Computing, ACM Press, New York, US, Melbourne, US, forthcoming. Available from 〈http://faure.iei.pi.cnr.it/fabrizio/Publications/SAC03b.pdf〉.
    • Proceedings of SAC-03, 18th ACM Symposium on Applied Computing
    • Debole, F.1    Sebastiani, F.2


* This information was extracted by KISTI from its analysis of Elsevier's SCOPUS database.