Volume 2006, 2006, Pages 138-146

Extending the single words-based document model: A comparison of bigrams and 2-itemsets

Author keywords

Bigrams; Comparison; Document model; Feature selection; Itemsets; Machine learning; N grams; Text categorization

Indexed keywords

ALGORITHMS; CLASSIFICATION (OF INFORMATION); FEATURE EXTRACTION; LEARNING SYSTEMS; TEXT PROCESSING; WORD PROCESSING

EID: 34247393724     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1145/1166160.1166197     Document Type: Conference Paper
Times cited : (21)

References (26)
  • 2
    • 78149287585
    • M.L. Antonie and O.R. Zaiane. Text document categorization by term association. In Proc. of the IEEE 2002 International Conference on Data Mining, pages 19-26, Maebashi City, Japan, 2002.
  • 7
    • 34247360082
    • M. F. Caropreso, S. Matwin and F. Sebastiani. Statistical phrases in automated text categorization. Technical Report IEI-B4-07-2000, Istituto di Elaborazione dell'Informazione, Pisa, Italy, 2000.
  • 8
    • 0000259511
    • T.G. Dietterich. Approximate Statistical Test for Comparing Supervised Classification Learning Algorithms. Neural Computation, vol. 10, no. 7, pp. 1895-1923, 1998.
  • 9
    • 34247348730
    • S.T. Dumais, J. Platt, D. Heckerman and M. Sahami. Inductive learning algorithms and representations for text categorization. In Proceedings of ACM-CIKM98, pp. 148-155, 1998.
  • 10
    • 0006291183
    • J. Fürnkranz. A study using n-gram features for text categorization. Technical Report OEFAI-TR-9830, Austrian Institute for Artificial Intelligence, Vienna, Austria, 1998.
  • 11
    • 0012547676
    • A. McCallum and K. Nigam. A Comparison of Event Models for Naive Bayes Text Classification. In AAAI/ICML-98 Workshop on Learning for Text Categorization, Technical Report WS-98-05, AAAI Press, pp. 41-48, 1998.
  • 12
    • 34247327481
    • D. Meretakis and B. Wüthrich. Extending Naive Bayes classifiers using long itemsets. In Proc. 5th ACM-SIGKDD Int. Conf. Knowledge Discovery and Data Mining (KDD'99), San Diego, USA, pp. 165-174, 1999.
  • 14
    • 34247360510
    • M. Mitra, C. Buckley, A. Singhal and C. Cardie. An analysis of statistical and syntactic phrases. In Proceedings of RIAO-97, 5th International Conference "Recherche d'Information Assistee par Ordinateur", pages 200-214, Montreal, CA, 1997.
  • 15
    • 34247393769
    • D. Mladenic and M. Grobelnik. Word sequences as features in text-learning. In Proc. 17th Electrotechnical and Computer Science Conference (ERK98), Slovenia, 1998.
  • 16
    • 34247352127
    • D. Mladenic and M. Grobelnik. Feature Selection for Unbalanced Class Distribution and Naive Bayes. In Proceedings of the 16th International Conference on Machine Learning, Morgan Kaufmann, pp. 258-267, 1999.
  • 17
    • 34247392939
    • V. Pekar, M. Krkoska and S. Staab. Feature Weighting for Co-occurrence-based Classification of Words. In Proceedings of the 20th Conference on Computational Linguistics, COLING-2004, August 2004.
  • 19
    • 34247348731
    • K.M. Schneider. A New Feature Selection Score for Multinomial Naive Bayes Text Classification Based on KL-Divergence. 42nd Meeting of the Association for Computational Linguistics (ACL 2004), pp. 186-189, 2004.
  • 20
    • 0036643010
    • Ch.M. Tan, Y.F. Wang and Ch.D. Lee. The use of bigrams to enhance text categorization. Information Processing and Management: an International Journal, vol. 38, no. 4, pp. 529-546, July 2002.
  • 24
    • 0032268443
    • O. Zamir and O. Etzioni. Web document clustering: A feasibility demonstration. In Proceedings of the 21st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '98, Melbourne, Australia), W. B. Croft, A. Moffat, C. J. van Rijsbergen, R. Wilkinson, and J. Zobel, Chairs. ACM Press, New York, NY, pp. 46-54, 1998.
  • 26
    • 0035788918
    • Z. Zheng, R. Kohavi and L. Mason. Real world performance of association rule algorithms. In Int. Conf. on Knowledge Discovery and Data Mining (SIGKDD), August 2001.


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.