Volume 24, Issue 6, 2011, Pages 904-914

A new feature selection algorithm based on binomial hypothesis testing for spam filtering

Author keywords

Binomial distribution; Binomial hypothesis testing; Feature selection; Spam filtering; Text categorization

Indexed keywords

BINOMIAL DISTRIBUTION; BINOMIAL HYPOTHESIS TESTING; FEATURE SELECTION; SPAM FILTERING; TEXT CATEGORIZATION;

EID: 79957440082     PISSN: 09507051     EISSN: None     Source Type: Journal    
DOI: 10.1016/j.knosys.2011.04.006     Document Type: Article
Times cited: 58

References (40)
  • 3
    • 0031334221
    • Selection of relevant features and examples in machine learning
    • PII S0004370297000635
    • A.L. Blum, and P. Langley Selection of relevant features and examples in machine learning Artificial Intelligence 97 1997 245 271 (Pubitemid 127401106)
    • (1997) Artificial Intelligence , vol.97 , Issue.1-2 , pp. 245-271
    • Blum, A.L.1    Langley, P.2
  • 5
    • 58349094507
    • Feature selection for text classification with Naive Bayes
    • J. Chen, H. Huang, S. Tian, and Y. Qu Feature selection for text classification with Naive Bayes Expert Systems with Applications 36 2009 5432 5435
    • (2009) Expert Systems with Applications , vol.36 , pp. 5432-5435
    • Chen, J.1    Huang, H.2    Tian, S.3    Qu, Y.4
  • 6
    • 33747884473
    • A preprocess algorithm of filtering irrelevant information based on the minimum class difference
    • DOI 10.1016/j.knosys.2006.03.005, PII S0950705106000682
    • Z. Chen, and K. Lu A preprocess algorithm of filtering irrelevant information based on the minimum class difference Knowledge-Based Systems 19 2006 422 429 (Pubitemid 44293386)
    • (2006) Knowledge-Based Systems , vol.19 , Issue.6 , pp. 422-429
    • Chen, Z.1    Lu, K.2
  • 8
    • 29644438050
    • Statistical comparisons of classifiers over multiple data sets
    • J. Demšar Statistical comparisons of classifiers over multiple data sets Journal of Machine Learning Research 7 2006 1 30 (Pubitemid 43022939)
    • (2006) Journal of Machine Learning Research , vol.7 , pp. 1-30
    • Demsar, J.1
  • 10
    • 33646023117
    • An introduction to ROC analysis
    • T. Fawcett An introduction to ROC analysis Pattern Recognition Letters 27 2006 861 874
    • (2006) Pattern Recognition Letters , vol.27 , pp. 861-874
    • Fawcett, T.1
  • 11
    • 2942731012
    • An extensive empirical study of feature selection metrics for text classification
    • G. Forman An extensive empirical study of feature selection metrics for text classification Journal of Machine Learning Research 3 2003 1289 1305
    • (2003) Journal of Machine Learning Research , vol.3 , pp. 1289-1305
    • Forman, G.1
  • 12
    • 19744376557
    • Best terms: An efficient feature-selection algorithm for text categorization
    • DOI 10.1007/s10115-004-0177-2
    • D. Fragoudis, D. Meretakis, and S. Likothanassis Best terms: an efficient feature-selection algorithm for text categorization Knowledge and Information Systems 8 2005 16 33 (Pubitemid 40743372)
    • (2005) Knowledge and Information Systems , vol.8 , Issue.1 , pp. 16-33
    • Fragoudis, D.1    Meretakis, D.2    Likothanassis, S.3
  • 14
    • 58149287952
    • An extension on "statistical comparisons of classifiers over multiple data sets" for all pairwise comparisons
    • S. García, and F. Herrera An extension on "statistical comparisons of classifiers over multiple data sets" for all pairwise comparisons Journal of Machine Learning Research 9 2008 2677 2694
    • (2008) Journal of Machine Learning Research , vol.9 , pp. 2677-2694
    • García, S.1    Herrera, F.2
  • 15
    • 67349246464
    • A review of machine learning approaches to spam filtering
    • T.S. Guzella, and W.M. Caminhas A review of machine learning approaches to spam filtering Expert Systems with Applications 36 2009 10206 10222
    • (2009) Expert Systems with Applications , vol.36 , pp. 10206-10222
    • Guzella, T.S.1    Caminhas, W.M.2
  • 16
    • 0002294347
    • A simple sequentially rejective multiple test procedure
    • S. Holm A simple sequentially rejective multiple test procedure Scandinavian Journal of Statistics 6 1979 65 70
    • (1979) Scandinavian Journal of Statistics , vol.6 , pp. 65-70
    • Holm, S.1
  • 17
    • 0001750957
    • Approximations of the critical region of the Friedman statistic
    • R.L. Iman, and J.M. Davenport Approximations of the critical region of the Friedman statistic Communications in Statistics 18 1980 571 579
    • (1980) Communications in Statistics , vol.18 , pp. 571-579
    • Iman, R.L.1    Davenport, J.M.2
  • 18
    • 84957069814
    • Text Categorization with Support Vector Machines: Learning with many Relevant Features
    • Machine Learning: ECML-98
    • Joachims, T., 1998. Text categorization with support vector machines: learning with many relevant features, in: C.N.C. Rouveirol (Ed.), Proceedings of the ECML-98, 10th European Conference on Machine Learning, Heidelberg, DE, Chemnitz, DE, pp. 137-142. (Pubitemid 128067178)
    • (1998) Lecture Notes in Computer Science , Issue.1398 , pp. 137-142
    • Joachims, T.1
  • 20
    • 23744432473
    • Information gain and divergence-based feature selection for machine learning-based text categorization
    • DOI 10.1016/j.ipm.2004.08.006, PII S0306457304000962
    • C. Lee, and G.G. Lee Information gain and divergence-based feature selection for machine learning-based text categorization Information Processing and Management 42 2006 155 165 (Pubitemid 41119082)
    • (2006) Information Processing and Management , vol.42 , Issue.1 SPEC. ISS , pp. 155-165
    • Lee, C.1    Lee, G.G.2
  • 22
    • 0002312061
    • Feature selection and feature extraction for text categorization
    • Harriman, New York
    • D.D. Lewis, Feature selection and feature extraction for text categorization, in: Proceedings of the Workshop on Speech and Natural Language, Harriman, New York, 1992, pp. 212-217.
    • (1992) Proceedings of the Workshop on Speech and Natural Language , pp. 212-217
    • Lewis, D.D.1
  • 26
    • 0037375142
    • Feature selection on hierarchy of web documents
    • D. Mladenic, and M. Grobelnik Feature selection on hierarchy of web documents Decision Support Systems 35 2003 45 87
    • (2003) Decision Support Systems , vol.35 , pp. 45-87
    • Mladenic, D.1    Grobelnik, M.2
  • 27
    • 58349094495
    • Feature selection with a measure of deviations from Poisson in text categorization
    • H. Ogura, H. Amano, and M. Kondo Feature selection with a measure of deviations from Poisson in text categorization Expert Systems with Applications 36 2009 6826 6832
    • (2009) Expert Systems with Applications , vol.36 , pp. 6826-6832
    • Ogura, H.1    Amano, H.2    Kondo, M.3
  • 30
    • 1542634595
    • A statistical approach to the spam problem
    • G. Robinson A statistical approach to the spam problem Linux Journal 2003 2003 1075 3583
    • (2003) Linux Journal , vol.2003 , pp. 1075-3583
    • Robinson, G.1
  • 32
    • 0002442796
    • Machine learning in automated text categorization
    • F. Sebastiani Machine learning in automated text categorization ACM Computing Surveys 34 2002 1 47
    • (2002) ACM Computing Surveys , vol.34 , pp. 1-47
    • Sebastiani, F.1
  • 33
    • 33845622338
    • A novel feature selection algorithm for text categorization
    • DOI 10.1016/j.eswa.2006.04.001, PII S095741740600114X
    • W. Shang, H. Huang, and H. Zhu A novel feature selection algorithm for text categorization Expert Systems with Applications 33 2007 1 5 (Pubitemid 44959912)
    • (2007) Expert Systems with Applications , vol.33 , Issue.1 , pp. 1-5
    • Shang, W.1    Huang, H.2    Zhu, H.3    Lin, Y.4    Qu, Y.5    Wang, Z.6
  • 34
    • 27844480603
    • A comparative study on text representation schemes in text categorization
    • DOI 10.1007/s10044-005-0256-3
    • F. Song, S. Liu, and J. Yang A comparative study on text representation schemes in text categorization Pattern Analysis & Applications 8 2005 199 209 (Pubitemid 41649106)
    • (2005) Pattern Analysis and Applications , vol.8 , Issue.1-2 , pp. 199-209
    • Song, F.1    Liu, S.2    Yang, J.3
  • 37
    • 56649119052
    • Recommendation based on rational inferences in collaborative filtering
    • J.-M. Yang, and K.F. Li Recommendation based on rational inferences in collaborative filtering Knowledge-Based Systems 22 2009 105 114
    • (2009) Knowledge-Based Systems , vol.22 , pp. 105-114
    • Yang, J.-M.1    Li, K.F.2
  • 39
    • 60249093947
    • Class dependent feature scaling method using Naive Bayes classifier for text datamining
    • E. Youn, and M.K. Jeong Class dependent feature scaling method using Naive Bayes classifier for text datamining Pattern Recognition Letters 30 2009 477 485
    • (2009) Pattern Recognition Letters , vol.30 , pp. 477-485
    • Youn, E.1    Jeong, M.K.2
  • 40
    • 67349121244
    • Combining neural networks and semantic feature space for email classification
    • B. Yu, and D.-h. Zhu Combining neural networks and semantic feature space for email classification Knowledge-Based Systems 22 2009 376 381
    • (2009) Knowledge-Based Systems , vol.22 , pp. 376-381
    • Yu, B.1    Zhu, D.-H.2


* This information was extracted and analyzed by KISTI from Elsevier's SCOPUS database.