



Volume 1, Issue 3, 2011, Pages 183-200

Spam filtering: How the dimensionality reduction affects the accuracy of Naive Bayes classifiers

Author keywords

Classification; Dimensionality reduction; Machine learning; Spam filter; Text categorization

Indexed keywords

CLASSIFICATION; DIMENSIONALITY REDUCTION; MACHINE LEARNING; SPAM FILTER; TEXT CATEGORIZATION;

EID: 79952048598     PISSN: 18674828     EISSN: 18690238     Source Type: Journal    
DOI: 10.1007/s13174-010-0014-7     Document Type: Article
Times cited: 64

References (49)
  • 8
    • 0033931867
    • Assessing the accuracy of prediction algorithms for classification: an overview
    • Baldi P, Brunak S, Chauvin Y, Andersen C, Nielsen H (2000) Assessing the accuracy of prediction algorithms for classification: an overview. Bioinformatics 16(5):412-424
    • (2000) Bioinformatics , vol.16 , Issue.5 , pp. 412-424
    • Baldi, P.1    Brunak, S.2    Chauvin, Y.3    Andersen, C.4    Nielsen, H.5
  • 10
    • 33751076845
    • Tightening the Net: a review of current and next generation spam filtering tools
    • Carpinter J, Hunt R (2006) Tightening the Net: a review of current and next generation spam filtering tools. Comput Secur 25(8):566-578
    • (2006) Comput Secur , vol.25 , Issue.8 , pp. 566-578
    • Carpinter, J.1    Hunt, R.2
  • 14
    • 47649092539
    • Email spam filtering: a systematic review
    • Cormack G (2008) Email spam filtering: a systematic review. Found Trends Inf Retr 1(4):335-455
    • (2008) Found Trends Inf Retr , vol.1 , Issue.4 , pp. 335-455
    • Cormack, G.1
  • 15
    • 34547480985
    • Online supervised spam filter evaluation
    • Cormack G, Lynam T (2007) Online supervised spam filter evaluation. ACM Trans Inf Syst 25(3):1-11
    • (2007) ACM Trans Inf Syst , vol.25 , Issue.3 , pp. 1-11
    • Cormack, G.1    Lynam, T.2
  • 17
    • 29644438050
    • Statistical comparisons of classifiers over multiple data sets
    • Demsar J (2006) Statistical comparisons of classifiers over multiple data sets. J Mach Learn Res 7:1-30
    • (2006) J Mach Learn Res , vol.7 , pp. 1-30
    • Demsar, J.1
  • 18
    • 0032594950
    • Support vector machines for spam categorization
    • Drucker H, Wu D, Vapnik V (1999) Support vector machines for spam categorization. IEEE Trans Neural Netw 10(5):1048-1054
    • (1999) IEEE Trans Neural Netw , vol.10 , Issue.5 , pp. 1048-1054
    • Drucker, H.1    Wu, D.2    Vapnik, V.3
  • 19
    • 2942731012
    • An extensive empirical study of feature selection metrics for text classification
    • Forman G (2003) An extensive empirical study of feature selection metrics for text classification. J Mach Learn Res 3:1289-1305
    • (2003) J Mach Learn Res , vol.3 , pp. 1289-1305
    • Forman, G.1
  • 23
    • 84976776599
    • A probabilistic learning approach for document indexing
    • Fuhr N, Buckley C (1991) A probabilistic learning approach for document indexing. ACM Trans Inf Syst 9(3):223-248
    • (1991) ACM Trans Inf Syst , vol.9 , Issue.3 , pp. 223-248
    • Fuhr, N.1    Buckley, C.2
  • 25
    • 67349246464
    • A review of machine learning approaches to spam filtering
    • Guzella T, Caminhas W (2009) A review of machine learning approaches to spam filtering. Exp Syst Appl 36(7):10206-10222
    • (2009) Exp Syst Appl , vol.36 , Issue.7 , pp. 10206-10222
    • Guzella, T.1    Caminhas, W.2
  • 26
    • 0036040906
    • Evaluating cost-sensitive unsolicited bulk email categorization
    • Madrid, Spain
    • Hidalgo J (2002) Evaluating cost-sensitive unsolicited bulk email categorization. In: Proceedings of the 17th ACM symposium on applied computing, Madrid, Spain, pp 615-620
    • (2002) Proceedings of the 17th ACM symposium on applied computing , pp. 615-620
    • Hidalgo, J.1
  • 27
    • 0002409860
    • A probabilistic analysis of the Rocchio algorithm with TFIDF for text categorization
    • Nashville, TN, USA
    • Joachims T (1997) A probabilistic analysis of the Rocchio algorithm with TFIDF for text categorization. In: Proceedings of 14th international conference on machine learning, Nashville, TN, USA, pp 143-151
    • (1997) Proceedings of 14th international conference on machine learning , pp. 143-151
    • Joachims, T.1
  • 32
    • 33847673255
    • Learning to classify e-mail
    • Koprinska I, Poon J, Clark J, Chan J (2007) Learning to classify e-mail. Inf Sci 177(10):2167-2187
    • (2007) Inf Sci , vol.177 , Issue.10 , pp. 2167-2187
    • Koprinska, I.1    Poon, J.2    Clark, J.3    Chan, J.4
  • 33
    • 22044443560
    • Scale and translation invariant collaborative filtering systems
    • Lemire D (2005) Scale and translation invariant collaborative filtering systems. Inf Retr 8(1):129-150
    • (2005) Inf Retr , vol.8 , Issue.1 , pp. 129-150
    • Lemire, D.1
  • 34
    • 46249088392
    • Assessing multivariate Bernoulli models for information retrieval
    • Losada D, Azzopardi L (2008) Assessing multivariate Bernoulli models for information retrieval. ACM Trans Inf Syst 26(3):1-46
    • (2008) ACM Trans Inf Syst , vol.26 , Issue.3 , pp. 1-46
    • Losada, D.1    Azzopardi, L.2
  • 35
    • 61749086696
    • Targeting spam control on middleboxes: spam detection based on layer-3 e-mail content classification
    • Marsono M, El-Kharashi N, Gebali F (2009) Targeting spam control on middleboxes: spam detection based on layer-3 e-mail content classification. Comput Netw 53(6):835-848
    • (2009) Comput Netw , vol.53 , Issue.6 , pp. 835-848
    • Marsono, M.1    El-Kharashi, N.2    Gebali, F.3
  • 36
    • 0016772212
    • Comparison of the predicted and observed secondary structure of T4 phage lysozyme
    • Matthews B (1975) Comparison of the predicted and observed secondary structure of T4 phage lysozyme. Biochim Biophys Acta 405(2):442-451
    • (1975) Biochim Biophys Acta , vol.405 , Issue.2 , pp. 442-451
    • Matthews, B.1
  • 44
    • 0002442796
    • Machine learning in automated text categorization
    • Sebastiani F (2002) Machine learning in automated text categorization. ACM Comput Surv 34(1):1-47
    • (2002) ACM Comput Surv , vol.34 , Issue.1 , pp. 1-47
    • Sebastiani, F.1
  • 45
    • 36549012987
    • An evaluation of Naive Bayes variants in content-based learning for spam filtering
    • Seewald A (2007) An evaluation of Naive Bayes variants in content-based learning for spam filtering. Int Data Anal 11(5):497-524
    • (2007) Int Data Anal , vol.11 , Issue.5 , pp. 497-524
    • Seewald, A.1
  • 46
    • 67650834914
    • Better Naive Bayes classification for high-precision spam detection
    • Song Y, Kolcz A, Giles C (2009) Better Naive Bayes classification for high-precision spam detection. Softw Pract Exp 39(11):1003-1024
    • (2009) Softw Pract Exp , vol.39 , Issue.11 , pp. 1003-1024
    • Song, Y.1    Kolcz, A.2    Giles, C.3
  • 49
    • 34248666540
    • Fuzzy sets
    • Zadeh L (1965) Fuzzy sets. Inf Control 8(3):338-353
    • (1965) Inf Control , vol.8 , Issue.3 , pp. 338-353
    • Zadeh, L.1


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.