



Volume 41, Issue 2, 2014, Pages 321-330

A critical assessment of imbalanced class distribution problem: The case of predicting freshmen student attrition

Author keywords

Attrition; Imbalanced class distribution; Prediction; Sampling; Sensitivity analysis; SMOTE; Student retention

Indexed keywords

ACADEMIC INSTITUTIONS; ATTRITION; CLASSIFICATION METHODS; CLASSIFICATION PERFORMANCE; CLASSIFICATION TECHNIQUE; IMBALANCED CLASS; SMOTE; STUDENT RETENTION

EID: 84885955704     PISSN: 09574174     EISSN: None     Source Type: Journal    
DOI: 10.1016/j.eswa.2013.07.046     Document Type: Article
Times cited : (160)

References (52)
  • 2
    • 80255133264
    • An experimental comparison of classification algorithms for imbalanced credit scoring data sets
    • I. Brown, and C. Mues An experimental comparison of classification algorithms for imbalanced credit scoring data sets Expert Systems with Applications 39 3 2012 3446 3453
    • (2012) Expert Systems with Applications , vol.39 , Issue.3 , pp. 3446-3453
    • Brown, I.1    Mues, C.2
  • 3
    • 23844495067
    • The use of discriminant analysis to investigate the influence of non-cognitive factors on engineering school persistence
    • J. Burtner The use of discriminant analysis to investigate the influence of non-cognitive factors on engineering school persistence Journal of Engineering Education 94 2005 335 338 (Pubitemid 41165341)
    • (2005) Journal of Engineering Education , vol.94 , Issue.3 , pp. 335-338
    • Burtner, J.1
  • 4
    • 0010944392
    • Structural equations modeling test of an integrated model of student retention
    • A.F. Cabrera, J.-E.A. Nora, and M.B. Castañeda Structural equations modeling test of an integrated model of student retention The Journal of Higher Education 64 2 1993 123 139
    • (1993) The Journal of Higher Education , vol.64 , Issue.2 , pp. 123-139
    • Cabrera, A.F.1    Nora, J.-E.A.2    Castañeda, M.B.3
  • 6
    • 9444297357
    • SMOTEBoost: Improving prediction of the minority class in boosting
    • Knowledge Discovery in Databases: PKDD 2003
    • Chawla, N. V., Lazarevic, A., Hall, L. O., & Bowyer, K. W. (2003). SMOTEBoost: Improving prediction of the minority class in boosting. In Seventh European conference on principles and practice of knowledge discovery in databases (PKDD) (pp. 107-119). Dubrovnik, Croatia. (Pubitemid 37231089)
    • (2003) Lecture notes in computer science , Issue.2838 , pp. 107-119
    • Chawla, N.V.1    Lazarevic, A.2    Hall, L.O.3    Bowyer, K.W.4
  • 7
    • 51849110761
    • Predictors of first-year student retention in the community college
    • D.S. Fike, and R. Fike Predictors of first-year student retention in the community college Community College Review 36 2 2008 68 88
    • (2008) Community College Review , vol.36 , Issue.2 , pp. 68-88
    • Fike, D.S.1    Fike, R.2
  • 8
    • 78049421754
    • A comparative analysis of machine learning techniques for student retention management
    • D. Delen A comparative analysis of machine learning techniques for student retention management Decision Support Systems 49 4 2010 498 506
    • (2010) Decision Support Systems , vol.49 , Issue.4 , pp. 498-506
    • Delen, D.1
  • 9
    • 80051745438
    • Predicting student attrition with data mining methods
    • D. Delen Predicting student attrition with data mining methods Journal of College Student Retention 13 1 2011 17 35
    • (2011) Journal of College Student Retention , vol.13 , Issue.1 , pp. 17-35
    • Delen, D.1
  • 10
    • 27344432474
    • C4.5, class imbalance, and cost sensitivity: Why under-sampling beats over-sampling
    • ICML. Washington, DC
    • Drummond, C., & Holte, R. C. (2003). C4.5, class imbalance, and cost sensitivity: Why under-sampling beats over-sampling. In Workshop on learning from imbalanced datasets II, ICML. Washington, DC.
    • (2003) Workshop on Learning from Imbalanced Datasets II
    • Drummond, C.1    Holte, R.C.2
  • 11
    • 81855176118
    • Comparing alternative classifiers for database marketing: The case of imbalanced datasets
    • E. Duman, Y. Ekinci, and A. Tanriverdi Comparing alternative classifiers for database marketing: The case of imbalanced datasets Expert Systems with Applications 39 2012 48 53
    • (2012) Expert Systems with Applications , vol.39 , pp. 48-53
    • Duman, E.1    Ekinci, Y.2    Tanriverdi, A.3
  • 12
    • 23944458290
    • The academic performance and retention of college of agriculture students
    • B.L. Garton, and A.L. Ball The academic performance and retention of college of agriculture students Journal of Agricultural Education 43 1 2002 46 56
    • (2002) Journal of Agricultural Education , vol.43 , Issue.1 , pp. 46-56
    • Garton, B.L.1    Ball, A.L.2
  • 13
    • 84885956055
    • A logistic regression model for the enhancement of student retention: The identification of at-risk freshmen
    • J.G. Glynn, P.L. Sauer, and T.E. Miller A logistic regression model for the enhancement of student retention: The identification of at-risk freshmen International Business & Economics Research Journal 1 8 2002 79 86
    • (2002) International Business & Economics Research Journal , vol.1 , Issue.8 , pp. 79-86
    • Glynn, J.G.1    Sauer, P.L.2    Miller, T.E.3
  • 14
    • 84856534206
    • A Kolmogorov-Smirnov statistic based segmentation approach to learning from imbalanced datasets: With application in property refinance prediction
    • R. Gong, and S.H. Huang A Kolmogorov-Smirnov statistic based segmentation approach to learning from imbalanced datasets: With application in property refinance prediction Expert Systems with Applications 39 2012 6192 6200
    • (2012) Expert Systems with Applications , vol.39 , pp. 6192-6200
    • Gong, R.1    Huang, S.H.2
  • 15
    • 27144479454
    • Learning from imbalanced data sets with boosting and data generation: The DataBoost-IM approach
    • H. Guo, and H.L. Viktor Learning from imbalanced data sets with boosting and data generation: The DataBoost-IM approach SIGKDD Explorations Newsletter 6 2004 30 39
    • (2004) SIGKDD Explorations Newsletter , vol.6 , pp. 30-39
    • Guo, H.1    Viktor, H.L.2
  • 17
    • 27144501672
    • Borderline-SMOTE: A new over-sampling method in imbalanced data sets learning
    • Advances in Intelligent Computing: International Conference on Intelligent Computing, ICIC 2005. Proceedings
    • Han, H., Wang, W. -Y., & Mao, B. -H. (2005). Borderline-SMOTE: A new over-sampling method in imbalanced data sets learning. In First international conference on intelligent computing (ICIC) (pp. 878-887). Hefei, China. (Pubitemid 41491129)
    • (2005) Lecture Notes in Computer Science , vol.3644 , Issue.PART I , pp. 878-887
    • Han, H.1    Wang, W.-Y.2    Mao, B.-H.3
  • 18
    • 67650261808
    • Estimating student retention and degree-completion time: Decision trees and neural networks vis-à-vis regression
    • John Wiley & Sons, Inc. Published online in Wiley Interscience
    • S. Herzog Estimating student retention and degree-completion time: Decision trees and neural networks vis-à-vis regression New Directions for Institutional Research 131 2006 17 33 John Wiley & Sons, Inc. Published online in Wiley Interscience
    • (2006) New Directions for Institutional Research , vol.131 , pp. 17-33
    • Herzog, S.1
  • 19
    • 84874229564
    • Predicting student success by mining enrolment data
    • Z.J. Kovačić Predicting student success by mining enrolment data Research in Higher Education Journal 15 2012 1 20
    • (2012) Research in Higher Education Journal , vol.15 , pp. 1-20
    • Kovačić, Z.J.1
  • 20
    • 84864664739
    • Mining academic data to improve college student retention: An open source perspective
    • Vancouver, BC, Canada
    • Lauría, E. J. M., Baron, J. D., Devireddy, M., Sundararaju, V., & Jayaprakash, S. M. (2012). Mining academic data to improve college student retention: An open source perspective. In Second international conference on learning analytics and knowledge (pp. 139-142) Vancouver, BC, Canada.
    • (2012) Second International Conference on Learning Analytics and Knowledge , pp. 139-142
    • Lauría, E.J.M.1    Baron, J.D.2    Devireddy, M.3    Sundararaju, V.4    Jayaprakash, S.M.5
  • 22
    • 77952554315
    • A learning method for the class imbalance problem with medical data sets
    • D.-C. Li, C.-W. Liu, and S.C. Hu A learning method for the class imbalance problem with medical data sets Computers in Biology and Medicine 40 2010 509 518
    • (2010) Computers in Biology and Medicine , vol.40 , pp. 509-518
    • Li, D.-C.1    Liu, C.-W.2    Hu, S.C.3
  • 23
    • 84455191888
    • Forecasting business failure: The use of nearest-neighbour support vectors and correcting imbalanced samples - Evidence from the Chinese hotel industry
    • H. Li, and J. Sun Forecasting business failure: The use of nearest-neighbour support vectors and correcting imbalanced samples - evidence from the Chinese hotel industry Tourism Management 33 3 2012 622 634
    • (2012) Tourism Management , vol.33 , Issue.3 , pp. 622-634
    • Li, H.1    Sun, J.2
  • 24
    • 70350635665
    • Development of a classification system for engineering student characteristics affecting college enrollment and retention
    • Q. Li, H. Swaminathan, and J. Tang Development of a classification system for engineering student characteristics affecting college enrollment and retention Journal of Engineering Education 98 2009 361 376
    • (2009) Journal of Engineering Education , vol.98 , pp. 361-376
    • Li, Q.1    Swaminathan, H.2    Tang, J.3
  • 25
    • 44949084506
    • Classification of weld flaws with imbalanced class data
    • T.W. Liao Classification of weld flaws with imbalanced class data Expert Systems with Applications 35 2008 1041 1052
    • (2008) Expert Systems with Applications , vol.35 , pp. 1041-1052
    • Liao, T.W.1
  • 26
    • 84885953593
    • Data mining for student retention management
    • S.H. Lin Data mining for student retention management Journal of Computing Sciences in Colleges 27 4 2012 92 99
    • (2012) Journal of Computing Sciences in Colleges , vol.27 , Issue.4 , pp. 92-99
    • Lin, S.H.1
  • 34
    • 82255192254
    • Comparative analysis of data mining models for bankruptcy prediction
    • D.L. Olson, D. Delen, and Y. Meng Comparative analysis of data mining models for bankruptcy prediction Decision Support Systems 52 2 2012 464 473
    • (2012) Decision Support Systems , vol.52 , Issue.2 , pp. 464-473
    • Olson, D.L.1    Delen, D.2    Meng, Y.3
  • 35
    • 0002911219
    • Logistic regression analysis of graduate student retention
    • S.W. Pyke, and P.M. Sheridan Logistic regression analysis of graduate student retention Canadian Journal of Higher Education 23 2 1993 44 64
    • (1993) Canadian Journal of Higher Education , vol.23 , Issue.2 , pp. 44-64
    • Pyke, S.W.1    Sheridan, P.M.2
  • 36
    • 33744584654
    • Induction of decision trees
    • J.R. Quinlan Induction of decision trees Machine Learning 1 1 1986 81 106
    • (1986) Machine Learning , vol.1 , Issue.1 , pp. 81-106
    • Quinlan, J.R.1
  • 39
    • 32344437581
    • Predictors of academic achievement and retention among College freshmen: A longitudinal study
    • D.M. Scott, G.I. Spielmans, and D.C. Julka Predictors of academic achievement and retention among College freshmen: A longitudinal study College Student Journal 38 1 2004 66 80
    • (2004) College Student Journal , vol.38 , Issue.1 , pp. 66-80
    • Scott, D.M.1    Spielmans, G.I.2    Julka, D.C.3
  • 40
    • 33744797928
    • Knowledge acquisition through information granulation for imbalanced data
    • DOI 10.1016/j.eswa.2005.09.082, PII S0957417405002678
    • C.-T. Su, L.-S. Chen, and Y. Yih Knowledge acquisition through information granulation for imbalanced data Expert Systems with Applications 31 2006 531 541 (Pubitemid 43831633)
    • (2006) Expert Systems with Applications , vol.31 , Issue.3 , pp. 531-541
    • Su, C.-T.1    Chen, L.-S.2    Yih, Y.3
  • 44
    • 32144432097
    • A classification approach for power distribution systems fault cause identification
    • DOI 10.1109/TPWRS.2005.861981
    • L. Xu, and M.-Y. Chow A classification approach for power distribution systems fault cause identification IEEE Transactions on Power Systems 21 1 2006 53 60 (Pubitemid 43204702)
    • (2006) IEEE Transactions on Power Systems , vol.21 , Issue.1 , pp. 53-60
    • Xu, L.1    Chow, M.-Y.2
  • 46
    • 84872558806
    • Data mining: A prediction for performance improvement of engineering students using classification
    • S.K. Yadav, and S. Pal Data mining: A prediction for performance improvement of engineering students using classification World of Computer Science and Information Technology Journal (WCSIT) 2 2 2012 51 56
    • (2012) World of Computer Science and Information Technology Journal (WCSIT) , vol.2 , Issue.2 , pp. 51-56
    • Yadav, S.K.1    Pal, S.2
  • 47
    • 84861173945
    • Prediction of liquefaction potential based on CPT up-sampling
    • J.S. Yazdi, F. Kalantary, and H.S. Yazdi Prediction of liquefaction potential based on CPT up-sampling Computers & Geosciences 44 0 2012 10 23
    • (2012) Computers & Geosciences , vol.44 , Issue.0 , pp. 10-23
    • Yazdi, J.S.1    Kalantary, F.2    Yazdi, H.S.3
  • 48
    • 84858782597
    • A data mining approach for identifying predictors of student retention from sophomore to junior year
    • C.H. Yu, S. DiGangi, A. Jannasch-Pennell, and C. Kaprolet A data mining approach for identifying predictors of student retention from sophomore to junior year Journal of Data Science 8 2010 307 325
    • (2010) Journal of Data Science , vol.8 , pp. 307-325
    • Yu, C.H.1    Digangi, S.2    Jannasch-Pennell, A.3    Kaprolet, C.4
  • 51
    • 8644252009
    • Identifying factors influencing engineering student graduation: A longitudinal and cross-institutional study
    • G. Zhang, T.J. Anderson, M.W. Ohland, and B.R. Thorndyke Identifying factors influencing engineering student graduation: A longitudinal and cross-institutional study Journal of Engineering Education 93 4 2004 313 320
    • (2004) Journal of Engineering Education , vol.93 , Issue.4 , pp. 313-320
    • Zhang, G.1    Anderson, T.J.2    Ohland, M.W.3    Thorndyke, B.R.4
  • 52
    • 31344442851
    • Training cost-sensitive neural networks with methods addressing the class imbalance problem
    • DOI 10.1109/TKDE.2006.17
    • Z.-H. Zhou, and X.-Y. Liu Training cost-sensitive neural networks with methods addressing the class imbalance problem IEEE Transactions on Knowledge and Data Engineering 18 2006 63 77 (Pubitemid 43145089)
    • (2006) IEEE Transactions on Knowledge and Data Engineering , vol.18 , Issue.1 , pp. 63-77
    • Zhou, Z.-H.1    Liu, X.-Y.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.