



Volume 33, Issue 2, 2012, Pages 245-265

SMOTE-RSB*: A hybrid preprocessing approach based on oversampling and undersampling for high imbalanced data-sets using SMOTE and rough sets theory

Author keywords

Classification; Data preparation; Imbalanced data sets; Oversampling; Rough sets theory; Undersampling

Indexed keywords

APPROXIMATION ALGORITHMS; CLASSIFICATION (OF INFORMATION); LEARNING ALGORITHMS;

EID: 84867715887     PISSN: 02191377     EISSN: 02193116     Source Type: Journal    
DOI: 10.1007/s10115-011-0465-6     Document Type: Article
Times cited : (408)

References (49)
  • 4
    • 27144531570
    • A study of the behaviour of several methods for balancing machine learning training data
    • Batista GEAPA, Prati RC, Monard MC (2004) A study of the behaviour of several methods for balancing machine learning training data. SIGKDD Explor 6(1): 20-29.
    • (2004) SIGKDD Explor , vol.6 , Issue.1 , pp. 20-29
    • Batista, G.E.A.P.A.1    Prati, R.C.2    Monard, M.C.3
  • 6
    • 0031191630
    • The use of the Area Under the ROC Curve in the evaluation of machine learning algorithms
    • Bradley AP (1997) The use of the Area Under the ROC Curve in the evaluation of machine learning algorithms. Pattern Recognit 30(7): 1145-1159.
    • (1997) Pattern Recognit , vol.30 , Issue.7 , pp. 1145-1159
    • Bradley, A.P.1
  • 9
    • 27144549260
    • Editorial: special issue on learning from imbalanced data sets
    • Chawla NV, Japkowicz N, Kolcz A (2004) Editorial: special issue on learning from imbalanced data sets. SIGKDD Explor 6(1): 1-6.
    • (2004) SIGKDD Explor , vol.6 , Issue.1 , pp. 1-6
    • Chawla, N.V.1    Japkowicz, N.2    Kolcz, A.3
  • 10
    • 50549101751
    • Automatically countering imbalance and its empirical relationship to cost
    • Chawla NV, Cieslak D, Hall L, Joshi A (2008) Automatically countering imbalance and its empirical relationship to cost. Data Min Knowl Discov 17(2): 225-252.
    • (2008) Data Min Knowl Discov , vol.17 , Issue.2 , pp. 225-252
    • Chawla, N.V.1    Cieslak, D.2    Hall, L.3    Joshi, A.4
  • 11
    • 77957561626
    • Forecasting PGR of the financial industry using a rough sets classifier based on attribute-granularity
    • Chen Y-S, Cheng C-H (2010) Forecasting PGR of the financial industry using a rough sets classifier based on attribute-granularity. Knowl Inf Syst 25(1): 57-79.
    • (2010) Knowl Inf Syst , vol.25 , Issue.1 , pp. 57-79
    • Chen, Y.-S.1    Cheng, C.-H.2
  • 12
    • 29644438050
    • Statistical comparisons of classifiers over multiple data sets
    • Demšar J (2006) Statistical comparisons of classifiers over multiple data sets. J Mach Learn Res 7: 1-30.
    • (2006) J Mach Learn Res , vol.7 , pp. 1-30
    • Demšar, J.1
  • 13
    • 46849096083
    • A study of the behaviour of linguistic fuzzy rule based classification systems in the framework of imbalanced data-sets
    • Fernández A, García S, del Jesus MJ, Herrera F (2008) A study of the behaviour of linguistic fuzzy rule based classification systems in the framework of imbalanced data-sets. Fuzzy Sets Syst 159(18): 2378-2398.
    • (2008) Fuzzy Sets Syst , vol.159 , Issue.18 , pp. 2378-2398
    • Fernández, A.1    García, S.2    del Jesus, M.J.3    Herrera, F.4
  • 15
    • 19044382587
    • Round robin classification
    • Fürnkranz J (2002) Round robin classification. J Mach Learn Res 2: 721-747.
    • (2002) J Mach Learn Res , vol.2 , pp. 721-747
    • Fürnkranz, J.1
  • 16
    • 58149287952
    • An extension on "Statistical comparisons of classifiers over multiple data sets" for all pairwise comparisons
    • García S, Herrera F (2008) An extension on "Statistical comparisons of classifiers over multiple data sets" for all pairwise comparisons. J Mach Learn Res 9: 2677-2694.
    • (2008) J Mach Learn Res , vol.9 , pp. 2677-2694
    • García, S.1    Herrera, F.2
  • 17
    • 70349617264
    • Evolutionary under-sampling for classification with imbalanced data sets: proposals and taxonomy
    • García S, Herrera F (2009) Evolutionary under-sampling for classification with imbalanced data sets: proposals and taxonomy. Evol Comput 17(3): 275-306.
    • (2009) Evol Comput , vol.17 , Issue.3 , pp. 275-306
    • García, S.1    Herrera, F.2
  • 18
    • 64549120231
    • A study of statistical techniques and performance measures for genetics-based machine learning: accuracy and interpretability
    • García S, Fernández A, Luengo J, Herrera F (2009) A study of statistical techniques and performance measures for genetics-based machine learning: accuracy and interpretability. Soft Comput 13(10): 959-977.
    • (2009) Soft Comput , vol.13 , Issue.10 , pp. 959-977
    • García, S.1    Fernández, A.2    Luengo, J.3    Herrera, F.4
  • 19
    • 77549084648
    • Advanced nonparametric tests for multiple comparisons in the design of experiments in computational intelligence and data mining: experimental analysis of power
    • García S, Fernández A, Luengo J, Herrera F (2010) Advanced nonparametric tests for multiple comparisons in the design of experiments in computational intelligence and data mining: experimental analysis of power. Inf Sci 180: 2044-2064.
    • (2010) Inf Sci , vol.180 , pp. 2044-2064
    • García, S.1    Fernández, A.2    Luengo, J.3    Herrera, F.4
  • 20
    • 0035254283
    • Rough sets theory for multicriteria decision analysis
    • Greco S (2001) Rough sets theory for multicriteria decision analysis. Eur J Oper Res 129: 1-47.
    • (2001) Eur J Oper Res , vol.129 , pp. 1-47
    • Greco, S.1
  • 21
    • 27744463461
    • A comparison of two approaches to data mining from imbalanced data
    • Grzymala-Busse JW, Stefanowski J, Wilk S (2005) A comparison of two approaches to data mining from imbalanced data. J Intell Manuf 16(6): 565-573.
    • (2005) J Intell Manuf , vol.16 , Issue.6 , pp. 565-573
    • Grzymala-Busse, J.W.1    Stefanowski, J.2    Wilk, S.3
  • 23
    • 68549133155
    • Learning from imbalanced data
    • He H, García EA (2009) Learning from imbalanced data. IEEE Trans Knowl Data Eng 21(9): 1263-1284.
    • (2009) IEEE Trans Knowl Data Eng , vol.21 , Issue.9 , pp. 1263-1284
    • He, H.1    García, E.A.2
  • 24
    • 0002294347
    • A simple sequentially rejective multiple test procedure
    • Holm S (1979) A simple sequentially rejective multiple test procedure. Scand J Stat 6: 65-70.
    • (1979) Scand J Stat , vol.6 , pp. 65-70
    • Holm, S.1
  • 25
    • 14644390912
    • Using AUC and accuracy in evaluating learning algorithms
    • Huang J, Ling CX (2005) Using AUC and accuracy in evaluating learning algorithms. IEEE Trans Knowl Data Eng 17(3): 299-310.
    • (2005) IEEE Trans Knowl Data Eng , vol.17 , Issue.3 , pp. 299-310
    • Huang, J.1    Ling, C.X.2
  • 26
    • 33646142788
    • Evaluation of neural networks and data mining methods on a credit assessment task for class imbalance problem
    • Huang YM, Hung CM, Jiau HC (2006) Evaluation of neural networks and data mining methods on a credit assessment task for class imbalance problem. Nonlinear Anal Real World Appl 7(4): 720-747.
    • (2006) Nonlinear Anal Real World Appl , vol.7 , Issue.4 , pp. 720-747
    • Huang, Y.M.1    Hung, C.M.2    Jiau, H.C.3
  • 27
    • 0001750957
    • Approximations of the critical region of the Friedman statistic
    • Iman R, Davenport J (1980) Approximations of the critical region of the Friedman statistic. Commun Stat Part A Theory Methods 9: 571-595.
    • (1980) Commun Stat Part A Theory Methods , vol.9 , pp. 571-595
    • Iman, R.1    Davenport, J.2
  • 28
    • 33746336969
    • Test strategies for cost-sensitive decision trees
    • Ling C, Sheng V (2006) Test strategies for cost-sensitive decision trees. IEEE Trans Knowl Data Eng 18(8): 1055-1057.
    • (2006) IEEE Trans Knowl Data Eng , vol.18 , Issue.8 , pp. 1055-1057
    • Ling, C.1    Sheng, V.2
  • 29
    • 40649126091
    • Training neural network classifiers for medical decision making: the effects of imbalanced datasets on classification performance
    • Mazurowski M, Habas P, Zurada J, Lo J, Baker J, Tourassi G (2008) Training neural network classifiers for medical decision making: the effects of imbalanced datasets on classification performance. Neural Netw 21(2-3): 427-436.
    • (2008) Neural Netw , vol.21 , Issue.2-3 , pp. 427-436
    • Mazurowski, M.1    Habas, P.2    Zurada, J.3    Lo, J.4    Baker, J.5    Tourassi, G.6
  • 31
    • 55549116330
    • Evolutionary rule-based systems for imbalanced datasets
    • Orriols-Puig A, Bernadó-Mansilla E (2009) Evolutionary rule-based systems for imbalanced datasets. Soft Comput 13(3): 213-225.
    • (2009) Soft Comput , vol.13 , Issue.3 , pp. 213-225
    • Orriols-Puig, A.1    Bernadó-Mansilla, E.2
  • 35
  • 36
    • 34547673383
    • Cost-sensitive boosting for classification of imbalanced data
    • Sun Y, Kamel MS, Wong AK, Wang Y (2007) Cost-sensitive boosting for classification of imbalanced data. Pattern Recognit 40: 3358-3378.
    • (2007) Pattern Recognit , vol.40 , pp. 3358-3378
    • Sun, Y.1    Kamel, M.S.2    Wong, A.K.3    Wang, Y.4
  • 38
    • 41749093196
    • Risk-sensitive loss functions for sparse multi-category classification problems
    • Suresh S, Sundararajan N, Saratchandran P (2008) Risk-sensitive loss functions for sparse multi-category classification problems. Inf Sci 178(12): 2621-2638.
    • (2008) Inf Sci , vol.178 , Issue.12 , pp. 2621-2638
    • Suresh, S.1    Sundararajan, N.2    Saratchandran, P.3
  • 39
  • 40
    • 0037316513
    • Automated extraction of hierarchical decision rules from clinical databases using rough set model
    • Tsumoto S (2003) Automated extraction of hierarchical decision rules from clinical databases using rough set model. Expert Syst Appl 24: 189-197.
    • (2003) Expert Syst Appl , vol.24 , pp. 189-197
    • Tsumoto, S.1
  • 41
    • 77957583037
    • Boosting support vector machines for imbalanced data sets
    • Wang BX, Japkowicz N (2010) Boosting support vector machines for imbalanced data sets. Knowl Inf Syst 25(1): 1-20.
    • (2010) Knowl Inf Syst , vol.25 , Issue.1 , pp. 1-20
    • Wang, B.X.1    Japkowicz, N.2
  • 42
    • 37249058009
    • Attribute reduction in ordered information systems based on evidence theory
    • Wei-hua X, Xiao-yan Z, Jian-min Z, Wen-xiu Z (2008) Attribute reduction in ordered information systems based on evidence theory. Knowl Inf Syst 178(5): 1355-1371.
    • (2008) Knowl Inf Syst , vol.178 , Issue.5 , pp. 1355-1371
    • Wei-Hua, X.1    Xiao-Yan, Z.2    Jian-Min, Z.3    Wen-Xiu, Z.4
  • 44
    • 1442275185
    • Learning when training data are costly: the effect of class distribution on tree induction
    • Weiss GM, Provost F (2003) Learning when training data are costly: the effect of class distribution on tree induction. J Artif Intell Res 19: 315-354.
    • (2003) J Artif Intell Res , vol.19 , pp. 315-354
    • Weiss, G.M.1    Provost, F.2
  • 45
    • 0015361129
    • Asymptotic properties of nearest neighbor rules using edited data
    • Wilson DL (1972) Asymptotic properties of nearest neighbor rules using edited data. IEEE Trans Syst Man Cybern 2(3): 408-421.
    • (1972) IEEE Trans Syst Man Cybern , vol.2 , Issue.3 , pp. 408-421
    • Wilson, D.L.1
  • 47
    • 77957580823
    • Attribute reduction in ordered information systems based on evidence theory
    • Xu W, Zhang X, Zhong J, Zhang W (2010) Attribute reduction in ordered information systems based on evidence theory. Knowl Inf Syst 25(1): 169-184.
    • (2010) Knowl Inf Syst , vol.25 , Issue.1 , pp. 169-184
    • Xu, W.1    Zhang, X.2    Zhong, J.3    Zhang, W.4
  • 48
    • 33845501387
    • 10 challenging problems in data mining research
    • Yang Q, Wu X (2006) 10 challenging problems in data mining research. Int J Inf Technol Decis Mak 5(4): 597-604.
    • (2006) Int J Inf Technol Decis Mak , vol.5 , Issue.4 , pp. 597-604
    • Yang, Q.1    Wu, X.2
  • 49
    • 31344442851
    • Training cost-sensitive neural networks with methods addressing the class imbalance problem
    • Zhou Z-H, Liu X-Y (2006) Training cost-sensitive neural networks with methods addressing the class imbalance problem. IEEE Trans Knowl Data Eng 18(1): 63-77.
    • (2006) IEEE Trans Knowl Data Eng , vol.18 , Issue.1 , pp. 63-77
    • Zhou, Z.-H.1    Liu, X.-Y.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.