Volume 45, Issue 1, 2015, Pages 247-270

Class imbalance revisited: a new experimental setup to assess the performance of treatment methods

Author keywords

Class imbalance; Experimental setup; Sampling methods

Indexed keywords

CLASSIFICATION (OF INFORMATION); DATA MINING; LEARNING ALGORITHMS; PATTERN RECOGNITION; RECOVERY; STATISTICS; SUPPORT VECTOR MACHINES

EID: 84942249246     PISSN: 02191377     EISSN: 02193116     Source Type: Journal
DOI: 10.1007/s10115-014-0794-3     Document Type: Article
Times cited: 164

References (37)
  • 1
    • 27144531570
    • A study of the behavior of several methods for balancing machine learning training data
    • Batista GEAPA, Prati RC, Monard MC (2004) A study of the behavior of several methods for balancing machine learning training data. SIGKDD Explor 6(1):20–29
    • (2004) SIGKDD Explor, vol. 6, issue 1, pp. 20-29
    • Batista, G.E.A.P.A.; Prati, R.C.; Monard, M.C.
  • 2
    • 84885862739
    • Confidence limits for a ratio using Wilcoxon’s signed rank test
    • Bennett BM (1965) Confidence limits for a ratio using Wilcoxon’s signed rank test. Biometrics 21(1):231–234
    • (1965) Biometrics, vol. 21, issue 1, pp. 231-234
    • Bennett, B.M.
  • 3
    • 84877652134
    • Significance tests or confidence intervals: which are preferable for the comparison of classifiers?
    • Berrar D, Lozano JA (2013) Significance tests or confidence intervals: which are preferable for the comparison of classifiers? J Exp Theor Artif Intell 25(2):189–206. http://www.ingentaconnect.com/content/tandf/teta/2013/00000025/00000002/art00003
    • (2013) J Exp Theor Artif Intell, vol. 25, issue 2, pp. 189-206
    • Berrar, D.; Lozano, J.A.
  • 7
    • 44649133282
    • Analyzing PETs on imbalanced datasets when training and testing class distributions differ
    • Cieslak D, Chawla N (2008) Analyzing PETs on imbalanced datasets when training and testing class distributions differ. In: Pacific-Asia conference on advances in knowledge discovery and data mining, pp 519–526
    • (2008) Pacific-Asia conference on advances in knowledge discovery and data mining, pp. 519-526
    • Cieslak, D.; Chawla, N.
  • 8
    • 85015191605
    • Rule induction with CN2: some recent improvements
    • Clark P, Boswell R (1991) Rule induction with CN2: some recent improvements. In: European working session on machine learning, pp 151–163
    • (1991) European working session on machine learning, pp. 151-163
    • Clark, P.; Boswell, R.
  • 9
    • 33646107181
    • Learning from imbalanced data in surveillance of nosocomial infection
    • Cohen G, Hilario M, Sax H, Hugonnet S, Geissbühler A (2006) Learning from imbalanced data in surveillance of nosocomial infection. Artif Intell Med 37(1):7–18
    • (2006) Artif Intell Med, vol. 37, issue 1, pp. 7-18
    • Cohen, G.; Hilario, M.; Sax, H.; Hugonnet, S.; Geissbühler, A.
  • 12
    • 33646023117
    • An introduction to ROC analysis
    • Fawcett T (2006) An introduction to ROC analysis. Pattern Recognit Lett 27(8):861–874
    • (2006) Pattern Recognit Lett, vol. 27, issue 8, pp. 861-874
    • Fawcett, T.
  • 13
    • 67349093551
    • Classification accuracy comparison: Hypothesis tests and the use of confidence intervals in evaluations of difference, equivalence and non-inferiority
    • Foody GM (2009) Classification accuracy comparison: Hypothesis tests and the use of confidence intervals in evaluations of difference, equivalence and non-inferiority. Remote Sens Environ 113(8):1658–1663. http://www.sciencedirect.com/science/article/pii/S0034425709000923
    • (2009) Remote Sens Environ, vol. 113, issue 8, pp. 1658-1663
    • Foody, G.M.
  • 14
    • 84942281752
    • UCI machine learning repository
    • Frank A, Asuncion A (2010) UCI machine learning repository. http://archive.ics.uci.edu/ml
    • (2010)
    • Frank, A.; Asuncion, A.
  • 15
    • 84942281753
    • Confidence intervals for the ratio of locations and for the ratio of scales of two paired samples
    • Froemke C, Hothorn L, Schneider M (2012) Confidence intervals for the ratio of locations and for the ratio of scales of two paired samples. Technical report, The Comprehensive R Archive Network. http://cran.r-project.org/web/packages/pairedCI/index.html
    • (2012) Technical report, The Comprehensive R Archive Network
    • Froemke, C.; Hothorn, L.; Schneider, M.
  • 16
    • 84862515469
    • A review on ensembles for the class imbalance problem: bagging-, boosting-, and hybrid-based approaches
    • Galar M, Fernandez A, Barrenechea E, Bustince H, Herrera F (2012) A review on ensembles for the class imbalance problem: bagging-, boosting-, and hybrid-based approaches. IEEE Trans Syst Man Cybern Part C 42(4):463–484
    • (2012) IEEE Trans Syst Man Cybern Part C, vol. 42, issue 4, pp. 463-484
    • Galar, M.; Fernandez, A.; Barrenechea, E.; Bustince, H.; Herrera, F.
  • 17
    • 27144479454
    • Learning from imbalanced data sets with boosting and data generation: the DataBoost-IM approach
    • Guo H, Viktor HL (2004) Learning from imbalanced data sets with boosting and data generation: the DataBoost-IM approach. SIGKDD Explor 6(1):30–39
    • (2004) SIGKDD Explor, vol. 6, issue 1, pp. 30-39
    • Guo, H.; Viktor, H.L.
  • 18
    • 27144501672
    • Borderline-SMOTE: a new over-sampling method in imbalanced data sets learning
    • Han H, Wang W-Y, Mao B-H (2005) Borderline-SMOTE: a new over-sampling method in imbalanced data sets learning. In: International conference on advances in intelligent computing. Lecture notes in computer science. Springer, Berlin, pp 878–887. doi:10.1007/11538059_91
    • (2005) International conference on advances in intelligent computing, Lecture notes in computer science, Springer, Berlin, pp. 878-887, doi:10.1007/11538059_91
    • Han, H.; Wang, W.-Y.; Mao, B.-H.
  • 19
    • 56349089205
    • ADASYN: adaptive synthetic sampling approach for imbalanced learning
    • He H, Bai Y, Garcia E, Li S (2008) ADASYN: adaptive synthetic sampling approach for imbalanced learning. In: IEEE international joint conference on neural networks, pp 1322–1328
    • (2008) IEEE international joint conference on neural networks, pp. 1322-1328
    • He, H.; Bai, Y.; Garcia, E.; Li, S.
  • 20
    • 68549133155
    • Learning from imbalanced data
    • He H, Garcia EA (2009) Learning from imbalanced data. IEEE Trans Knowl Data Eng 21(9):1263–1284
    • (2009) IEEE Trans Knowl Data Eng, vol. 21, issue 9, pp. 1263-1284
    • He, H.; Garcia, E.A.
  • 21
    • 33845536164
    • The class imbalance problem: a systematic study
    • Japkowicz N, Stephen S (2002) The class imbalance problem: a systematic study. Intell Data Anal 6(5):429–449
    • (2002) Intell Data Anal, vol. 6, issue 5, pp. 429-449
    • Japkowicz, N.; Stephen, S.
  • 22
    • 47349098911
    • Learning with limited minority class data
    • Khoshgoftaar TM, Seiffert C, Hulse JV, Napolitano A, Folleco A (2007) Learning with limited minority class data. In: International conference on machine learning and applications, pp 348–353
    • (2007) International conference on machine learning and applications, pp. 348-353
    • Khoshgoftaar, T.M.; Seiffert, C.; Hulse, J.V.; Napolitano, A.; Folleco, A.
  • 23
    • 0031998121
    • Machine learning for the detection of oil spills in satellite radar images
    • Kubat M, Holte RC, Matwin S (1998) Machine learning for the detection of oil spills in satellite radar images. Mach Learn 30(2–3):195–215
    • (1998) Mach Learn, vol. 30, issue 2-3, pp. 195-215
    • Kubat, M.; Holte, R.C.; Matwin, S.
  • 24
    • 84878083672
    • Exploratory under-sampling for class-imbalance learning
    • Liu X-Y, Wu J, Zhou Z-H (2006) Exploratory under-sampling for class-imbalance learning. In: IEEE international conference on data mining, pp 965–969
    • (2006) IEEE international conference on data mining, pp. 965-969
    • Liu, X.-Y.; Wu, J.; Zhou, Z.-H.
  • 25
    • 84878098426
    • The influence of class imbalance on cost-sensitive learning: an empirical study
    • Liu X-Y, Zhou Z-H (2006) The influence of class imbalance on cost-sensitive learning: an empirical study. In: ICDM, IEEE Computer Society, pp 970–974
    • (2006) ICDM, IEEE Computer Society, pp. 970-974
    • Liu, X.-Y.; Zhou, Z.-H.
  • 27
    • 29144443664
    • Minority report in fraud detection: classification of skewed data
    • Phua C, Alahakoon D, Lee V (2004) Minority report in fraud detection: classification of skewed data. SIGKDD Explor 6(1):50–59
    • (2004) SIGKDD Explor, vol. 6, issue 1, pp. 50-59
    • Phua, C.; Alahakoon, D.; Lee, V.
  • 28
    • 80053222008
    • A survey on graphical methods for classification predictive performance evaluation
    • Prati RC, Batista GEAPA, Monard MC (2011) A survey on graphical methods for classification predictive performance evaluation. IEEE Trans Knowl Data Eng 23(11):1601–1618
    • (2011) IEEE Trans Knowl Data Eng, vol. 23, issue 11, pp. 1601-1618
    • Prati, R.C.; Batista, G.E.A.P.A.; Monard, M.C.
  • 29
    • 84942281760
    • Paper website
    • Prati RC, Batista GEAPA, Silva DF (2013) Paper website. http://sites.labic.icmc.usp.br/ClassImbalanceRevisited/
    • (2013)
    • Prati, R.C.; Batista, G.E.A.P.A.; Silva, D.F.
  • 30
    • 0002900357
    • The case against accuracy estimation for comparing induction algorithms
    • Provost FJ, Fawcett T, Kohavi R (1998) The case against accuracy estimation for comparing induction algorithms. In: Shavlik JW (ed) International conference on machine learning. Morgan Kaufmann, Los Altos, CA, pp 445–453
    • (1998) International conference on machine learning, Shavlik JW (ed), Morgan Kaufmann, Los Altos, CA, pp. 445-453
    • Provost, F.J.; Fawcett, T.; Kohavi, R.
  • 31
    • 0003500248
    • C4.5: programs for machine learning
    • Quinlan JR (1993) C4.5: programs for machine learning. Morgan Kaufmann Publishers Inc, Los Altos, CA
    • (1993) Morgan Kaufmann Publishers Inc, Los Altos, CA
    • Quinlan, J.R.
  • 32
    • 84857180411
    • Class imbalance, redux
    • Wallace B, Small K, Brodley C, Trikalinos T (2011) Class imbalance, redux. In: IEEE international conference on data mining, pp 754–763
    • (2011) IEEE international conference on data mining, pp. 754-763
    • Wallace, B.; Small, K.; Brodley, C.; Trikalinos, T.
  • 33
    • 84884493455
    • Cost-sensitive boosting algorithms for imbalanced multi-instance datasets
    • Wang X, Matwin S, Japkowicz N, Liu X (2013) Cost-sensitive boosting algorithms for imbalanced multi-instance datasets. In: Zaïane OR, Zilles S (eds) Canadian conference on artificial intelligence, vol 7884 of lecture notes in computer science. Springer, Berlin, pp 174–186
    • (2013) Canadian conference on artificial intelligence, Zaïane OR, Zilles S (eds), Lecture notes in computer science, vol. 7884, Springer, Berlin, pp. 174-186
    • Wang, X.; Matwin, S.; Japkowicz, N.; Liu, X.
  • 34
    • 20844458491
    • Mining with rarity: a unifying framework
    • Weiss GM (2004) Mining with rarity: a unifying framework. SIGKDD Explor 6(1):7–19
    • (2004) SIGKDD Explor, vol. 6, issue 1, pp. 7-19
    • Weiss, G.M.
  • 35
    • 84942281762
    • Cost-sensitive learning vs. sampling: which is best for handling unbalanced classes with unequal error costs?
    • Weiss GM, McCarthy K, Zabar B (2007) Cost-sensitive learning vs. sampling: which is best for handling unbalanced classes with unequal error costs? In: IEEE international conference on data mining, pp 35–41
    • (2007) IEEE international conference on data mining, pp. 35-41
    • Weiss, G.M.; McCarthy, K.; Zabar, B.
  • 36
    • 1442275185
    • Learning when training data are costly: the effect of class distribution on tree induction
    • Weiss GM, Provost F (2003) Learning when training data are costly: the effect of class distribution on tree induction. J Artif Intell Res 19:315–354
    • (2003) J Artif Intell Res, vol. 19, pp. 315-354
    • Weiss, G.M.; Provost, F.


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.