Volume 257, 2014, Pages 1-13

On the importance of the validation technique for classification with imbalanced datasets: Addressing covariate shift when data is skewed

Author keywords

Classification; Covariate shift; Dataset shift; Imbalanced dataset; Partitioning; Validation technique

Indexed keywords

COVARIATE SHIFTS; DATASET SHIFTS; IMBALANCED DATASET; PARTITIONING; VALIDATION TECHNIQUE

EID: 84888645340     PISSN: 00200255     EISSN: None     Source Type: Journal    
DOI: 10.1016/j.ins.2013.09.038     Document Type: Article
Times cited: 137

References (52)
  • 6
    • 27144531570
    • A study of the behaviour of several methods for balancing machine learning training data
    • G.E.A.P.A. Batista, R.C. Prati, and M.C. Monard A study of the behaviour of several methods for balancing machine learning training data SIGKDD Explorations 6 1 2004 20 29
    • (2004) SIGKDD Explorations , vol.6 , Issue.1 , pp. 20-29
    • Batista, G.E.A.P.A.1    Prati, R.C.2    Monard, M.C.3
  • 7
    • 0031191630
    • The use of the area under the ROC curve in the evaluation of machine learning algorithms
    • A.P. Bradley The use of the area under the ROC curve in the evaluation of machine learning algorithms Pattern Recognition 30 7 1997 1145 1159
    • (1997) Pattern Recognition , vol.30 , Issue.7 , pp. 1145-1159
    • Bradley, A.P.1
  • 9
    • 33845982223
    • Evolutionary stratified training set selection for extracting classification rules with trade off precision-interpretability
    • J.R. Cano, F. Herrera, and M. Lozano Evolutionary stratified training set selection for extracting classification rules with trade off precision-interpretability Data and Knowledge Engineering 60 2007 90 108
    • (2007) Data and Knowledge Engineering , vol.60 , pp. 90-108
    • Cano, J.R.1    Herrera, F.2    Lozano, M.3
  • 11
    • 27144549260
    • Editorial: Special issue on learning from imbalanced data sets
    • N.V. Chawla, N. Japkowicz, and A. Kotcz Editorial: special issue on learning from imbalanced data sets SIGKDD Explorations 6 1 2004 1 6
    • (2004) SIGKDD Explorations , vol.6 , Issue.1 , pp. 1-6
    • Chawla, N.V.1    Japkowicz, N.2    Kotcz, A.3
  • 12
    • 0346076782
    • Support vector learning for fuzzy rule-based classification systems
    • Y. Chen, and J. Wang Support vector learning for fuzzy rule-based classification systems IEEE Transactions on Fuzzy Systems 11 6 2003 716 728
    • (2003) IEEE Transactions on Fuzzy Systems , vol.11 , Issue.6 , pp. 716-728
    • Chen, Y.1    Wang, J.2
  • 15
    • 34249753618
    • Support vector networks
    • C. Cortes, and V. Vapnik Support vector networks Machine Learning 20 1995 273 297
    • (1995) Machine Learning , vol.20 , pp. 273-297
    • Cortes, C.1    Vapnik, V.2
  • 17
    • 29644438050
    • Statistical comparisons of classifiers over multiple data sets
    • J. Demšar Statistical comparisons of classifiers over multiple data sets Journal of Machine Learning Research 7 2006 1 30
    • (2006) Journal of Machine Learning Research , vol.7 , pp. 1-30
    • Demšar, J.1
  • 19
    • 64549120231
    • A study of statistical techniques and performance measures for genetics-based machine learning: Accuracy and interpretability
    • S. García, A. Fernández, J. Luengo, and F. Herrera A study of statistical techniques and performance measures for genetics-based machine learning: accuracy and interpretability Soft Computing 13 10 2009 959 977
    • (2009) Soft Computing , vol.13 , Issue.10 , pp. 959-977
    • García, S.1    Fernández, A.2    Luengo, J.3    Herrera, F.4
  • 20
    • 58149287952
    • An extension on "statistical comparisons of classifiers over multiple data sets" for all pairwise comparisons
    • S. García, and F. Herrera An extension on "statistical comparisons of classifiers over multiple data sets" for all pairwise comparisons Journal of Machine Learning Research 9 2008 2607 2624
    • (2008) Journal of Machine Learning Research , vol.9 , pp. 2607-2624
    • García, S.1    Herrera, F.2
  • 23
    • 33646142788
    • Evaluation of neural networks and data mining methods on a credit assessment task for class imbalance problem
    • Y.-M. Huang, C.-M. Hung, and H.C. Jiau Evaluation of neural networks and data mining methods on a credit assessment task for class imbalance problem Nonlinear Analysis: Real World Applications 7 4 2006 720 747
    • (2006) Nonlinear Analysis: Real World Applications , vol.7 , Issue.4 , pp. 720-747
    • Huang, Y.-M.1    Hung, C.-M.2    Jiau, H.C.3
  • 25
    • 26844469668
    • Rule weight specification in fuzzy rule-based classification systems
    • H. Ishibuchi, and T. Yamamoto Rule weight specification in fuzzy rule-based classification systems IEEE Transactions on Fuzzy Systems 13 2005 428 435
    • (2005) IEEE Transactions on Fuzzy Systems , vol.13 , pp. 428-435
    • Ishibuchi, H.1    Yamamoto, T.2
  • 27
    • 85166317163
    • Approaches to online learning and concept drift for user identification in computer security
    • T. Lane, C.E. Brodley, Approaches to online learning and concept drift for user identification in computer security, in: KDD, 1998.
    • (1998) KDD
    • Lane, T.1    Brodley, C.E.2
  • 28
    • 84856964446
    • Analysis of preprocessing vs. cost-sensitive learning for imbalanced classification. Open problems on intrinsic data characteristics
    • V. López, A. Fernández, J.G. Moreno-Torres, and F. Herrera Analysis of preprocessing vs. cost-sensitive learning for imbalanced classification. Open problems on intrinsic data characteristics Expert Systems with Applications 39 7 2012 6585 6608
    • (2012) Expert Systems with Applications , vol.39 , Issue.7 , pp. 6585-6608
    • López, V.1    Fernández, A.2    Moreno-Torres, J.G.3    Herrera, F.4
  • 32
    • 55549116330
    • Evolutionary rule-based systems for imbalanced datasets
    • A. Orriols-Puig, and E. Bernadó-Mansilla Evolutionary rule-based systems for imbalanced datasets Soft Computing 13 3 2009 213 225
    • (2009) Soft Computing , vol.13 , Issue.3 , pp. 213-225
    • Orriols-Puig, A.1    Bernadó-Mansilla, E.2
  • 36
    • 0037527188
    • Improving predictive inference under covariate shift by weighting the log-likelihood function
    • H. Shimodaira Improving predictive inference under covariate shift by weighting the log-likelihood function Journal of Statistical Planning and Inference 90 2 2000 227 244
    • (2000) Journal of Statistical Planning and Inference , vol.90 , Issue.2 , pp. 227-244
    • Shimodaira, H.1
  • 37
    • 80052700798
    • When training and test sets are different: Characterizing learning transfer
    • J.Q. Candela, M. Sugiyama, A. Schwaighofer, N.D. Lawrence, MIT Press
    • A. Storkey When training and test sets are different: characterizing learning transfer J.Q. Candela, M. Sugiyama, A. Schwaighofer, N.D. Lawrence, Dataset Shift in Machine Learning 2009 MIT Press 3 28
    • (2009) Dataset Shift in Machine Learning , pp. 3-28
    • Storkey, A.1
  • 40
    • 0036565589
    • An instance-weighting method to induce cost-sensitive trees
    • K.M. Ting An instance-weighting method to induce cost-sensitive trees IEEE Transactions on Knowledge and Data Engineering 14 3 2002 659 665
    • (2002) IEEE Transactions on Knowledge and Data Engineering , vol.14 , Issue.3 , pp. 659-665
    • Ting, K.M.1
  • 41
    • 77956023732
    • Combating the small sample class imbalance problem using feature selection
    • M. Wasikowski, and X.-W. Chen Combating the small sample class imbalance problem using feature selection IEEE Transactions on Knowledge and Data Engineering 22 10 2010 1388 1400
    • (2010) IEEE Transactions on Knowledge and Data Engineering , vol.22 , Issue.10 , pp. 1388-1400
    • Wasikowski, M.1    Chen, X.-W.2
  • 42
    • 20844458491
    • Mining with rarity: A unifying framework
    • G.M. Weiss Mining with rarity: a unifying framework SIGKDD Explorations 6 1 2004 7 19
    • (2004) SIGKDD Explorations , vol.6 , Issue.1 , pp. 7-19
    • Weiss, G.M.1
  • 43
    • 1442275185
    • Learning when training data are costly: The effect of class distribution on tree induction
    • G.M. Weiss, and F.J. Provost Learning when training data are costly: the effect of class distribution on tree induction Journal of Artificial Intelligence Research 19 2003 315 354
    • (2003) Journal of Artificial Intelligence Research , vol.19 , pp. 315-354
    • Weiss, G.M.1    Provost, F.J.2
  • 44
    • 50549089793
    • Maximizing classifier utility when there are data acquisition and modeling costs
    • G.M. Weiss, and Y. Tian Maximizing classifier utility when there are data acquisition and modeling costs Data Mining and Knowledge Discovery 17 2 2008 253 282
    • (2008) Data Mining and Knowledge Discovery , vol.17 , Issue.2 , pp. 253-282
    • Weiss, G.M.1    Tian, Y.2
  • 45
    • 0030126609
    • Learning in the presence of concept drift and hidden contexts
    • G. Widmer, and M. Kubat Learning in the presence of concept drift and hidden contexts Machine Learning 23 1 1996 69 101
    • (1996) Machine Learning , vol.23 , Issue.1 , pp. 69-101
    • Widmer, G.1    Kubat, M.2
  • 47
    • 34547980509
    • Asymptotic Bayesian generalization error when training and test distributions are different
    • Z. Ghahramani, ACM International Conference Proceeding Series ACM
    • K. Yamazaki, M. Kawanabe, S. Watanabe, M. Sugiyama, and K.-R. Müller Asymptotic Bayesian generalization error when training and test distributions are different Z. Ghahramani, ICML ACM International Conference Proceeding Series vol. 227 2007 ACM
    • (2007) ICML , vol.227
    • Yamazaki, K.1    Kawanabe, M.2    Watanabe, S.3    Sugiyama, M.4    Müller, K.-R.5
  • 51
    • 0004232308
    • Prentice Hall Upper Saddle River, New Jersey
    • J.H. Zar Biostatistical Analysis 1999 Prentice Hall Upper Saddle River, New Jersey
    • (1999) Biostatistical Analysis
    • Zar, J.H.1


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.