



Volume 24, Issue 1, 2012, Pages 136-158

Hellinger distance decision trees are robust and skew-insensitive

Author keywords

Decision tree; Hellinger distance; Imbalanced data

Indexed keywords

COMMON PROBLEMS; DATA SETS; DECISION TREE TECHNIQUES; EMPIRICAL EVALUATIONS; ENSEMBLE METHODS; GAIN RATIO; HELLINGER DISTANCE; IMBALANCED DATA; PARAMETER SELECTION; ROBUST TESTS; SAMPLING METHOD; SAMPLING TECHNIQUE; SPLITTING CRITERION; STATISTICAL SIGNIFICANCE

EID: 84856621489     PISSN: 13845810     EISSN: None     Source Type: Journal    
DOI: 10.1007/s10618-011-0222-1     Document Type: Article
Times cited: 194

References (41)
  • 1
    • 0033570831
    • Combined 5 × 2cv F test for comparing supervised classification learning algorithms
    • 10.1162/089976699300016007
    • E Alpaydin 1999 Combined 5 × 2cv F test for comparing supervised classification learning algorithms Neural Comput 11 8 1885 1892 10.1162/089976699300016007
    • (1999) Neural Comput , vol.11 , Issue.8 , pp. 1885-1892
    • Alpaydin, E.1
  • 4
    • 27144531570
    • A study of the behavior of several methods for balancing machine learning training data
    • 10.1145/1007730.1007735
    • G Batista R Prati M Monard 2004 A study of the behavior of several methods for balancing machine learning training data SIGKDD Explor 6 1 20 29 10.1145/1007730.1007735
    • (2004) SIGKDD Explor , vol.6 , Issue.1 , pp. 20-29
    • Batista, G.1    Prati, R.2    Monard, M.3
  • 5
    • 0030211964
    • Bagging predictors
    • 0858.68080 1425957
    • L Breiman 1996 Bagging predictors Mach Learn 24 2 123 140 0858.68080 1425957
    • (1996) Mach Learn , vol.24 , Issue.2 , pp. 123-140
    • Breiman, L.1
  • 6
    • 33947238038
    • Rejoinder to the paper 'Arcing Classifiers' by Leo Breiman
    • 1635406
    • L Breiman 1998 Rejoinder to the paper 'Arcing Classifiers' by Leo Breiman Ann Stat 26 2 841 849 1635406
    • (1998) Ann Stat , vol.26 , Issue.2 , pp. 841-849
    • Breiman, L.1
  • 7
    • 0035478854
    • Random forests
    • 10.1023/A:1010933404324 1007.68152
    • L Breiman 2001 Random forests Mach Learn 45 1 5 32 10.1023/A:1010933404324 1007.68152
    • (2001) Mach Learn , vol.45 , Issue.1 , pp. 5-32
    • Breiman, L.1
  • 10
    • 68549121111
    • C4.5 and imbalanced data sets: Investigating the effect of sampling method, probabilistic estimate, and decision tree structure
    • Washington, DC, USA
    • Chawla NV (2003) C4.5 and imbalanced data sets: investigating the effect of sampling method, probabilistic estimate, and decision tree structure. In: ICML workshop on learning from imbalanced data sets II. Washington, DC, USA, pp 1-8
    • (2003) ICML Workshop on Learning from Imbalanced Data Sets II , pp. 1-8
    • Chawla, N.V.1
  • 12
    • 27144549260
    • Editorial: Learning from imbalanced datasets
    • 10.1145/1007730.1007733
    • NV Chawla N Japkowicz A Kolcz 2004 Editorial: learning from imbalanced datasets SIGKDD Explor 6 1 16 10.1145/1007730.1007733
    • (2004) SIGKDD Explor , vol.6 , pp. 1-16
    • Chawla, N.V.1    Japkowicz, N.2    Kolcz, A.3
  • 13
    • 50549101751
    • Automatically countering imbalance and its empirical relationship to cost
    • 10.1007/s10618-008-0087-0 2434765
    • NV Chawla DA Cieslak LO Hall A Joshi 2008 Automatically countering imbalance and its empirical relationship to cost Data Min Knowl Discov 17 2 225 252 10.1007/s10618-008-0087-0 2434765
    • (2008) Data Min Knowl Discov , vol.17 , Issue.2 , pp. 225-252
    • Chawla, N.V.1    Cieslak, D.A.2    Hall, L.O.3    Joshi, A.4
  • 15
    • 44649133282
    • Analyzing classifier performance on imbalanced datasets when training and testing distributions differ
    • Osaka, Japan
    • Cieslak DA, Chawla NV (2008b) Analyzing classifier performance on imbalanced datasets when training and testing distributions differ. In: Pacific-Asia conference on knowledge discovery and data mining (PAKDD). Osaka, Japan, pp 519-526
    • (2008) Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD) , pp. 519-526
    • Cieslak, D.A.1    Chawla, N.V.2
  • 16
    • 29644438050
    • Statistical comparisons of classifiers over multiple data sets
    • 1222.68184 2274360
    • J Demšar 2006 Statistical comparisons of classifiers over multiple data sets J Mach Learn Res 7 1 30 1222.68184 2274360
    • (2006) J Mach Learn Res , vol.7 , pp. 1-30
    • Demšar, J.1
  • 17
    • 0000259511
    • Approximate statistical tests for comparing supervised classification learning algorithms
    • 10.1162/089976698300017197
    • TG Dietterich 1998 Approximate statistical tests for comparing supervised classification learning algorithms Neural Comput 10 7 1895 1923 10.1162/089976698300017197
    • (1998) Neural Comput , vol.10 , Issue.7 , pp. 1895-1923
    • Dietterich, T.G.1
  • 18
    • 0034250160
    • An experimental comparison of three methods for constructing ensembles of decision trees: Bagging, boosting, and randomization
    • 10.1023/A:1007607513941
    • T Dietterich 2000 An experimental comparison of three methods for constructing ensembles of decision trees: bagging, boosting, and randomization Mach Learn 40 2 139 157 10.1023/A:1007607513941
    • (2000) Mach Learn , vol.40 , Issue.2 , pp. 139-157
    • Dietterich, T.1
  • 20
    • 0004708854
    • Exploiting the cost (in)sensitivity of decision tree splitting criteria
    • Stanford University, California, USA
    • Drummond C, Holte R (2000) Exploiting the cost (in)sensitivity of decision tree splitting criteria. In: International conference on machine learning (ICML). Stanford University, California, USA, pp 239-246
    • (2000) International Conference on Machine Learning (ICML) , pp. 239-246
    • Drummond, C.1    Holte, R.2
  • 21
    • 34547993162
    • C4.5, Class imbalance, and cost sensitivity: Why under-sampling beats over-sampling
    • Washington, DC, USA
    • Drummond C, Holte R (2003) C4.5, Class imbalance, and cost sensitivity: why under-sampling beats over-sampling. In: ICML workshop on learning from imbalanced datasets II. Washington, DC, USA, pp 1-8
    • (2003) ICML Workshop on Learning from Imbalanced Datasets II , pp. 1-8
    • Drummond, C.1    Holte, R.2
  • 22
    • 1942421135
    • The geometry of ROC space: Understanding machine learning metrics through ROC isometrics
    • Washington, DC, USA
    • Flach PA (2003) The geometry of ROC space: understanding machine learning metrics through ROC isometrics. In: International conference on machine learning (ICML). Washington, DC, USA, pp 194-201
    • (2003) International Conference on Machine Learning (ICML) , pp. 194-201
    • Flach, P.A.1
  • 24
    • 0001837148
    • A comparison of alternative tests of significance for the problem of m rankings
    • 10.1214/aoms/1177731944 0063.01455
    • M Friedman 1940 A comparison of alternative tests of significance for the problem of m rankings Ann Math Stat 11 1 86 92 10.1214/aoms/1177731944 0063.01455
    • (1940) Ann Math Stat , vol.11 , Issue.1 , pp. 86-92
    • Friedman, M.1
  • 25
    • 0004207439
    • Van Nostrand and Co. Princeton 0040.16802
    • Halmos P (1950) Measure theory. Van Nostrand and Co., Princeton
    • (1950) Measure Theory
    • Halmos, P.1
  • 26
    • 0003562954
    • A simple generalisation of the area under the ROC curve for multiple class classification problems
    • 10.1023/A:1010920819831 1007.68180
    • D Hand R Till 2001 A simple generalisation of the area under the ROC curve for multiple class classification problems Mach Learn 45 171 186 10.1023/A:1010920819831 1007.68180
    • (2001) Mach Learn , vol.45 , pp. 171-186
    • Hand, D.1    Till, R.2
  • 28
    • 0032139235
    • The random subspace method for constructing decision forests
    • 10.1109/34.709601
    • T Ho 1998 The random subspace method for constructing decision forests IEEE Trans Pattern Anal Mach Intell 20 8 832 844 10.1109/34.709601
    • (1998) IEEE Trans Pattern Anal Mach Intell , vol.20 , Issue.8 , pp. 832-844
    • Ho, T.1
  • 29
    • 0002294347
    • A simple sequentially rejective multiple test procedure
    • 0402.62058 538597
    • S Holm 1979 A simple sequentially rejective multiple test procedure Scand J Stat 6 2 65 70 0402.62058 538597
    • (1979) Scand J Stat , vol.6 , Issue.2 , pp. 65-70
    • Holm, S.1
  • 31
    • 65249157560
    • The divergence and Bhattacharyya distance measures in signal selection
    • 10.1109/TCOM.1967.1089532
    • T Kailath 1967 The divergence and Bhattacharyya distance measures in signal selection IEEE Trans Commun 15 1 52 60 10.1109/TCOM.1967.1089532
    • (1967) IEEE Trans Commun , vol.15 , Issue.1 , pp. 52-60
    • Kailath, T.1
  • 32
    • 0001972236
    • Addressing the curse of imbalanced training sets: One-sided selection
    • Nashville, Tennessee, USA
    • Kubat M, Matwin S (1997) Addressing the curse of imbalanced training sets: one-sided selection. In: International conference on machine learning (ICML). Nashville, Tennessee, USA, pp 179-186
    • (1997) International Conference on Machine Learning (ICML) , pp. 179-186
    • Kubat, M.1    Matwin, S.2
  • 33
    • 85045184435
    • Estimating divergence functionals and the likelihood ratio by penalized convex risk minimization
    • Vancouver, BC, Canada
    • Nguyen X, Wainwright MJ, Jordan MI (2007) Estimating divergence functionals and the likelihood ratio by penalized convex risk minimization. In: Advances in neural information processing systems (NIPS). Vancouver, BC, Canada, pp 1-8
    • (2007) Advances in Neural Information Processing Systems (NIPS) , pp. 1-8
    • Nguyen, X.1    Wainwright, M.J.2    Jordan, M.I.3
  • 34
    • 0042346121
    • Tree induction for probability-based ranking
    • 10.1023/A:1024099825458 1039.68105
    • F Provost P Domingos 2003 Tree induction for probability-based ranking Mach Learn 52 3 199 215 10.1023/A:1024099825458 1039.68105
    • (2003) Mach Learn , vol.52 , Issue.3 , pp. 199-215
    • Provost, F.1    Domingos, P.2
  • 35
    • 33744584654
    • Induction of decision trees
    • R Quinlan 1986 Induction of decision trees Mach Learn 1 81 106
    • (1986) Mach Learn , vol.1 , pp. 81-106
    • Quinlan, R.1
  • 36
    • 0002618996
    • A review of canonical coordinates and an alternative to correspondence analysis using Hellinger distance
    • 1167.62421
    • C Rao 1995 A review of canonical coordinates and an alternative to correspondence analysis using Hellinger distance Questiio (Quaderns d'Estadistica i Investig Oper) 19 23 63 1167.62421
    • (1995) Questiio (Quaderns d'Estadistica i Investig Oper) , vol.19 , pp. 23-63
    • Rao, C.1
  • 37
    • 0033281701
    • Improved boosting algorithms using confidence-rated predictions
    • 10.1023/A:1007614523901 0945.68194
    • R Schapire Y Singer 1999 Improved boosting algorithms using confidence-rated predictions Mach Learn 37 297 336 10.1023/A:1007614523901 0945.68194
    • (1999) Mach Learn , vol.37 , pp. 297-336
    • Schapire, R.1    Singer, Y.2
  • 39
    • 0345045736
    • A quantification of distance-bias between evaluation metrics in classification
    • Stanford University, California, USA
    • Vilalta R, Oblinger D (2000) A quantification of distance-bias between evaluation metrics in classification. In: International conference on machine learning (ICML). Stanford University, California, USA, pp 1087-1094
    • (2000) International Conference on Machine Learning (ICML) , pp. 1087-1094
    • Vilalta, R.1    Oblinger, D.2
  • 40
    • 77649273505
    • Cog: Local decomposition for rare class analysis
    • 10.1007/s10618-009-0146-1 2596456
    • J Wu H Xiong J Chen 2010 Cog: local decomposition for rare class analysis Data Min Knowl Discov 20 191 220 10.1007/s10618-009-0146-1 2596456
    • (2010) Data Min Knowl Discov , vol.20 , pp. 191-220
    • Wu, J.1    Xiong, H.2    Chen, J.3
  • 41
    • 0003259364
    • Obtaining calibrated probability estimates from decision trees and naive Bayesian classifiers
    • Morgan Kaufmann, San Francisco, CA, USA, 2001
    • Zadrozny B, Elkan C (2001) Obtaining calibrated probability estimates from decision trees and naive Bayesian classifiers. In: Proceedings of the 18th international conference on machine learning. Morgan Kaufmann, San Francisco, CA, USA, 2001, pp. 609-616
    • (2001) Proceedings of the 18th International Conference on Machine Learning , pp. 609-616
    • Zadrozny, B.1    Elkan, C.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.