SCOPUS information retrieval platform

Volume 44, Issue 4, 2015, Pages 467-508

Dealing with the evaluation of supervised classification algorithms

(3) Santafe, Guzman a; Inza, Iñaki b; Lozano, Jose A. b

b UNIVERSITY OF THE BASQUE COUNTRY UPV EHU (Spain)

Author keywords

Classification algorithms comparison; Classifier evaluation; Estimation methods; Quality measures; Supervised classification

Indexed keywords

ALGORITHMS; QUALITY CONTROL;

CLASSIFICATION ALGORITHM; CLASSIFIER EVALUATION; ESTIMATION METHODS; QUALITY MEASURES; SUPERVISED CLASSIFICATION;

CLASSIFICATION (OF INFORMATION);

EID: 84945479662 PISSN: 02692821 EISSN: None Source Type: Journal
DOI: 10.1007/s10462-015-9433-y Document Type: Article

Times cited : (96)

References (132)

1
- 24044435942
- Reducing multiclass to binary: A unifying approach for margin classifiers
- Allwein EL, Schapire RE, Singer Y (2001) Reducing multiclass to binary: A unifying approach for margin classifiers. J Mach Learn Res 1(2):113–141
- (2001) J Mach Learn Res , vol.1 , Issue.2 , pp. 113-141
- Allwein, E.L.¹ Schapire, R.E.² Singer, Y.³

2
- 84945478546
- Anagnostopoulos C, Hand DJ (2012) hmeasure: the H-measure and other scalar classification performance metrics. R package version 1.0
- Anagnostopoulos C, Hand DJ (2012) hmeasure: the H-measure and other scalar classification performance metrics. http://CRAN.R-project.org/package=hmeasure, R package version 1.0

3
- 0033220735
- Measure-based classifier performance evaluation
- Andersson A, Davidsson P, Linén J (1999) Measure-based classifier performance evaluation. Pattern Recognit Lett 20(11–13):1165–1173
- (1999) Pattern Recognit Lett , vol.20 , Issue.11-13 , pp. 1165-1173
- Andersson, A.¹ Davidsson, P.² Linén, J.³

4
- 77950806386
- A new performance measure for class imbalance learning. Application to bioinformatics problems. In: Proceedings of the 26th international conference on machine learning and applications
- Batuwita R, Palade V (2009) A new performance measure for class imbalance learning. Application to bioinformatics problems. In: Proceedings of the 26th international conference on machine learning and applications, pp 545–550
- (2009) pp 545–550
- Batuwita, R.¹ Palade, V.²

5
- 84925604888
- No unbiased estimator of the variance of k-fold cross-validation
- Bengio Y, Grandvalet Y (2004) No unbiased estimator of the variance of k-fold cross-validation. J Mach Learn Res 5:1089–1105
- (2004) J Mach Learn Res , vol.5 , pp. 1089-1105
- Bengio, Y.¹ Grandvalet, Y.²

6
- 84863051852
- Bias in estimating the variance of k-fold cross-validation
- Duchesne P, Rémillard B, (eds), Springer, Berlin
- Bengio Y, Grandvalet Y (2005) Bias in estimating the variance of k-fold cross-validation. In: Duchesne P, Rémillard B (eds) Statistical modeling and analysis for complex data problems, chap 5. Springer, Berlin, pp 75–95
- (2005) Statistical modeling and analysis for complex data problems, chap 5 , pp. 75-95
- Bengio, Y.¹ Grandvalet, Y.²

7
- 84877652134
- Significance tests or confidence intervals: which are preferable for the comparison of classifiers?
- Berrar D, Lozano JA (2013) Significance tests or confidence intervals: which are preferable for the comparison of classifiers? J Exp Theor Artif Intell 25(2):189–206
- (2013) J Exp Theor Artif Intell , vol.25 , Issue.2 , pp. 189-206
- Berrar, D.¹ Lozano, J.A.²

8
- 14544275490
- Estimating replicability of classifier learning experiments. In: Brodley CE (ed) Proceedings of the 21st international conference on machine learning
- Bouckaert RR (2004) Estimating replicability of classifier learning experiments. In: Brodley CE (ed) Proceedings of the 21st international conference on machine learning. ACM
- (2004) ACM
- Bouckaert, R.R.¹

9
- 7444237797
- Evaluating the replicability of significance tests for comparing learning algorithms. In: Proceedings of the 8th Pacific-Asia conference on knowledge discovery and data mining
- Bouckaert RR, Frank E (2004) Evaluating the replicability of significance tests for comparing learning algorithms. In: Proceedings of the 8th Pacific-Asia conference on knowledge discovery and data mining, pp 3–12
- (2004) pp 3–12
- Bouckaert, R.R.¹ Frank, E.²

10
- 84886486483
- Area under the precision-recall curve: point estimates and confidence intervals
- Boyd K, Eng KH, Page CD (2013) Area under the precision-recall curve: point estimates and confidence intervals. In: Machine learning and knowledge discovery in databases. ECML PKDD 2013, Part III, pp 451–466
- (2013) Machine learning and knowledge discovery in databases. ECML PKDD 2013, Part III , pp. 451-466
- Boyd, K.¹ Eng, K.H.² Page, C.D.³

11
- 0031191630
- The use of the area under the ROC curve in the evaluation of machine learning algorithms
- Bradley A (1997) The use of the area under the ROC curve in the evaluation of machine learning algorithms. Pattern Recognit 30(7):1145–1159
- (1997) Pattern Recognit , vol.30 , Issue.7 , pp. 1145-1159
- Bradley, A.¹

12
- 2142758118
- Bolstered error estimation
- Braga-Neto U, Dougherty E (2004) Bolstered error estimation. Pattern Recognit 37(6):1267–1281
- (2004) Pattern Recognit , vol.37 , Issue.6 , pp. 1267-1281
- Braga-Neto, U.¹ Dougherty, E.²

13
- 84945437550
- On the effect of data set size on bias and variance in classification learning. In: Proceedings of the 4th Australian knowledge acquisition workshop
- Brain D, Webb GI (1999) On the effect of data set size on bias and variance in classification learning. In: Proceedings of the 4th Australian knowledge acquisition workshop, pp 117–128
- (1999) pp 117–128
- Brain, D.¹ Webb, G.I.²

14
- 84864854508
- The need for low bias algorithms in classification learning from large data sets. In: Proceedings of the 16th European conference principles of data mining and knowledge discovery
- Brain D, Webb GI (2002) The need for low bias algorithms in classification learning from large data sets. In: Proceedings of the 16th European conference principles of data mining and knowledge discovery, pp 62–73
- (2002) pp 62–73
- Brain, D.¹ Webb, G.I.²

15
- 0003010182
- Verification of forecasts expressed in terms of probability
- Brier GW (1950) Verification of forecasts expressed in terms of probability. Monthly Weather Rev 78:1–3
- (1950) Monthly Weather Rev , vol.78 , pp. 1-3
- Brier, G.W.¹

16
- 84884946129
- Density-preserving sampling: robust and efficient alternative to cross-validation for error estimation
- Budka M (2013) Density-preserving sampling: robust and efficient alternative to cross-validation for error estimation. IEEE Trans Neural Netw Learn Syst 24(1):22–34
- (2013) IEEE Trans Neural Netw Learn Syst , vol.24 , Issue.1 , pp. 22-34
- Budka, M.¹

17
- 0000354976
- A comparative study of ordinary cross-validation, v-fold cross-validation and the repeated learning-testing methods
- Burman P (1989) A comparative study of ordinary cross-validation, v-fold cross-validation and the repeated learning-testing methods. Biometrika 76(3):503–514
- (1989) Biometrika , vol.76 , Issue.3 , pp. 503-514
- Burman, P.¹

18
- 84945447144
- Lambert Academic Publishing, Saarbrücken
- Calvo B (2010) Positive unlabeled learning with applications in computational biology. Lambert Academic Publishing, Saarbrücken
- (2010) Positive unlabeled learning with applications in computational biology
- Calvo, B.¹

19
- 27144549260
- Editorial: Special issue on learning from imbalanced data sets
- Chawla NV, Japkowicz N (2004) Editorial: Special issue on learning from imbalanced data sets. ACM SIGKDD Explor Newslett 6(1):2000–2004
- (2004) ACM SIGKDD Explor Newslett , vol.6 , Issue.1 , pp. 2000-2004
- Chawla, N.V.¹ Japkowicz, N.²

20
- 0039802908
- The earth is round (p < .05)
- Cohen J (1994) The earth is round (p < .05). Am Psychol 49:997–1003
- (1994) Am Psychol , vol.49 , pp. 997-1003
- Cohen, J.¹

21
- 84897965802
- AUC optimization vs. error rate minimization. In: Proceedings of the 16th advances in neural information processing systems conference
- Cortes C, Mohri M (2004) AUC optimization vs. error rate minimization. In: Proceedings of the 16th advances in neural information processing systems conference, p 313

22
- 0004086493
- Duxbury Thomson Learning, Pacific Grove
- Daniel WW (1990) Applied nonparametric statistics. Duxbury Thomson Learning, Pacific Grove
- (1990) Applied nonparametric statistics
- Daniel, W.W.¹

23
- 33749249600
- The relationship between precision-recall and ROC curves. In: Proceedings of the 23rd international conference on machine learning
- Davis J, Goadrich M (2006) The relationship between precision-recall and ROC curves. In: Proceedings of the 23rd international conference on machine learning, pp 233–240
- (2006) pp 233–240
- Davis, J.¹ Goadrich, M.²

24
- 0003484780
- Cambridge University Press, Cambridge
- Davison A, Hinkley D (1997) Bootstrap methods and their application. Cambridge University Press, Cambridge
- (1997) Bootstrap methods and their application
- Davison, A.¹ Hinkley, D.²

25
- 0001740042
- Calibration-based empirical probability
- Dawid A (1985) Calibration-based empirical probability. Ann Stat 13(4):1251–1274
- (1985) Ann Stat , vol.13 , Issue.4 , pp. 1251-1274
- Dawid, A.¹

26
- 29644438050
- Statistical comparisons of classifiers over multiple data sets
- Demsar J (2006) Statistical comparisons of classifiers over multiple data sets. J Mach Learn Res 7:1–30
- (2006) J Mach Learn Res , vol.7 , pp. 1-30
- Demsar, J.¹

27
- 84992088692
- On the appropriateness of statistical tests in machine learning
- Demsar J (2008) On the appropriateness of statistical tests in machine learning. In: 3rd workshop on evaluation methods for machine learning
- (2008) In: 3rd workshop on evaluation methods for machine learning
- Demsar, J.¹

28
- 3042555481
- An alternative to null-hypothesis significance tests
- Denis DJ (2003) An alternative to null-hypothesis significance tests. Theory Sci 4(1)
- (2003) Theory Sci , vol.4 , Issue.1
- Denis, D.J.¹

29
- 79551493545
- Maximum likelihood in cost-sensitive learning: model specification, approximations, and upper bounds
- Dmochowski JP, Sajda P, Parra LC (2010) Maximum likelihood in cost-sensitive learning: model specification, approximations, and upper bounds. J Mach Learn Res 11:3313–3332
- (2010) J Mach Learn Res , vol.11 , pp. 3313-3332
- Dmochowski, J.P.¹ Sajda, P.² Parra, L.C.³

30
- 84945463099
- Machine learning as an experimental science (revisited)
- Drummond C (2006) Machine learning as an experimental science (revisited). In: Proceedings of the 1st workshop on evaluation methods for machine learning
- (2006) In: Proceedings of the 1st workshop on evaluation methods for machine learning
- Drummond, C.¹

31
- 84877682028
- Finding a balance between anarchy and orthodoxy
- Drummond C (2008) Finding a balance between anarchy and orthodoxy. In: Proceedings of the 3rd workshop on evaluation methods for machine learning
- (2008) In: Proceedings of the 3rd workshop on evaluation methods for machine learning
- Drummond, C.¹

32
- 33748991193
- Cost curves: an improved method for visualizing classifier performance
- Drummond C, Holte RC (2006) Cost curves: an improved method for visualizing classifier performance. Mach Learn 65(1):95–130
- (2006) Mach Learn , vol.65 , Issue.1 , pp. 95-130
- Drummond, C.¹ Holte, R.C.²

33
- 77951200774
- Warning: Statistical benchmarking is addictive. Kicking the habit in machine learning
- Drummond C, Japkowicz N (2010) Warning: Statistical benchmarking is addictive. Kicking the habit in machine learning. J Exp Theor Artif Intell 22(1):67–80
- (2010) J Exp Theor Artif Intell , vol.22 , Issue.1 , pp. 67-80
- Drummond, C.¹ Japkowicz, N.²

34
- 0002344794
- Bootstrap methods: another look at the jackknife
- Efron B (1979) Bootstrap methods: another look at the jackknife. Ann Stat 7(1):1–26
- (1979) Ann Stat , vol.7 , Issue.1 , pp. 1-26
- Efron, B.¹

35
- 0003242435
- The jackknife, the bootstrap and other resampling plans
- Efron B (1982) The jackknife, the bootstrap and other resampling plans. Soc Ind Appl Math
- (1982) Soc Ind Appl Math
- Efron, B.¹

36
- 84950461478
- Estimating the error rate of a prediction rule: improvement on cross-validation
- Efron B (1983) Estimating the error rate of a prediction rule: improvement on cross-validation. J Am Stat Assoc 78(382):316–331
- (1983) J Am Stat Assoc , vol.78 , Issue.382 , pp. 316-331
- Efron, B.¹

37
- 84964203940
- Bootstrap methods for standard errors, confidence intervals, and other measures of statistical accuracy
- Efron B, Tibshirani R (1986) Bootstrap methods for standard errors, confidence intervals, and other measures of statistical accuracy. Statistics 1(1):54–77
- (1986) Statistics , vol.1 , Issue.1 , pp. 54-77
- Efron, B.¹ Tibshirani, R.²

38
- 0003991665
- Chapman & Hall, London
- Efron B, Tibshirani R (1993) An Introduction to the Bootstrap. Chapman & Hall, London
- (1993) An Introduction to the Bootstrap
- Efron, B.¹ Tibshirani, R.²

39
- 0031536511
- Improvements on cross-validation: the 632+ bootstrap method
- Efron B, Tibshirani R (1997) Improvements on cross-validation: the 632+ bootstrap method. J Am Stat Assoc 92(438):548–560
- (1997) J Am Stat Assoc , vol.92 , Issue.438 , pp. 548-560
- Efron, B.¹ Tibshirani, R.²

40
- 0031239175
- Robustness metrics for measuring the influence of additive noise on the performance of statistical classifiers
- Egmont-Petersen M, Talmon JL, Hasman A (1997) Robustness metrics for measuring the influence of additive noise on the performance of statistical classifiers. Int J Med Inform 46:103–112
- (1997) Int J Med Inform , vol.46 , pp. 103-112
- Egmont-Petersen, M.¹ Talmon, J.L.² Hasman, A.³

41
- 84945456852
- A framework for measuring classification difference with imbalance
- Elazmeh W, Japkowicz N, Matwin S (2006) A framework for measuring classification difference with imbalance. In: Proceedings of the 1st workshop on evaluation methods for machine learning
- (2006) In: Proceedings of the 1st workshop on evaluation methods for machine learning
- Elazmeh, W.¹ Japkowicz, N.² Matwin, S.³

42
- 84867577175
- The foundations of cost-sensitive learning
- Elkan C (2001) The foundations of cost-sensitive learning. In: Proceedings of the 4th international joint conference on artificial intelligence, vol 17, pp 973–978
- (2001) Proceedings of the 4th international joint conference on artificial intelligence , vol.17 , pp. 973-978
- Elkan, C.¹

43
- 33646023117
- An introduction to ROC analysis
- Fawcett T (2006) An introduction to ROC analysis. Pattern Recognit Lett 27(8):861–874
- (2006) Pattern Recognit Lett , vol.27 , Issue.8 , pp. 861-874
- Fawcett, T.¹

44
- 70349280929
- An experimental comparison of performance measures for classification
- Ferri C, Hernández-Orallo R, Modroiu R (2009) An experimental comparison of performance measures for classification. Pattern Recognit Lett 30:27–38
- (2009) Pattern Recognit Lett , vol.30 , pp. 27-38
- Ferri, C.¹ Hernández-Orallo, R.² Modroiu, R.³

45
- 21144459575
- On a monotonicity problem in step-down multiple test procedures
- Finner H (1993) On a monotonicity problem in step-down multiple test procedures. J Am Stat Assoc 88:920–923
- (1993) J Am Stat Assoc , vol.88 , pp. 920-923
- Finner, H.¹

46
- 0003459504
- Hafner publishing Co, New York
- Fisher RA (1937) Statistical methods and scientific inference. Hafner publishing Co, New York
- (1937) Statistical methods and scientific inference
- Fisher, R.A.¹

47
- 21744462998
- On bias, variance, 0/1 loss, and the curse-of-dimensionality
- Friedman JH (1997) On bias, variance, 0/1 loss, and the curse-of-dimensionality. Data Min Knowl Discov 1:55–77
- (1997) Data Min Knowl Discov , vol.1 , pp. 55-77
- Friedman, J.H.¹

48
- 0001837148
- A comparison of alternative tests of significance for the problem of m rankings
- Friedman M (1940) A comparison of alternative tests of significance for the problem of m rankings. Ann Math Stat 11:86–92
- (1940) Ann Math Stat , vol.11 , pp. 86-92
- Friedman, M.¹

49
- 79951551258
- Estimation of prediction error by using k-fold cross-validation
- Fushiki T (2011) Estimation of prediction error by using k-fold cross-validation. Stat Comput 21(2):137–146
- (2011) Stat Comput , vol.21 , Issue.2 , pp. 137-146
- Fushiki, T.¹

50
- 79953051509
- An overview of ensemble methods for binary classifiers in multi-class problems: experimental study on one-vs-one and one-vs-all schemes
- Galar M, Fernández A, Barrenechea E, Bustince H, Herrera F (2011) An overview of ensemble methods for binary classifiers in multi-class problems: experimental study on one-vs-one and one-vs-all schemes. Pattern Recognit 44:1761–1776
- (2011) Pattern Recognit , vol.44 , pp. 1761-1776
- Galar, M.¹ Fernández, A.² Barrenechea, E.³ Bustince, H.⁴ Herrera, F.⁵

51
- 85020715458
- Chapman and Hall/CRC, London
- Gama J (2010) Knowledge Discovery from Data Streams. Chapman and Hall/CRC, London
- (2010) Knowledge Discovery from Data Streams
- Gama, J.¹

52
- 70350664414
- Issues in evaluation of stream learning algorithms. In: Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining
- Gama J, Sebastiao R, Pereira Rodrigues P (2009) Issues in evaluation of stream learning algorithms. In: Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining, pp 329–338
- (2009) pp 329–338
- Gama, J.¹ Sebastiao, R.² Pereira Rodrigues, P.³

53
- 58149287952
- An extension on statistical comparisons of classifiers over multiple data sets for all pairwise comparisons
- Garcia S, Herrera F (2008) An extension on statistical comparisons of classifiers over multiple data sets for all pairwise comparisons. J Mach Learn Res 9:2677–2694
- (2008) J Mach Learn Res , vol.9 , pp. 2677-2694
- Garcia, S.¹ Herrera, F.²

54
- 77549084648
- Advanced nonparametric tests for multiple comparisons in the design of experiments in computational intelligence and data mining: experimental analysis of power
- Garcia S, Fernandez A, Luengo J, Herrera F (2010a) Advanced nonparametric tests for multiple comparisons in the design of experiments in computational intelligence and data mining: experimental analysis of power. Inf Sci 180(10):2044–2064
- (2010) Inf Sci , vol.180 , Issue.10 , pp. 2044-2064
- Garcia, S.¹ Fernandez, A.² Luengo, J.³ Herrera, F.⁴

55
- 78149483936
- Theoretical analysis of a performance measure for imbalanced data. In: Proceedings of the 18th IEEE international conference on pattern recognition
- Garcia V, Mollineda RA, Sanchez JS (2010b) Theoretical analysis of a performance measure for imbalanced data. In: Proceedings of the 18th IEEE international conference on pattern recognition, pp 617–620
- (2010) pp 617–620
- Garcia, V.¹ Mollineda, R.A.² Sanchez, J.S.³

56
- 19644394949
- Likelihood ratios: A simple and flexible statistic for empirical psychologists
- Glover S, Dixon P (2004) Likelihood ratios: A simple and flexible statistic for empirical psychologists. Psychon Bull Rev 11(5):791–806
- (2004) Psychon Bull Rev , vol.11 , Issue.5 , pp. 791-806
- Glover, S.¹ Dixon, P.²

57
- 28444452344
- Permutation tests for classification: towards statistical significance in image-based studies
- Golland P, Fischl B (2003) Permutation tests for classification: towards statistical significance in image-based studies. In: Proceedings of the 18th international conference on information processing in medical imaging, vol 18, pp 330–341
- (2003) Proceedings of the 18th international conference on information processing in medical imaging , vol.18 , pp. 330-341
- Golland, P.¹ Fischl, B.²

58
- 26944451288
- Permutation tests for classification
- Golland P, Liang F, Mukherjee S, Panchenko D (2005) Permutation tests for classification. In: Proceedings of the 18th annual conference on learning theory, vol 18, pp 501–515
- (2005) Proceedings of the 18th annual conference on learning theory , vol.18 , pp. 501-515
- Golland, P.¹ Liang, F.² Mukherjee, S.³ Panchenko, D.⁴

59
- 34447464262
- Corroboration, explanation, evolving probability, simplicity, and a sharpened razor
- Good IJ (1968) Corroboration, explanation, evolving probability, simplicity, and a sharpened razor. Br J Philos Sci 19:123–143
- (1968) Br J Philos Sci , vol.19 , pp. 123-143
- Good, I.J.¹

60
- 0003504212
- Permutation test: a practical guide to resampling methods for testing hypotheses
- Good PI (2000) Permutation test: a practical guide to resampling methods for testing hypotheses. Springer
- (2000) Springer
- Good, P.I.¹

61
- 45749100408
- A dirty dozen: twelve p-value misconceptions
- Goodman S (2008) A dirty dozen: twelve p-value misconceptions. Semin Hematol 45(3):135–140
- (2008) Semin Hematol , vol.45 , Issue.3 , pp. 135-140
- Goodman, S.¹

62
- 77949462754
- Hypothesis testing for cross-validation
- Département d’informatique et recherche opérationnelle, Université de Montréal
- Grandvalet Y, Bengio Y (2006) Hypothesis testing for cross-validation. Tech. rep., Département d’informatique et recherche opérationnelle, Université de Montréal
- (2006) Tech. rep.
- Grandvalet, Y.¹ Bengio, Y.²

63
- 76749092270
- The WEKA data mining software: an update
- Hall M, Frank E, Holmes G, Pfahringer B, Reutemann P, Witten IH (2009) The WEKA data mining software: an update. SIGKDD Explor 11(1):10–18
- (2009) SIGKDD Explor , vol.11 , Issue.1 , pp. 10-18
- Hall, M.¹ Frank, E.² Holmes, G.³ Pfahringer, B.⁴ Reutemann, P.⁵ Witten, I.H.⁶

64
- 0038606580
- Misinterpretations of significance: A problem students share with their teachers
- Haller H, Krauss S (2002) Misinterpretations of significance: A problem students share with their teachers. Methods Psychol Res Online 7(1):1–20
- (2002) Methods Psychol Res Online , vol.7 , Issue.1 , pp. 1-20
- Haller, H.¹ Krauss, S.²

65
- 0000493076
- Reliability diagrams for multicategory probabilistic forecast
- Hamill TM (1996) Reliability diagrams for multicategory probabilistic forecast. Weather Forecast 12(4):736–741
- (1996) Weather Forecast , vol.12 , Issue.4 , pp. 736-741
- Hamill, T.M.¹

66
- 0001020401
- Recent advances in error rate estimation
- Hand DJ (1986) Recent advances in error rate estimation. Pattern Recognit Lett 4(5):335–346
- (1986) Pattern Recognit Lett , vol.4 , Issue.5 , pp. 335-346
- Hand, D.J.¹

67
- 21844516840
- Deconstructing statistical questions
- Hand DJ (1994) Deconstructing statistical questions. J R Stat Soc Ser A 157(3):317–356
- (1994) J R Stat Soc Ser A , vol.157 , Issue.3 , pp. 317-356
- Hand, D.J.¹

68
- 69549133517
- Measuring classifier performance: a coherent alternative to the area under the ROC curve
- Hand DJ (2009) Measuring classifier performance: a coherent alternative to the area under the ROC curve. Mach Learn 77:103–123
- (2009) Mach Learn , vol.77 , pp. 103-123
- Hand, D.J.¹

69
- 77953705269
- Evaluating diagnostic tests: the area under the ROC curve and the balance of errors
- Hand DJ (2010) Evaluating diagnostic tests: the area under the ROC curve and the balance of errors. Stat Med 29:1502–1510
- (2010) Stat Med , vol.29 , pp. 1502-1510
- Hand, D.J.¹

70
- 84873807222
- When is the area under the receiver operating characteristic curve an appropriate measure of classifier performance?
- Hand DJ, Anagnostopoulos C (2013) When is the area under the receiver operating characteristic curve an appropriate measure of classifier performance? Pattern Recognit Lett 34(5):492–495
- (2013) Pattern Recognit Lett , vol.34 , Issue.5 , pp. 492-495
- Hand, D.J.¹ Anagnostopoulos, C.²

71
- 84892916856
- A better Beta for the H measure of classification performance
- Hand DJ, Anagnostopoulos C (2014) A better Beta for the H measure of classification performance. Pattern Recognit Lett 40:41–46
- (2014) Pattern Recognit Lett , vol.40 , pp. 41-46
- Hand, D.J.¹ Anagnostopoulos, C.²

72
- 0003562954
- A simple generalisation of the area under the ROC curve for multiple class classification problems
- Hand DJ, Till RJ (2001) A simple generalisation of the area under the ROC curve for multiple class classification problems. Mach Learn 45:171–186
- (2001) Mach Learn , vol.45 , pp. 171-186
- Hand, D.J.¹ Till, R.J.²

73
- 0003684449
- Springer, Berlin
- Hastie T, Tibshirani R, Friedman J (2001) The elements of statistical learning. Springer, Berlin
- (2001) The elements of statistical learning
- Hastie, T.¹ Tibshirani, R.² Friedman, J.³

74
- 0022965475
- An improved sequentially rejective Bonferroni test procedure
- Holland BS, Copenhaver MD (1987) An improved sequentially rejective Bonferroni test procedure. Biometrics 43:417–423
- (1987) Biometrics , vol.43 , pp. 417-423
- Holland, B.S.¹ Copenhaver, M.D.²

75
- 0038047907
- Relation between permutation-test p values and classifier error estimates
- Hsing T, Attoor S, Dougherty E (2003) Relation between permutation-test p values and classifier error estimates. Mach Learn 52(1):11–30
- (2003) Mach Learn , vol.52 , Issue.1 , pp. 11-30
- Hsing, T.¹ Attoor, S.² Dougherty, E.³

76
- 0001750957
- Approximations of the critical region of the Friedman statistic
- Iman RL, Davenport JM (1980) Approximations of the critical region of the Friedman statistic. Commun Stat 9:571–595
- (1980) Commun Stat , vol.9 , pp. 571-595
- Iman, R.L.¹ Davenport, J.M.²

77
- 50349090268
- Cross-validation and bootstrapping are unreliable in small sample classification
- Isaksson A, Wallman M, Goransson H, Gustafsson M (2008) Cross-validation and bootstrapping are unreliable in small sample classification. Pattern Recognit Lett 29(14):1960–1965
- (2008) Pattern Recognit Lett , vol.29 , Issue.14 , pp. 1960-1965
- Isaksson, A.¹ Wallman, M.² Goransson, H.³ Gustafsson, M.⁴

78
- 49349117698
- Mining supervised classification performance studies: a meta-analytic investigation
- Jamain A, Hand DJ (2008) Mining supervised classification performance studies: a meta-analytic investigation. J Classif 25:87–112
- (2008) J Classif , vol.25 , pp. 87-112
- Jamain, A.¹ Hand, D.J.²

79
- 84945470092
- Why question machine learning evaluation methods (an illustrative review of the shortcomings of current methods)
- Japkowicz N (2006) Why question machine learning evaluation methods (an illustrative review of the shortcomings of current methods). In: Proceedings of the 1st workshop on evaluation methods for machine learning
- (2006) In: Proceedings of the 1st workshop on evaluation methods for machine learning
- Japkowicz, N.¹

80
- 77949493515
- Classifier evaluation: a need for better education and restructuring
- Japkowicz N (2008) Classifier evaluation: a need for better education and restructuring. In: Proceedings of the 3rd workshop on evaluation methods for machine learning
- (2008) In: Proceedings of the 3rd workshop on evaluation methods for machine learning
- Japkowicz, N.¹

81
- 84924514239
- Cambridge University Press, Cambridge
- Japkowicz N, Shah M (2011) Evaluating learning algorithms: a classification perspective. Cambridge University Press, Cambridge
- (2011) Evaluating learning algorithms: a classification perspective
- Japkowicz, N.¹ Shah, M.²

82
- 0040790800
- Confidence intervals vs. Bayesian intervals
- Jaynes ET (1976) Confidence intervals vs. Bayesian intervals. Found Probab Theory Stat Inference Stat Theor Sci 2:175–257
- (1976) Found Probab Theory Stat Inference Stat Theor Sci , vol.2 , pp. 175-257
- Jaynes, E.T.¹

83
- 0032808670
- The insignificance of statistical significance testing
- Johnson DH (1999) The insignificance of statistical significance testing. J Wildl Manag 63(3):763–772
- (1999) J Wildl Manag , vol.63 , Issue.3 , pp. 763-772
- Johnson, D.H.¹

84
- 84866702739
- Scalable active learning for multiclass image classification
- Joshi A, Porikli F, Papanikolopoulos NP (2012) Scalable active learning for multiclass image classification. IEEE Trans Pattern Anal Mach Intell 34(11):2259–2273
- (2012) IEEE Trans Pattern Anal Mach Intell , vol.34 , Issue.11 , pp. 2259-2273
- Joshi, A.¹ Porikli, F.² Papanikolopoulos, N.P.³

85
- 0038969996
- Mining needle in a haystack: classifying rare classes via two-phase rule induction. In: Proceedings of the 27th ACM SIGMOD international conference on management of data
- Joshi MV, Agarwal RC, Kumar V (2001) Mining needle in a haystack: classifying rare classes via two-phase rule induction. In: Proceedings of the 27th ACM SIGMOD international conference on management of data, pp 91–102
- (2001) pp 91–102
- Joshi, M.V.¹ Agarwal, R.C.² Kumar, V.³

86
- 85164392958
- A study of cross-validation and bootstrap for accuracy estimation and model selection. In: Proceedings of the 14th international joint conference on artificial intelligence
- Kohavi R (1995) A study of cross-validation and bootstrap for accuracy estimation and model selection. In: Proceedings of the 14th international joint conference on artificial intelligence, pp 1137–1143
- (1995) pp 1137–1143
- Kohavi, R.¹

87
- 0002872346
- Bias plus variance decomposition for zero-one loss functions
- Kohavi R, Wolpert DH (1996) Bias plus variance decomposition for zero-one loss functions. In: Saitta L (ed) Proceedings of the 13th international conference on machine learning, Morgan Kaufmann, pp 275–283
- (1996) Proceedings of the 13th international conference on machine learning, Morgan Kaufmann , pp. 275-283
- Kohavi, R.¹ Wolpert, D.H.² Saitta, L.³

88
- 84943709252
- Use of ranks in one-criterion variance analysis
- Kruskal W, Wallis WA (1952) Use of ranks in one-criterion variance analysis. J Am Stat Assoc 47(260):583–621
- (1952) J Am Stat Assoc , vol.47 , Issue.260 , pp. 583-621
- Kruskal, W.¹ Wallis, W.A.²

89
- 0031998121
- Machine learning for the detection of oil spills in satellite radar images
- Kubat M, Holte RC, Matwin S (1998) Machine learning for the detection of oil spills in satellite radar images. Mach Learn 30(2):195–215
- (1998) Mach Learn , vol.30 , Issue.2 , pp. 195-215
- Kubat, M.¹ Holte, R.C.² Matwin, S.³

90
- 84945495763
- Kuhn M (2015) Caret: classification and regression training. R package version 6.0-41
- Kuhn M (2015) Caret: classification and regression training. http://CRAN.R-project.org/package=caret, R package version 6.0-41

91
- 84919912244
- Bayesian comparison of machine learning algorithms on single and multiple datasets. In: Proceedings of the 15th international conference on artificial intelligence and statistics
- Lacoste A, Laviolette F, Marchand M (2012) Bayesian comparison of machine learning algorithms on single and multiple datasets. In: Proceedings of the 15th international conference on artificial intelligence and statistics, pp 665–675
- (2012) pp 665–675
- Lacoste, A.¹ Laviolette, F.² Marchand, M.³

92
- 0007318509
- The shrinkage of the coefficient of multiple correlation
- Larson SC (1931) The shrinkage of the coefficient of multiple correlation. J Educ Psychol 22:45–55
- (1931) J Educ Psychol , vol.22 , pp. 45-55
- Larson, S.C.¹

93
- 84930508557
- Master’s thesis: Blekinge Institute of Technology
- Lavesson N (2006) Evaluation of supervised learning algorithms and classifiers. Master’s thesis, Blekinge Institute of Technology
- (2006) Evaluation of supervised learning algorithms and classifiers
- Lavesson, N.¹

94
- 85161651554
- Data mining for direct marketing: Problems and solutions. In: Proceedings of the 4th international conference on knowledge discovery and data mining
- Ling CX, Li C (1998) Data mining for direct marketing: Problems and solutions. In: Proceedings of the 4th international conference on knowledge discovery and data mining, pp 73–79
- (1998) pp 73–79
- Ling, C.X.¹ Li, C.²

95
- 80052841314
- A tutorial on a practical Bayesian alternative to null-hypothesis significance testing
- Masson M (2011) A tutorial on a practical Bayesian alternative to null-hypothesis significance testing. Behav Res Methods 43(3):679–690
- (2011) Behav Res Methods , vol.43 , Issue.3 , pp. 679-690
- Masson, M.¹

96
- 0030765252
- Confidence intervals for differences in correlated binary proportions
- May WL, Johnson WD (1997) Confidence intervals for differences in correlated binary proportions. Stat Med 16(18):2127–2136
- (1997) Stat Med , vol.16 , Issue.18 , pp. 2127-2136
- May, W.L.¹ Johnson, W.D.²

97
- 0004027527
- Wiley, New York
- McLachlan G (1992) Discriminant analysis and statistical pattern recognition. Wiley, New York
- (1992) Discriminant analysis and statistical pattern recognition
- McLachlan, G.¹

98
- 80052714543
- A unifying view on dataset shift in classification
- Moreno-Torres JG, Raeder T, Alaiz-Rodríguez R, Chawla NV, Herrera F (2012a) A unifying view on dataset shift in classification. Pattern Recognit 45(1):521–530
- (2012) Pattern Recognit , vol.45 , Issue.1 , pp. 521-530
- Moreno-Torres, J.G.¹ Raeder, T.² Alaiz-Rodríguez, R.³ Chawla, N.V.⁴ Herrera, F.⁵

99
- 84876917722
- Study on the impact of partition-induced dataset shift on k-fold cross-validation
- Moreno-Torres JG, Sáez JA, Herrera F (2012b) Study on the impact of partition-induced dataset shift on k-fold cross-validation. IEEE Trans Neural Netw Learn Syst 23(8):1304–1312
- (2012) IEEE Trans Neural Netw Learn Syst , vol.23 , Issue.8 , pp. 1304-1312
- Moreno-Torres, J.G.¹ Sáez, J.A.² Herrera, F.³

100
- 0042847140
- Inference for the generalization error
- Nadeau C, Bengio Y (2003) Inference for the generalization error. Mach Learn 52(3):239–281
- (2003) Mach Learn , vol.52 , Issue.3 , pp. 239-281
- Nadeau, C.¹ Bengio, Y.²

101
- 0013152011
- Towards the personalization of algorithms evaluation in data mining
- Nakhaeizadeh G, Schnabl A (1998) Towards the personalization of algorithms evaluation in data mining. In: Proceedings of the 3rd international conference on knowledge discovery and data mining, pp 289–293
- (1998) In: Proceedings of the 3rd international conference on knowledge discovery and data mining , pp. 289-293
- Nakhaeizadeh, G.¹ Schnabl, A.²

102
- 77954676863
- Permutation tests for studying classifier performance
- Ojala M, Garriga GC (2010) Permutation tests for studying classifier performance. J Mach Learn Res 11:1833–1863
- (2010) J Mach Learn Res , vol.11 , pp. 1833-1863
- Ojala, M.¹ Garriga, G.C.²

103
- 84884973790
- Bootstrap analysis of multiple repetitions of experiments using an interval-valued multiple comparison procedure
- Otero J, Sánchez L, Couso I, Palacios A (2014) Bootstrap analysis of multiple repetitions of experiments using an interval-valued multiple comparison procedure. J Comput Syst Sci 80(1):88–100
- (2014) J Comput Syst Sci , vol.80 , Issue.1 , pp. 88-100
- Otero, J.¹ Sánchez, L.² Couso, I.³ Palacios, A.⁴

104
- 80053222008
- A survey on graphical methods for classification predictive performance evaluation
- Prati RC, Batista GEPA, Monard MC (2011) A survey on graphical methods for classification predictive performance evaluation. IEEE Trans Knowl Data Eng 23(11):1601–1618
- (2011) IEEE Trans Knowl Data Eng , vol.23 , Issue.11 , pp. 1601-1618
- Prati, R.C.¹ Batista, G.E.P.A.² Monard, M.C.³

105
- 0002900357
- The case against accuracy estimation for comparing induction algorithms. In: Proceedings of the 15th international conference on machine learning
- Provost F, Fawcett T, Kohavi R (1998) The case against accuracy estimation for comparing induction algorithms. In: Proceedings of the 15th international conference on machine learning, pp 445–453
- (1998) pp 445–453
- Provost, F.¹ Fawcett, T.² Kohavi, R.³

106
- 0024689801
- A critical investigation of recall and precision as measures of retrieval system performance
- Raghavan V, Bollmann P, Jung GS (1989) A critical investigation of recall and precision as measures of retrieval system performance. ACM Trans Inf Syst 7(3):205–229
- (1989) ACM Trans Inf Syst , vol.7 , Issue.3 , pp. 205-229
- Raghavan, V.¹ Bollmann, P.² Jung, G.S.³

107
- 34547372256
- Optimized precision–a new measure for classifier performance evaluation. In: Proceedings of the 23rd IEEE international conference on evolutionary computation
- Ranawana R, Palade V (2006) Optimized precision–a new measure for classifier performance evaluation. In: Proceedings of the 23rd IEEE international conference on evolutionary computation, pp 2254–2261
- (2006) pp 2254–2261
- Ranawana, R.¹ Palade, V.²

108
- 79951756007
- Consequences of variability in classifier performance estimates. In: Proceedings of the 10th IEEE international conference on data mining
- Raeder T, Hoens TR, Chawla NV (2010) Consequences of variability in classifier performance estimates. In: Proceedings of the 10th IEEE international conference on data mining, pp 421–430
- (2010) pp 421–430
- Raeder, T.¹ Hoens, T.R.² Chawla, N.V.³

109
- 56749117943
- In defense of one-vs-all classification
- Rifkin R, Klautau A (2004) In defense of one-vs-all classification. J Mach Learn Res 5:101–141
- (2004) J Mach Learn Res , vol.5 , pp. 101-141
- Rifkin, R.¹ Klautau, A.²

110
- 85008025524
- Sensitivity analysis of k-fold cross validation in prediction error estimation
- Rodríguez JD, Pérez A, Lozano JA (2010) Sensitivity analysis of k-fold cross validation in prediction error estimation. IEEE Trans Pattern Anal Mach Intell 32(3):569–575
- (2010) IEEE Trans Pattern Anal Mach Intell , vol.32 , Issue.3 , pp. 569-575
- Rodríguez, J.D.¹ Pérez, A.² Lozano, J.A.³

111
- 84870248309
- A general framework for the statistical analysis of the sources of variance for classification error estimators
- Rodríguez JD, Pérez A, Lozano JA (2013) A general framework for the statistical analysis of the sources of variance for classification error estimators. Pattern Recognit 46(3):855–864
- (2013) Pattern Recognit , vol.46 , Issue.3 , pp. 855-864
- Rodríguez, J.D.¹ Pérez, A.² Lozano, J.A.³

112
- 0001618721
- A sequentially rejective test procedure based on a modified Bonferroni inequality
- Rom DM (1990) A sequentially rejective test procedure based on a modified Bonferroni inequality. Biometrika 77:663–665
- (1990) Biometrika , vol.77 , pp. 663-665
- Rom, D.M.¹

113
- 0000130677
- The fallacy of the null-hypothesis significance test
- Rozeboom W (1960) The fallacy of the null-hypothesis significance test. Psychol Bull 57(5):416–428
- (1960) Psychol Bull , vol.57 , Issue.5 , pp. 416-428
- Rozeboom, W.¹

114
- 77957991523
- The ROC manifold for classification systems
- Schubert CM, Thorsen SN, Oxley ME (2011) The ROC manifold for classification systems. Pattern Recognit 44(2):350–362
- (2011) Pattern Recognit , vol.44 , Issue.2 , pp. 350-362
- Schubert, C.M.¹ Thorsen, S.N.² Oxley, M.E.³

115
- 11944262819
- Multiple hypothesis testing
- Shaffer JP (1995) Multiple hypothesis testing. Annu Rev Psychol 46:551–584
- (1995) Annu Rev Psychol , vol.46 , pp. 551-584
- Shaffer, J.P.¹

116
- 78651375098
- A survey of hierarchical classification across different application domains
- Silla CN, Freitas AA (2011) A survey of hierarchical classification across different application domains. Data Min Knowl Discov 22(1–2):31–72
- (2011) Data Min Knowl Discov , vol.22 , Issue.1-2 , pp. 31-72
- Silla, C.N.¹ Freitas, A.A.²

117
- 0001352933
- Some examples of discrimination
- Smith C (1947) Some examples of discrimination. Ann Eugen 13:272–282
- (1947) Ann Eugen , vol.13 , pp. 272-282
- Smith, C.¹

118
- 84945461319
- Beyond accuracy, f-score and ROC: a family of discriminant measures for performance evaluation. In: Proceedings of the 19th Australian joint conference on artificial intelligence: advances in artificial intelligence
- Sokolova M, Japkowicz N, Szpakowicz S (2006) Beyond accuracy, f-score and ROC: a family of discriminant measures for performance evaluation. In: Proceedings of the 19th Australian joint conference on artificial intelligence: advances in artificial intelligence, pp 1015–1021
- (2006) pp 1015–1021
- Sokolova, M.¹ Japkowicz, N.² Szpakowicz, S.³

119
- 0000629975
- Cross-validatory choice and assessment of statistical predictions (with discussion)
- Stone M (1974) Cross-validatory choice and assessment of statistical predictions (with discussion). J R Stat Soc Ser B 36:111–147
- (1974) J R Stat Soc Ser B , vol.36 , pp. 111-147
- Stone, M.¹

120
- 0017336301
- Asymptotics for and against cross-validation
- Stone M (1977) Asymptotics for and against cross-validation. Biometrika 64(1):29–35
- (1977) Biometrika , vol.64 , Issue.1 , pp. 29-35
- Stone, M.¹

121
- 67650706774
- Classification of imbalanced data: a review
- Sun Y, Wong AK, Kamel MS (2009) Classification of imbalanced data: a review. Int J Pattern Recognit Artif Intell 23(04):687
- (2009) Int J Pattern Recognit Artif Intell , vol.23 , Issue.4 , pp. 687
- Sun, Y.¹ Wong, A.K.² Kamel, M.S.³

122
- 25144439604
- Addison Wesley, Reading
- Tan P, Steinbach M, Kumar V (2006) Introduction to data mining. Addison Wesley, Reading
- (2006) Introduction to data mining
- Tan, P.¹ Steinbach, M.² Kumar, V.³

123
- 34748873053
- Multi-label classification: an overview
- Tsoumakas G, Katakis I (2007) Multi-label classification: an overview. Int J Data Wareh Min 3(3):1–13
- (2007) Int J Data Wareh Min , vol.3 , Issue.3 , pp. 1-13
- Tsoumakas, G.¹ Katakis, I.²

124
- 0004217877
- Butterworth-Heinemann, Oxford
- van Rijsbergen CJ (1979) Information retrieval. Butterworth-Heinemann, Oxford
- (1979) Information retrieval
- van Rijsbergen, C.J.¹

125
- 0038797944
- 9, Wiley, New York
- Webb AR (2002) Statistical pattern recognition, vol 9, 2nd edn. Wiley, New York
- (2002) Statistical pattern recognition
- Webb, A.R.¹

126
- 0034247206
- Multiboosting: a technique for combining boosting and wagging
- Webb G (2000) Multiboosting: a technique for combining boosting and wagging. Mach Learn 40(2):159–196
- (2000) Mach Learn , vol.40 , Issue.2 , pp. 159-196
- Webb, G.¹

127
- 84945466542
- Estimating bias and variance from data, Tech. rep
- Webb GI, Conilione P (2003) Estimating bias and variance from data. Tech. rep
- (2003) Estimating bias and variance from data
- Webb, G.I.¹ Conilione, P.²

128
- 20844458491
- Mining with rarity: a unifying framework
- Weiss GM (2004) Mining with rarity: a unifying framework. ACM SIGKDD Explor Newslett 6(1):7–19
- (2004) ACM SIGKDD Explor Newslett , vol.6 , Issue.1 , pp. 7-19
- Weiss, G.M.¹

129
- 0001884644
- Individual comparison by ranking methods
- Wilcoxon F (1945) Individual comparison by ranking methods. Biometrics 1(6):80–83
- (1945) Biometrics , vol.1 , Issue.6 , pp. 80-83
- Wilcoxon, F.¹

130
- 0000459353
- The lack of a priori distinctions between learning algorithms
- Wolpert DH (1996) The lack of a priori distinctions between learning algorithms. Neural Comput 8(7):1341–1390
- (1996) Neural Comput , vol.8 , Issue.7 , pp. 1341-1390
- Wolpert, D.H.¹

131
- 84857047820
- Iterative bias correction of the cross validation criterion
- Yanagihara H (2012) Iterative bias correction of the cross validation criterion. Scand J Stat 39(1):116–130
- (2012) Scand J Stat , vol.39 , Issue.1 , pp. 116-130
- Yanagihara, H.¹

132
- 0004232308
- Pearson Prentice Hall, Englewood Cliffs
- Zar JH (2010) Biostatistical analysis, 5th edn. Pearson Prentice Hall, Englewood Cliffs
- (2010) Biostatistical analysis
- Zar, J.H.¹

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.