Volume 4, Issue 2, 2004, Pages 211-255

Tree induction vs. Logistic regression: A learning-curve analysis

Author keywords

Decision trees; Learning curves; Logistic regression; ROC analysis; Tree induction

Indexed keywords

DECISION TREES; LEARNING CURVES; LEARNING-CURVE ANALYSIS; LOGISTIC REGRESSION;

EID: 1242268938     PISSN: 15324435     EISSN: None     Source Type: Journal    
DOI: 10.1162/153244304322972694     Document Type: Article
Times cited : (283)

References (68)
  • 1
    • 0016355478
    • A new look at statistical model identification
    • H. Akaike. A new look at statistical model identification. IEEE Transactions on Automatic Control, AU-19:716-722, 1974.
    • (1974) IEEE Transactions on Automatic Control , vol.AU-19 , pp. 716-722
    • Akaike, H.1
  • 2
    • 0032645080
    • An empirical comparison of voting classification algorithms: Bagging, boosting and variants
    • E. Bauer and R. Kohavi. An empirical comparison of voting classification algorithms: bagging, boosting and variants. Machine Learning, 36:105-142, 1999.
    • (1999) Machine Learning , vol.36 , pp. 105-142
    • Bauer, E.1    Kohavi, R.2
  • 3
    • 0003408496
    • UCI repository of machine learning databases
    • University of California, Irvine
    • C. Blake and C.J. Merz. UCI repository of machine learning databases. Technical Report, University of California, Irvine. Available electronically at http://www.ics.uci.edu/mlearn/MLRepository.html, 2000.
    • (2000) Technical Report
    • Blake, C.1    Merz, C.J.2
  • 5
    • 0003100554
    • Robustness in the strategy of scientific model building
    • eds. R.L. Launer and G.N. Wilkinson, Academic Press, New York
    • G.E.P. Box. Robustness in the strategy of scientific model building. In Robustness in Statistics, eds. R.L. Launer and G.N. Wilkinson, pages 201-236, Academic Press, New York, 1979.
    • (1979) Robustness in Statistics , pp. 201-236
    • Box, G.E.P.1
  • 7
    • 0031191630
    • The use of the area under the ROC curve in the evaluation of machine learning algorithms
    • A.P. Bradley. The use of the area under the ROC curve in the evaluation of machine learning algorithms. Pattern Recognition, 30:1145-1159, 1997.
    • (1997) Pattern Recognition , vol.30 , pp. 1145-1159
    • Bradley, A.P.1
  • 8
    • 2342654979
    • On the effect of data set size on bias and variance in classification learning
    • University of New South Wales
    • D. Brain and G. Webb. On the effect of data set size on bias and variance in classification learning. In Proceedings of the Fourth Australian Knowledge Acquisition Workshop, University of New South Wales, pages 117-128, 1999.
    • (1999) Proceedings of the Fourth Australian Knowledge Acquisition Workshop , pp. 117-128
    • Brain, D.1    Webb, G.2
  • 9
    • 0030211964
    • Bagging predictors
    • L. Breiman. Bagging predictors. Machine Learning, 24:123-140, 1996.
    • (1996) Machine Learning , vol.24 , pp. 123-140
    • Breiman, L.1
  • 11
    • 0003637516
    • PhD thesis, School of Computer Science, University of Technology, Sydney, Australia
    • W. Buntine. A Theory of Learning Classification Rules. PhD thesis, School of Computer Science, University of Technology, Sydney, Australia, 1991.
    • (1991) A Theory of Learning Classification Rules
    • Buntine, W.1
  • 16
    • 85149612939
    • Fast effective rule induction
    • eds. A. Prieditis and S. Russell, Lake Tahoe, California, Morgan Kaufmann
    • W.W. Cohen. Fast effective rule induction. Machine Learning: Proceedings of the Twelfth International Conference, eds. A. Prieditis and S. Russell, pages 115-123, Lake Tahoe, California, Morgan Kaufmann, 1995.
    • (1995) Machine Learning: Proceedings of the Twelfth International Conference , pp. 115-123
    • Cohen, W.W.1
  • 19
    • 0042493398
    • Telecommunications network diagnosis
    • eds. W. Kloesgen and J. Zytkow, Oxford University Press, Oxford
    • A. Danyluk and F. Provost. Telecommunications network diagnosis. In Handbook of Knowledge Discovery and Data Mining, eds. W. Kloesgen and J. Zytkow, Oxford University Press, Oxford, pp. 897-902, 2002.
    • (2002) Handbook of Knowledge Discovery and Data Mining , pp. 897-902
    • Danyluk, A.1    Provost, F.2
  • 20
    • 0031269184
    • On the optimality of the simple Bayesian classifier under zero-one loss
    • P. Domingos and M. Pazzani. On the optimality of the simple Bayesian classifier under zero-one loss. Machine Learning, 29:103-130, 1997.
    • (1997) Machine Learning , vol.29 , pp. 103-130
    • Domingos, P.1    Pazzani, M.2
  • 22
    • 0032533761
    • A comparison of statistical learning methods on the GUSTO database
    • M. Ennis, G. Hinton, D. Naylor, M. Revow, and R. Tibshirani. A comparison of statistical learning methods on the GUSTO database. Statist. Med. 17:2501-2508, 1998.
    • (1998) Statist. Med. , vol.17 , pp. 2501-2508
    • Ennis, M.1    Hinton, G.2    Naylor, D.3    Revow, M.4    Tibshirani, R.5
  • 23
    • 21344489373
    • Error rates in quadratic discrimination with constraints on the covariance matrices
    • B.W. Flury and M.J. Schmid. Error rates in quadratic discrimination with constraints on the covariance matrices. Journal of Classification, 11:101-120, 1994.
    • (1994) Journal of Classification , vol.11 , pp. 101-120
    • Flury, B.W.1    Schmid, M.J.2
  • 24
    • 0346198098
    • The reputation quotient: A multi-stakeholder measure of corporate reputation
    • C.J. Fombrun, N. Gardberg, and J. Sever. The reputation quotient: a multi-stakeholder measure of corporate reputation. Journal of Brand Management, 7:241-255, 2000.
    • (2000) Journal of Brand Management , vol.7 , pp. 241-255
    • Fombrun, C.J.1    Gardberg, N.2    Sever, J.3
  • 25
    • 21744462998
    • On bias, variance, 0/1-loss, and the curse of dimensionality
    • J.H. Friedman. On bias, variance, 0/1-loss, and the curse of dimensionality. Data Mining and Knowledge Discovery, 1:55-77, 1997.
    • (1997) Data Mining and Knowledge Discovery , vol.1 , pp. 55-77
    • Friedman, J.H.1
  • 28
    • 0020083498
    • The meaning and use of the area under a receiver operating characteristic (ROC) curve
    • J.A. Hanley and B.J. McNeil. The meaning and use of the area under a receiver operating characteristic (ROC) curve. Radiology, 143:29-36, 1982.
    • (1982) Radiology , vol.143 , pp. 29-36
    • Hanley, J.A.1    McNeil, B.J.2
  • 29
    • 0037550836
    • Sample size and misclassification: Is more always better?
    • AMS Center for Advanced Technologies
    • C. Harris-Jones and T.L. Haines. Sample size and misclassification: is more always better? AMSCAT-WP-97-118, AMS Center for Advanced Technologies, 1997.
    • (1997) AMSCAT-WP-97-118
    • Harris-Jones, C.1    Haines, T.L.2
  • 31
    • 0030282940
    • Rigorous learning curve bounds from statistical mechanics
    • D. Haussler, M. Kearns, H.S. Seung, and N. Tishby. Rigorous learning curve bounds from statistical mechanics. Machine Learning 25:195-236, 1996.
    • (1996) Machine Learning , vol.25 , pp. 195-236
    • Haussler, D.1    Kearns, M.2    Seung, H.S.3    Tishby, N.4
  • 32
    • 84942484786
    • Ridge regression: Biased estimates for nonorthogonal problems
    • A.E. Hoerl and R.W. Kennard. Ridge regression: biased estimates for nonorthogonal problems. Technometrics, 12:55-67, 1970.
    • (1970) Technometrics , vol.12 , pp. 55-67
    • Hoerl, A.E.1    Kennard, R.W.2
  • 36
    • 0042685161
    • Bayesian logistic regression: A variational approach
    • T.S. Jaakkola and M.I. Jordan. Bayesian logistic regression: a variational approach. Statistics and Computing, 10:25-37, 2000.
    • (2000) Statistics and Computing , vol.10 , pp. 25-37
    • Jaakkola, T.S.1    Jordan, M.I.2
  • 38
    • 0029306995
    • STATLOG: Comparison of classification algorithms on large real-world problems
    • R. D. King, C. Feng, and A. Sutherland. STATLOG: comparison of classification algorithms on large real-world problems. Applied Artificial Intelligence, 9(3):289-334, 1995.
    • (1995) Applied Artificial Intelligence , vol.9 , Issue.3 , pp. 289-334
    • King, R.D.1    Feng, C.2    Sutherland, A.3
  • 41
    • 0018195038
    • Efficient screening of nonnormal regression models
    • J.F. Lawless and K. Singhal. Efficient screening of nonnormal regression models. Biometrics, 34:318-327, 1978.
    • (1978) Biometrics , vol.34 , pp. 318-327
    • Lawless, J.F.1    Singhal, K.2
  • 44
    • 0034274591
    • A comparison of prediction accuracy, complexity, and training time for thirty-three old and new classification algorithms
    • T.S. Lim, W. Y. Loh, and Y.S. Shih. A comparison of prediction accuracy, complexity, and training time for thirty-three old and new classification algorithms. Machine Learning, 40:203-228, 2000.
    • (2000) Machine Learning , vol.40 , pp. 203-228
    • Lim, T.S.1    Loh, W.Y.2    Shih, Y.S.3
  • 47
    • 59549087165
    • On discriminative vs. generative classifiers: A comparison of logistic regression and naive Bayes
    • A. Ng and M. Jordan. On discriminative vs. generative classifiers: A comparison of logistic regression and naive Bayes. In Advances in Neural Information Processing Systems (NIPS-2001) 14: 841-848, 2001.
    • (2001) Advances in Neural Information Processing Systems (NIPS-2001) , vol.14 , pp. 841-848
    • Ng, A.1    Jordan, M.2
  • 48
    • 0002125069
    • The effects of training set size on decision tree complexity
    • ed. D. Fisher, Morgan Kaufmann, San Mateo, California
    • T. Oates and D. Jensen. The effects of training set size on decision tree complexity. In Machine Learning: Proceedings of the Fourteenth International Conference, ed. D. Fisher, pages 254-262, Morgan Kaufmann, San Mateo, California, 1997.
    • (1997) Machine Learning: Proceedings of the Fourteenth International Conference , pp. 254-262
    • Oates, T.1    Jensen, D.2
  • 50
    • 2342496307
    • Tree induction for probability-based rankings
    • forthcoming
    • F. Provost and P. Domingos. Tree induction for probability-based rankings. Machine Learning, 52:3, forthcoming.
    • Machine Learning , vol.52 , pp. 3
    • Provost, F.1    Domingos, P.2
  • 52
    • 0035283313
    • Robust classification for imprecise environments
    • F. Provost and T. Fawcett. Robust classification for imprecise environments. Machine Learning, 42:203-231, 2001.
    • (2001) Machine Learning , vol.42 , pp. 203-231
    • Provost, F.1    Fawcett, T.2
  • 55
    • 0141771188
    • A survey of methods for scaling up inductive algorithms
    • F. Provost and V. Kolluri. A survey of methods for scaling up inductive algorithms. Data Mining and Knowledge Discovery, 3:131-169, 1999.
    • (1999) Data Mining and Knowledge Discovery , vol.3 , pp. 131-169
    • Provost, F.1    Kolluri, V.2
  • 57
    • 0004282518
    • SAS Publishing, Cary, North Carolina
    • SAS Institute. SAS/STAT User's Guide, Version 8, SAS Publishing, Cary, North Carolina, 2000.
    • (2000) SAS/STAT User's Guide, Version 8
  • 58
    • 0000120766
    • Estimating the dimension of a model
    • G. Schwarz. Estimating the dimension of a model. Annals of Statistics, 6:461-464, 1978.
    • (1978) Annals of Statistics , vol.6 , pp. 461-464
    • Schwarz, G.1
  • 59
    • 0026119038
    • Symbolic and neural learning algorithms: An experimental comparison
    • J.W. Shavlik, R.J. Mooney, and G.G. Towell. Symbolic and neural learning algorithms: an experimental comparison. Machine Learning, 6:111-143, 1991.
    • (1991) Machine Learning , vol.6 , pp. 111-143
    • Shavlik, J.W.1    Mooney, R.J.2    Towell, G.G.3
  • 60
    • 84862340556
    • A tutorial on logistic regression
    • Y. So. A tutorial on logistic regression. Technical Note 450. Available electronically at http://www.sas.com/service/techsup/tnote/tnote_index4.html, 1995.
    • (1995) Technical Note , vol.450
    • So, Y.1
  • 61
    • 0004041987
    • Moody's public firm risk model: A hybrid approach to modeling short term default risk
    • Moody's Investors Service, Global Credit Research
    • J.R. Sobehart, R.M. Stein, V. Mikityanskaya, and L. Li. Moody's public firm risk model: a hybrid approach to modeling short term default risk. Technical Report, Moody's Investors Service, Global Credit Research. Available electronically at http://www.moodysqra.com/researoh/crm/53853.asp, 2000.
    • (2000) Technical Report
    • Sobehart, J.R.1    Stein, R.M.2    Mikityanskaya, V.3    Li, L.4
  • 63
    • 0023890867
    • Measuring the accuracy of diagnostic systems
    • J. Swets. Measuring the accuracy of diagnostic systems. Science, 240:1285-1293, 1988.
    • (1988) Science , vol.240 , pp. 1285-1293
    • Swets, J.1
  • 66
    • 0033423903
    • A self-affirmation analysis of survivors' reactions to unfair organizational downsizings
    • B.M. Wiesenfeld, J. Brockner, and C. Martin. A self-affirmation analysis of survivors' reactions to unfair organizational downsizings. Journal of Experimental Social Psychology, 35:441-460, 1999.
    • (1999) Journal of Experimental Social Psychology , vol.35 , pp. 441-460
    • Wiesenfeld, B.M.1    Brockner, J.2    Martin, C.3
  • 68
    • 0003259364
    • Obtaining calibrated probability estimates from decision trees and naive Bayesian classifiers
    • eds. C. Brodley and A. Danyluk, Morgan Kaufmann, San Mateo, California
    • B. Zadrozny and C. Elkan. Obtaining calibrated probability estimates from decision trees and naive Bayesian classifiers. In Proceedings of the Eighteenth International Conference on Machine Learning (ICML-01), eds. C. Brodley and A. Danyluk, pages 609-616, Morgan Kaufmann, San Mateo, California, 2001.
    • (2001) Proceedings of the Eighteenth International Conference on Machine Learning (ICML-01) , pp. 609-616
    • Zadrozny, B.1    Elkan, C.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.