



Volume 17, Issue 3, 2008, Pages 611-628

A bias correction algorithm for the Gini variable importance measure in classification trees

Author keywords

Bias; Learning ensemble; Variable importance; Variable selection


EID: 53549131556     PISSN: 10618600     EISSN: None     Source Type: Journal    
DOI: 10.1198/106186008X344522     Document Type: Article
Times cited : (134)

References (34)
  • 1
    • 0034324043
    • A Formalism for Relevance and its Application in Feature Subset Selection
    • Bell, D., and Wang, H. (2000), "A Formalism for Relevance and its Application in Feature Subset Selection," Machine Learning, 41, 2, 175-195.
    • (2000) Machine Learning , vol.41 , Issue.2 , pp. 175-195
    • Bell, D.1    Wang, H.2
  • 2
    • 0035478854
    • Random Forests
    • Breiman, L. (2001a), "Random Forests," Machine Learning, 45, 5-32.
    • (2001) Machine Learning , vol.45 , pp. 5-32
    • Breiman, L.1
  • 3
    • 0000245743
    • Statistical Modeling: The Two Cultures
    • Breiman, L. (2001b), "Statistical Modeling: The Two Cultures," Statistical Science, 16, 3, 199-231.
    • (2001) Statistical Science , vol.16 , Issue.3 , pp. 199-231
  • 4
    • 0011996706
    • Manual on Setting Up, Using, and Understanding Random Forests v3.1
    • Technical Report
    • Breiman, L. (2002), "Manual on Setting Up, Using, and Understanding Random Forests v3.1," Technical Report, ftp://ftp.stat.berkeley.edu/pub/users/breiman/Using-random-forests-v3.1.pdf.
  • 6
    • 53549120786
    • Breiman and Cutler's Random Forests for Classification and Regression, R package version 4.5-18
    • Breiman, L., Cutler, A., Liaw, A., and Wiener, M. (2006), "Breiman and Cutler's Random Forests for Classification and Regression," R package version 4.5-18, http://cran.r-project.org/doc/packages/randomForest.pdf.
  • 7
    • 34248632806
    • Mapping Complex Traits using Random Forests
    • Bureau, A., Dupuis, J., Hayward, B., Falls, K., and Van Eerdewegh, P. (2003), "Mapping Complex Traits using Random Forests," BMC Genetics, 4(Suppl. 1): S64, http://www.biomedcentral.com/1471-2156/4/s1/S64.
  • 8
    • 34247115449
    • Boosted Trees for Ecological Modeling and Prediction
    • De'ath, G. (2007), "Boosted Trees for Ecological Modeling and Prediction," Ecology, 88, 1, 243-251.
    • (2007) Ecology , vol.88 , Issue.1 , pp. 243-251
    • De'ath, G.1
  • 9
    • 30644464444
    • Gene Selection and Classification of Microarray Data using Random Forest
    • Diaz-Uriarte, R., and Alvarez de Andrés, S. (2006), "Gene Selection and Classification of Microarray Data using Random Forest," BMC Bioinformatics, 7:3, http://www.biomedcentral.com/1471-2105/7/3.
    • (2006) BMC Bioinformatics , vol.7 , pp. 3
    • Diaz-Uriarte, R.1    Alvarez de Andrés, S.2
  • 10
    • 1842692307
    • Bias Correction in Classification Tree Construction
    • eds. C.E. Brodley and A. P. Danyluk, Williams College, Williamstown, MA
    • Dobra, A., and Gehrke, J. (2001), "Bias Correction in Classification Tree Construction," in Proceedings of the Seventeenth International Conference on Machine Learning, eds. C.E. Brodley and A. P. Danyluk, Williams College, Williamstown, MA, 90-97.
    • (2001) Proceedings of the Seventeenth International Conference on Machine Learning , pp. 90-97
    • Dobra, A.1    Gehrke, J.2
  • 11
    • 0035470889
    • Greedy Function Approximation: A Gradient Boosting Machine
    • Friedman, J.H. (2001), "Greedy Function Approximation: A Gradient Boosting Machine," The Annals of Statistics, 29, 1189-1232.
    • (2001) The Annals of Statistics , vol.29 , pp. 1189-1232
    • Friedman, J.H.1
  • 12
    • 53549127446
    • Tutorial: Getting Started with MART in R
    • Technical Report, Stanford University
    • Friedman, J.H. (2002), "Tutorial: Getting Started with MART in R," Technical Report, Stanford University. Available online at http://www-stat.stanford.edu/~jhf/r-mart/tutorial/tutorial.pdf.
  • 13
    • 0038702163
    • Multiple Additive Regression Trees with Application in Epidemiology
    • Friedman, J.H., and Meulman, J.J. (2003), "Multiple Additive Regression Trees with Application in Epidemiology," Statistics in Medicine, 22, 1365-1381.
    • (2003) Statistics in Medicine , vol.22 , pp. 1365-1381
    • Friedman, J.H.1    Meulman, J.J.2
  • 14
    • 0001552995
    • Cross-Validation, the Jackknife, and the Bootstrap: Excess Error Estimation in Forward Logistic Regression
    • Gong, G. (1986), "Cross-Validation, the Jackknife, and the Bootstrap: Excess Error Estimation in Forward Logistic Regression," Journal of the American Statistical Association, 81, 108-113.
    • (1986) Journal of the American Statistical Association , vol.81 , pp. 108-113
    • Gong, G.1
  • 16
    • 10044227497
    • Development of Linear, Ensemble, and Nonlinear Models for the Prediction and Interpretation of the Biological Activity of a Set of PDGFR Inhibitors
    • Guha, R., and Jurs, P.C. (2004), "Development of Linear, Ensemble, and Nonlinear Models for the Prediction and Interpretation of the Biological Activity of a Set of PDGFR Inhibitors," Journal of Chemical Information and Computer Sciences, 44, 2179-2189.
    • (2004) Journal of Chemical Information and Computer Sciences , vol.44 , pp. 2179-2189
    • Guha, R.1    Jurs, P.C.2
  • 18
    • 53549091184
    • party: A Laboratory for Recursive Partitioning, R package version 0.9-11
    • Hothorn, T., Hornik, K., and Zeileis, A. (2006b), "party: A Laboratory for Recursive Partitioning," R package version 0.9-11. Available online at http://cran.r-project.org/doc/vignettes/party/party.pdf.
  • 19
    • 85099325734
    • Irrelevant Features and the Subset Selection Problem
    • eds. W. W. Cohen and H. Hirsh, New Brunswick, NJ: Morgan Kaufmann
    • John, G.H., Kohavi, R., and Pfleger, K. (1994), "Irrelevant Features and the Subset Selection Problem," in Proceedings of the 11th International Conference on Machine Learning, eds. W. W. Cohen and H. Hirsh, New Brunswick, NJ: Morgan Kaufmann, 121-129.
    • (1994) Proceedings of the 11th International Conference on Machine Learning , pp. 121-129
    • John, G.H.1    Kohavi, R.2    Pfleger, K.3
  • 22
    • 0031312210
    • Split Selection Methods for Classification Trees
    • Loh, W.-Y., and Shih, Y.-S. (1997), "Split Selection Methods for Classification Trees," Statistica Sinica, 7, 815-840.
    • (1997) Statistica Sinica , vol.7 , pp. 815-840
    • Loh, W.-Y.1    Shih, Y.-S.2
  • 23
    • 25444453244
    • Screening Large-Scale Association Study Data: Exploiting Interactions using Random Forests
    • Lunetta, K.L., Hayward, B.L., Segal, J., and Van Eerdewegh, P. (2004), "Screening Large-Scale Association Study Data: Exploiting Interactions using Random Forests," BMC Genetics, 5, 32. Available online at http://www.biomedcentral.com/1471-2156/5/32.
    • (2004) BMC Genetics , vol.5 , pp. 32
    • Lunetta, K.L.1    Hayward, B.L.2    Segal, J.3    Van Eerdewegh, P.4
  • 24
    • 33847236254
    • Multivariate Feature Selection and Hierarchical Classification for Infrared Spectroscopy: Serum-Based Detection of Bovine Spongiform Encephalopathy
    • doi:10.1007/s00216-006-1070-5
    • Menze, B.H., Petrich, W., and Hamprecht, F.A. (2007), "Multivariate Feature Selection and Hierarchical Classification for Infrared Spectroscopy: Serum-Based Detection of Bovine Spongiform Encephalopathy," Analytical and Bioanalytical Chemistry, doi:10.1007/s00216-006-1070-5.
    • (2007) Analytical and Bioanalytical Chemistry
    • Menze, B.H.1    Petrich, W.2    Hamprecht, F.A.3
  • 27
    • 33646517317
    • Boosted Regression (Boosting): A Tutorial and a Stata Plugin
    • Schonlau, M. (2005), "Boosted Regression (Boosting): A Tutorial and a Stata Plugin," The Stata Journal, 5, 3, 330-354.
    • (2005) The Stata Journal , vol.5 , Issue.3 , pp. 330-354
    • Schonlau, M.1
  • 28
    • 53549101527
    • Statistical Sources of Variable Selection Bias in Classification Trees Based on the Gini Index
    • Technical Report, SFB 386
    • Strobl, C. (2005), "Statistical Sources of Variable Selection Bias in Classification Trees Based on the Gini Index," Technical Report, SFB 386, http://epub.ub.uni-muenchen.de/archive/00001789/01/paper.420.pdf.
    • (2005)
    • Strobl, C.1
  • 29
    • 33847096395
    • Bias in Random Forest Variable Importance Measures: Illustrations, Sources and a Solution
    • doi:10.1186/1471-2105-8-25
    • Strobl, C., Boulesteix, A.-L., Zeileis, A., and Hothorn, T. (2007a), "Bias in Random Forest Variable Importance Measures: Illustrations, Sources and a Solution," BMC Bioinformatics, 8, 25, doi:10.1186/1471-2105-8-25.
    • (2007) BMC Bioinformatics , vol.8 , pp. 25
    • Strobl, C.1    Boulesteix, A.-L.2    Zeileis, A.3    Hothorn, T.4
  • 30
    • 34548250123
    • Unbiased Split Selection for Classification Trees Based on the Gini Index
    • doi:10.1016/j.csda.2006.12.030
    • Strobl, C., Boulesteix, A.-L., and Augustin, T. (2007b), "Unbiased Split Selection for Classification Trees Based on the Gini Index," Computational Statistics & Data Analysis, doi:10.1016/j.csda.2006.12.030.
    • (2007) Computational Statistics & Data Analysis
    • Strobl, C.1    Boulesteix, A.-L.2    Augustin, T.3
  • 32
    • 0028443213
    • Bias in Information-Based Measures in Decision Tree Induction
    • White, A.P., and Liu, W.Z. (1994), "Bias in Information-Based Measures in Decision Tree Induction," Machine Learning, 15, 321-329.
    • (1994) Machine Learning , vol.15 , pp. 321-329
    • White, A.P.1    Liu, W.Z.2
  • 34
    • 25144492516
    • Efficient Feature Selection via Analysis of Relevance and Redundancy
    • Yu, L., and Liu, H. (2004), "Efficient Feature Selection via Analysis of Relevance and Redundancy," Journal of Machine Learning Research, 5, 1205-1224.
    • (2004) Journal of Machine Learning Research , vol.5 , pp. 1205-1224
    • Yu, L.1    Liu, H.2


* This information was extracted by KISTI through analysis of Elsevier's SCOPUS database.