



Volume 20, Issue 4, 2010, Pages 393-407

Analysis and correction of bias in Total Decrease in Node Impurity measures for tree-based algorithms

Author keywords

Ensemble learning; Impurity measures; Variable importance



EID: 77956880851     PISSN: 09603174     EISSN: None     Source Type: Journal    
DOI: 10.1007/s11222-009-9132-0     Document Type: Article
Times cited: 43

References (32)
  • 1
    • 0034324043
    • A formalism for relevance and its application in feature subset selection
    • Bell, D., Wang, H.: A formalism for relevance and its application in feature subset selection. Mach. Learn. 4(2), 175-195 (2000).
    • (2000) Mach. Learn. , vol.4 , Issue.2 , pp. 175-195
    • Bell, D.1    Wang, H.2
  • 2
    • 31944452324
    • An introduction to ensemble methods for data analysis
    • Berk, R. A.: An introduction to ensemble methods for data analysis. Sociol. Methods Res. 34(3), 263-295 (2006).
    • (2006) Sociol. Methods Res. , vol.34 , Issue.3 , pp. 263-295
    • Berk, R.A.1
  • 3
    • 0030344230
    • The heuristic of instability in model selection
    • Breiman, L.: The heuristic of instability in model selection. Ann. Stat. 24, 2350-2383 (1996).
    • (1996) Ann. Stat. , vol.24 , pp. 2350-2383
    • Breiman, L.1
  • 4
    • 0035478854
    • Random Forests
    • Breiman, L.: Random Forests. Mach. Learn. 45, 5-32 (2001a).
    • (2001) Mach. Learn. , vol.45 , pp. 5-32
    • Breiman, L.1
  • 5
    • 0000245743
    • Statistical modeling: the two cultures
    • Breiman, L.: Statistical modeling: the two cultures. Stat. Sci. 16, 199-231 (2001b).
    • (2001) Stat. Sci. , vol.16 , pp. 199-231
    • Breiman, L.1
  • 8
    • 79958810091
    • Breiman and Cutler's Random Forests for classification and regression
    • Breiman, L., Cutler, A., Liaw, A., Wiener, M.: Breiman and Cutler's Random Forests for classification and regression. R package version 4.5-18 (2006). http://cran.r-project.org/doc/packages/randomForest.pdf.
    • (2006) R package version 4.5-18
    • Breiman, L.1    Cutler, A.2    Liaw, A.3    Wiener, M.4
  • 9
    • 0043289776
    • Analyzing bagging
    • Bühlmann, P., Yu, B.: Analyzing bagging. Ann. Stat. 30(4), 927-961 (2002).
    • (2002) Ann. Stat. , vol.30 , Issue.4 , pp. 927-961
    • Bühlmann, P.1    Yu, B.2
  • 10
    • 77956870347
    • Dobra, A., Gehrke, J.: Bias correction in classification tree construction. In: Brodley, C. E., Danyluk, A. P. (eds.) Proceedings of the Seventeenth International Conference on Machine Learning, Williams College, Williamstown, MA, USA, pp. 90-97 (2001).
  • 11
    • 0035470889
    • Greedy function approximation: a gradient boosting machine
    • Friedman, J. H.: Greedy function approximation: a gradient boosting machine. Ann. Stat. 29, 1189-1232 (2001).
    • (2001) Ann. Stat. , vol.29 , pp. 1189-1232
    • Friedman, J.H.1
  • 12
    • 77956355578
    • Technical report, Stanford University
    • Friedman, J. H.: Tutorial: getting started with MART in R. Technical report, Stanford University (2002). http://www-stat.stanford.edu/~jhf/r-mart/tutorial/tutorial.pdf.
    • (2002) Tutorial: Getting started with MART in R
    • Friedman, J.H.1
  • 13
    • 33749677657
    • Unbiased recursive partitioning: a conditional inference framework
    • Hothorn, T., Hornik, K., Zeileis, A.: Unbiased recursive partitioning: a conditional inference framework. J. Comput. Graph. Stat. 15(3), 651-674 (2006).
    • (2006) J. Comput. Graph. Stat. , vol.15 , Issue.3 , pp. 651-674
    • Hothorn, T.1    Hornik, K.2    Zeileis, A.3
  • 14
    • 1542573450
    • Classification trees with unbiased multiway splits
    • Kim, H., Loh, W.: Classification trees with unbiased multiway splits. J. Am. Stat. Assoc. 96, 589-604 (2001).
    • (2001) J. Am. Stat. Assoc. , vol.96 , pp. 589-604
    • Kim, H.1    Loh, W.2
  • 15
    • 77956885021
    • Kononenko, I.: On biases in estimating multi-valued attributes. In: Mellish, C. (ed.) Proceedings of the Fourteenth International Joint Conference on Artificial Intelligence, Montréal, Canada, pp. 1034-1040 (1995).
  • 16
    • 0345040873
    • Classification and regression by Random Forest
    • Liaw, A., Wiener, M.: Classification and regression by Random Forest. R News 2(3), 18-22 (2002).
    • (2002) R News , vol.2 , Issue.3 , pp. 18-22
    • Liaw, A.1    Wiener, M.2
  • 17
    • 0031312210
    • Split selection methods for classification trees
    • Loh, W.-Y., Shih, Y.-S.: Split selection methods for classification trees. Stat. Sinica 7, 815-840 (1997).
    • (1997) Stat. Sinica , vol.7 , pp. 815-840
    • Loh, W.-Y.1    Shih, Y.-S.2
  • 18
    • 77956884378
    • Automatic construction of decision trees from data: a multi-disciplinary survey
    • Murthy, K.: Automatic construction of decision trees from data: a multi-disciplinary survey. Data Min. Knowl. Discov. 2(4), 1384-5810 (2004).
    • (2004) Data Min. Knowl. Discov. , vol.2 , Issue.4 , pp. 1384-5810
    • Murthy, K.1
  • 21
    • 84907095419
    • R: A language and environment for statistical computing
    • R Development Core Team, Vienna, Austria. ISBN 3-900051-07-0
    • R Development Core Team: R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. ISBN 3-900051-07-0, URL http://www.R-project.org (2008).
    • (2008) R Foundation for Statistical Computing
  • 23
    • 53549131556
    • A bias correction algorithm for the Gini variable importance measure in classification trees
    • Sandri, M., Zuccolotto, P.: A bias correction algorithm for the Gini variable importance measure in classification trees. J. Comput. Graph. Stat. 17(3), 1-18 (2008).
    • (2008) J. Comput. Graph. Stat. , vol.17 , Issue.3 , pp. 1-18
    • Sandri, M.1    Zuccolotto, P.2
  • 24
    • 33646517317
    • Boosted regression (boosting): a tutorial and a stata plugin
    • Schonlau, M.: Boosted regression (boosting): a tutorial and a stata plugin. Stata J. 5(3), 330-354 (2005).
    • (2005) Stata J. , vol.5 , Issue.3 , pp. 330-354
    • Schonlau, M.1
  • 25
    • 0042942219
    • Families of splitting criteria for classification trees
    • Shih, Y.-S.: Families of splitting criteria for classification trees. Stat. Comput. 9, 309-315 (1999).
    • (1999) Stat. Comput. , vol.9 , pp. 309-315
    • Shih, Y.-S.1
  • 27
    • 34548250123
    • Unbiased split selection for classification trees based on the Gini index
    • doi:10.1016/j.csda.2006.12.030
    • Strobl, C., Boulesteix, A.-L., Augustin, T.: Unbiased split selection for classification trees based on the Gini index. Comput. Stat. Data Anal. (2007a). doi:10.1016/j.csda.2006.12.030.
    • (2007) Comput. Stat. Data Anal.
    • Strobl, C.1    Boulesteix, A.-L.2    Augustin, T.3
  • 28
    • 33847096395
    • Bias in random forest variable importance measures: illustrations, sources and a solution
    • doi:10.1186/1471-2105-8-25
    • Strobl, C., Boulesteix, A.-L., Zeileis, A., Hothorn, T.: Bias in random forest variable importance measures: illustrations, sources and a solution. BMC Bioinf. 8, 25 (2007b). doi:10.1186/1471-2105-8-25.
    • (2007) BMC Bioinf. , vol.8 , pp. 25
    • Strobl, C.1    Boulesteix, A.-L.2    Zeileis, A.3    Hothorn, T.4
  • 29
    • 48549095457
    • Conditional variable importance for random forests
    • doi:10.1186/1471-2105-9-307
    • Strobl, C., Boulesteix, A.-L., Kneib, T., Augustin, T., Zeileis, A.: Conditional variable importance for random forests. BMC Bioinf. 9, 307 (2008). doi:10.1186/1471-2105-9-307.
    • (2008) BMC Bioinf. , vol.9 , pp. 307
    • Strobl, C.1    Boulesteix, A.-L.2    Kneib, T.3    Augustin, T.4    Zeileis, A.5
  • 30
    • 33645861937
    • Statistical inference for variable importance
    • van der Laan, M. J.: Statistical inference for variable importance. Int. J. Biostat. 2(1), 1-30 (2005).
    • (2005) Int. J. Biostat. , vol.2 , Issue.1 , pp. 1-30
    • van der Laan, M.J.1
  • 31
    • 0028443213
    • Bias in information-based measures in decision tree induction
    • White, A. P., Liu, W. Z.: Bias in information-based measures in decision tree induction. Mach. Learn. 15, 321-329 (1994).
    • (1994) Mach. Learn. , vol.15 , pp. 321-329
    • White, A.P.1    Liu, W.Z.2
  • 32
    • 33947248175
    • Controlling variable selection by the addition of pseudovariables
    • Wu, Y., Boos, D. D., Stefanski, L. A.: Controlling variable selection by the addition of pseudovariables. J. Am. Stat. Assoc. 102(477), 235-243 (2007).
    • (2007) J. Am. Stat. Assoc. , vol.102 , Issue.477 , pp. 235-243
    • Wu, Y.1    Boos, D.D.2    Stefanski, L.A.3


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.