



Volume 4, Issue 1, 2014, Pages 55-63

Mining data with random forests: Current options for real-world applications

Author keywords

[No Author keywords available]

Indexed keywords

BACKWARD ELIMINATION; CLASSIFICATION AND REGRESSION TREE; CORRELATION BETWEEN FEATURES; MINING HIGH-DIMENSIONAL DATA; PROBABILITY ESTIMATION; RANDOM SELECTION; ROBUST APPROACHES; VARIABLE IMPORTANCES;

EID: 84890868650     PISSN: 19424787     EISSN: 19424795     Source Type: Journal    
DOI: 10.1002/widm.1114     Document Type: Article
Times cited: 202

References (48)
  • 2
    • 84856275943
    • Classification and regression trees
    • Loh W-Y. Classification and regression trees. WIREs Data Mining Knowl Discov 2011, 1:14-23.
    • (2011) WIREs Data Mining Knowl Discov , vol.1 , pp. 14-23
    • Loh, W.-Y.1
  • 3
    • 0021875130
    • Tree-structured survival analysis
    • Gordon L, Olshen RA. Tree-structured survival analysis. Cancer Treat Rep 1985, 69:1065-1069.
    • (1985) Cancer Treat Rep , vol.69 , pp. 1065-1069
    • Gordon, L.1    Olshen, R.A.2
  • 4
    • 0035478854
    • Random forests
    • Breiman L. Random forests. Mach Learn 2001, 45:5-32.
    • (2001) Mach Learn , vol.45 , pp. 5-32
    • Breiman, L.1
  • 5
    • 0030211964
    • Bagging predictors
    • Breiman L. Bagging predictors. Mach Learn 1996, 24:123-140.
    • (1996) Mach Learn , vol.24 , pp. 123-140
    • Breiman, L.1
  • 8
    • 84890875937
    • Probability estimation with machine learning methods for dichotomous and multi-category outcome: theory
    • Kruppa J, Liu Y, Biau G, Kohler M, König IR, Malley JD, Ziegler A. Probability estimation with machine learning methods for dichotomous and multi-category outcome: theory. Biom J, In press.
    • Biom J,
    • Kruppa, J.1    Liu, Y.2    Biau, G.3    Kohler, M.4    König, I.R.5    Malley, J.D.6    Ziegler, A.7
  • 9
    • 77949388276
    • The behaviour of random forest permutation-based variable importance measures under predictor correlation
    • Nicodemus KK, Malley JD, Strobl C, Ziegler A. The behaviour of random forest permutation-based variable importance measures under predictor correlation. BMC Bioinformatics 2010, 11:110.
    • (2010) BMC Bioinformatics , vol.11 , pp. 110
    • Nicodemus, K.K.1    Malley, J.D.2    Strobl, C.3    Ziegler, A.4
  • 10
    • 84873187093
    • Overview of random forest methodology and practical guidance with emphasis on computational biology and bioinformatics
    • Boulesteix A-L, Janitza S, Kruppa J, König IR. Overview of random forest methodology and practical guidance with emphasis on computational biology and bioinformatics. WIREs Data Mining Knowl Discov 2012, 2:493-507.
    • (2012) WIREs Data Mining Knowl Discov , vol.2 , pp. 493-507
    • Boulesteix, A.-L.1    Janitza, S.2    Kruppa, J.3    König, I.R.4
  • 11
    • 80052880887
    • Methods for identifying SNP interactions: a review on variations of Logic Regression, Random Forest and Bayesian logistic regression
    • Chen CC, Schwender H, Keith J, Nunkesser R, Mengersen K, Macrossan P. Methods for identifying SNP interactions: a review on variations of Logic Regression, Random Forest and Bayesian logistic regression. IEEE/ACM Trans Comput Biol Bioinform 2011, 8:1580-1591.
    • (2011) IEEE/ACM Trans Comput Biol Bioinform , vol.8 , pp. 1580-1591
    • Chen, C.C.1    Schwender, H.2    Keith, J.3    Nunkesser, R.4    Mengersen, K.5    Macrossan, P.6
  • 12
    • 84861730860
    • Random forests for genomic data analysis
    • Chen X, Ishwaran H. Random forests for genomic data analysis. Genomics 2012, 99:323-329.
    • (2012) Genomics , vol.99 , pp. 323-329
    • Chen, X.1    Ishwaran, H.2
  • 13
    • 33747891439
    • Random forests for microarrays
    • Cutler A, Stevens JR. Random forests for microarrays. Meth Enzymol 2006, 411:422-432.
    • (2006) Meth Enzymol , vol.411 , pp. 422-432
    • Cutler, A.1    Stevens, J.R.2
  • 14
    • 77958469133
    • Multigenic modeling of complex disease by random forests
    • Sun YV. Multigenic modeling of complex disease by random forests. Adv Genet 2010, 72:73-99.
    • (2010) Adv Genet , vol.72 , pp. 73-99
    • Sun, Y.V.1
  • 15
    • 72449170109
    • An introduction to recursive partitioning: rationale, application, and characteristics of classification and regression trees, bagging, and random forests
    • Strobl C, Malley J, Tutz G. An introduction to recursive partitioning: rationale, application, and characteristics of classification and regression trees, bagging, and random forests. Psychol Methods 2009, 14:323-348.
    • (2009) Psychol Methods , vol.14 , pp. 323-348
    • Strobl, C.1    Malley, J.2    Tutz, G.3
  • 17
    • 77958064179
    • Mining data with random forests: a survey and results of new tests
    • Verikas A, Gelzinis A, Bacauskiene M. Mining data with random forests: a survey and results of new tests. Pattern Recogn 2011, 44:330-349.
    • (2011) Pattern Recogn , vol.44 , pp. 330-349
    • Verikas, A.1    Gelzinis, A.2    Bacauskiene, M.3
  • 18
    • 84872386141
    • Bias of the random forest out-of-bag (OOB) error for certain input parameters
    • Mitchell MW. Bias of the random forest out-of-bag (OOB) error for certain input parameters. Open J Statist 2011, 1:205-211.
    • (2011) Open J Statist , vol.1 , pp. 205-211
    • Mitchell, M.W.1
  • 19
    • 77954485448
    • On safari to Random Jungle: a fast implementation of Random Forests for high-dimensional data
    • Schwarz DF, König IR, Ziegler A. On safari to Random Jungle: a fast implementation of Random Forests for high-dimensional data. Bioinformatics 2010, 26:1752-1758.
    • (2010) Bioinformatics , vol.26 , pp. 1752-1758
    • Schwarz, D.F.1    König, I.R.2    Ziegler, A.3
  • 20
    • 84890875591
    • Random Forests: some methodological insights. arXiv: 0811.3619. Available at:
    • Genuer R, Poggi JM, Tuleau C. Random Forests: some methodological insights. arXiv: 0811.3619. 2008. Available at: http://hal.inria.fr/inria-00340725/en/;
    • (2008)
    • Genuer, R.1    Poggi, J.M.2    Tuleau, C.3
  • 21
    • 84874230954
    • Decision tree induction & clustering techniques in SAS Enterprise Miner, SPSS Clementine, and IBM Intelligent Miner - a comparative analysis
    • Al Ghoson AM. Decision tree induction & clustering techniques in SAS Enterprise Miner, SPSS Clementine, and IBM Intelligent Miner - a comparative analysis. Int J Manag Inf Syst 2010, 14:57-70.
    • (2010) Int J Manag Inf Syst , vol.14 , pp. 57-70
    • Al Ghoson, A.M.1
  • 23
    • 84880692052
    • A brief introduction to boosting
    • Dean TL, ed. IJCAI-99: Proceedings of the Sixteenth International Joint Conference on Artificial Intelligence, San Francisco, CA: Morgan Kaufmann;
    • Schapire RE. A brief introduction to boosting. In: Dean TL, ed. IJCAI-99: Proceedings of the Sixteenth International Joint Conference on Artificial Intelligence, vol. 2. San Francisco, CA: Morgan Kaufmann; 1999, 1401-1406.
    • (1999) , vol.2 , pp. 1401-1406
    • Schapire, R.E.1
  • 24
    • 0034164230
    • Additive logistic regression: a statistical view of boosting
    • Friedman J, Hastie T, Tibshirani R. Additive logistic regression: a statistical view of boosting. Ann Stat 2000, 28:337-407.
    • (2000) Ann Stat , vol.28 , pp. 337-407
    • Friedman, J.1    Hastie, T.2    Tibshirani, R.3
  • 25
    • 33749254096
    • An empirical comparison of supervised learning algorithms
    • Cohen W, Moore A, eds. New York: Association for Computing Machinery;
    • Caruana R, Niculescu-Mizil A. An empirical comparison of supervised learning algorithms. In: Cohen W, Moore A, eds. Proceedings of the 23rd International Conference on Machine Learning. New York: Association for Computing Machinery; 2006, 161-168.
    • (2006) Proceedings of the 23rd International Conference on Machine Learning , pp. 161-168
    • Caruana, R.1    Niculescu-Mizil, A.2
  • 26
    • 33749677657
    • Unbiased recursive partitioning: a conditional inference framework
    • Hothorn T, Hornik K, Zeileis A. Unbiased recursive partitioning: a conditional inference framework. J Comput Graph Stat 2006, 15:651-674.
    • (2006) J Comput Graph Stat , vol.15 , pp. 651-674
    • Hothorn, T.1    Hornik, K.2    Zeileis, A.3
  • 27
    • 54249099241
    • Consistency of random forests and other averaging classifiers
    • Biau G, Devroye L, Lugosi G. Consistency of random forests and other averaging classifiers. J Mach Learn Res 2008, 9:2039-2057.
    • (2008) J Mach Learn Res , vol.9 , pp. 2039-2057
    • Biau, G.1    Devroye, L.2    Lugosi, G.3
  • 28
    • 77956747417
    • On the layered nearest neighbour estimate, the bagged nearest neighbour estimate and the random forest method in regression and classification
    • Biau G, Devroye L. On the layered nearest neighbour estimate, the bagged nearest neighbour estimate and the random forest method in regression and classification. J Multivariate Anal 2010, 101:2499-2518.
    • (2010) J Multivariate Anal , vol.101 , pp. 2499-2518
    • Biau, G.1    Devroye, L.2
  • 29
    • 84860701629
    • Analysis of a random forests model
    • Biau G. Analysis of a random forests model. J Mach Learn Res 2012, 13:1063-1095.
    • (2012) J Mach Learn Res , vol.13 , pp. 1063-1095
    • Biau, G.1
  • 30
    • 84865253237
    • Variance reduction in purely random forests
    • Genuer R. Variance reduction in purely random forests. J Nonparametric Stat 2012, 24:543-562.
    • (2012) J Nonparametric Stat , vol.24 , pp. 543-562
    • Genuer, R.1
  • 31
    • 84890870316
    • Some infinity theory for predictor ensembles. Available at:
    • Breiman L. Some infinity theory for predictor ensembles. 2000. Available at: http://digitalassets.lib.berkeley.edu/sdtr/ucb/text/579.pdf.
    • (2000)
    • Breiman, L.1
  • 32
    • 41949115461
    • Random survival forests for R
    • Ishwaran H, Kogalur UB. Random survival forests for R. R-News 2007, 7:25-31.
    • (2007) R-News , vol.7 , pp. 25-31
    • Ishwaran, H.1    Kogalur, U.B.2
  • 33
    • 33847096395
    • Bias in random forest variable importance measures: illustrations, sources and a solution
    • Strobl C, Boulesteix AL, Zeileis A, Hothorn T. Bias in random forest variable importance measures: illustrations, sources and a solution. BMC Bioinformatics 2007, 8:25.
    • (2007) BMC Bioinformatics , vol.8 , pp. 25
    • Strobl, C.1    Boulesteix, A.L.2    Zeileis, A.3    Hothorn, T.4
  • 34
    • 67650770061
    • Predictor correlation impacts machine learning algorithms: implications for genomic studies
    • Nicodemus KK, Malley JD. Predictor correlation impacts machine learning algorithms: implications for genomic studies. Bioinformatics 2009, 25:1884-1890.
    • (2009) Bioinformatics , vol.25 , pp. 1884-1890
    • Nicodemus, K.K.1    Malley, J.D.2
  • 36
    • 84861813244
    • Random forest Gini importance favours SNPs with large minor allele frequency: impact, sources and recommendations
    • Boulesteix AL, Bender A, Lorenzo Bermejo J, Strobl C. Random forest Gini importance favours SNPs with large minor allele frequency: impact, sources and recommendations. Brief Bioinform 2012, 13:292-304.
    • (2012) Brief Bioinform , vol.13 , pp. 292-304
    • Boulesteix, A.L.1    Bender, A.2    Lorenzo Bermejo, J.3    Strobl, C.4
  • 37
    • 30644464444
    • Gene selection and classification of microarray data using random forest
    • Díaz-Uriarte R, Alvarez de Andrés S. Gene selection and classification of microarray data using random forest. BMC Bioinformatics 2006, 7:3.
    • (2006) BMC Bioinformatics , vol.7 , pp. 3
    • Díaz-Uriarte, R.1    Alvarez de Andrés, S.2
  • 39
    • 84890871129
    • Robustness of the random forest-based gene selection methods. Available at:
    • Kursa MB. Robustness of the random forest-based gene selection methods. 2013. Available at: http://arxiv.org/abs/1305.4525.
    • (2013)
    • Kursa, M.B.1
  • 40
    • 75149176440
    • Use of wrapper algorithms coupled with a random forests classifier for variable selection in large-scale genomic association studies
    • Rodin AS, Litvinenko A, Klos K, Morrison AC, Woodage T, Coresh J, Boerwinkle E. Use of wrapper algorithms coupled with a random forests classifier for variable selection in large-scale genomic association studies. J Comput Biol 2009, 16:1705-1718.
    • (2009) J Comput Biol , vol.16 , pp. 1705-1718
    • Rodin, A.S.1    Litvinenko, A.2    Klos, K.3    Morrison, A.C.4    Woodage, T.5    Coresh, J.6    Boerwinkle, E.7
  • 42
    • 84863447426
    • Search for the smallest random forest
    • Zhang H, Wang M. Search for the smallest random forest. Stat. its interface 2009, 2:381.
    • (2009) Stat. its interface , vol.2 , pp. 381
    • Zhang, H.1    Wang, M.2
  • 43
    • 84862685421
    • Identifying representative trees from ensembles
    • Banerjee M, Ding Y, Noone AM. Identifying representative trees from ensembles. Stat Med 2012, 31:1601-1616.
    • (2012) Stat Med , vol.31 , pp. 1601-1616
    • Banerjee, M.1    Ding, Y.2    Noone, A.M.3
  • 44
    • 84866731649
    • Risk estimation and risk prediction using machine learning methods
    • Kruppa J, Ziegler A, König IR. Risk estimation and risk prediction using machine learning methods. Hum Genet 2012, 131:1639-1654.
    • (2012) Hum Genet , vol.131 , pp. 1639-1654
    • Kruppa, J.1    Ziegler, A.2    König, I.R.3
  • 45
    • 30344458073
    • Comparison of the power between microsatellite and single-nucleotide polymorphism markers for linkage and linkage disequilibrium mapping of an electrophysiological phenotype
    • Lin HF, Juo SH, Cheng R. Comparison of the power between microsatellite and single-nucleotide polymorphism markers for linkage and linkage disequilibrium mapping of an electrophysiological phenotype. BMC Genet 2005, 6:S7.
    • (2005) BMC Genet , vol.6
    • Lin, H.F.1    Juo, S.H.2    Cheng, R.3
  • 46
    • 80255133264
    • An experimental comparison of classification algorithms for imbalanced credit scoring data sets
    • Brown I, Mues C. An experimental comparison of classification algorithms for imbalanced credit scoring data sets. Exp Syst Appl 2012, 39:3446-3453.
    • (2012) Exp Syst Appl , vol.39 , pp. 3446-3453
    • Brown, I.1    Mues, C.2
  • 47
    • 34249981761
    • Weather regime prediction using statistical learning
    • Deloncle A, Berk R, D'Andrea F, Ghil M. Weather regime prediction using statistical learning. J Atmos Sci 2007, 64:1619-1635.
    • (2007) J Atmos Sci , vol.64 , pp. 1619-1635
    • Deloncle, A.1    Berk, R.2    D'Andrea, F.3    Ghil, M.4
  • 48
    • 13344278660
    • Random forest classifier for remote sensing classification
    • Pal M. Random forest classifier for remote sensing classification. Int J Remote Sens 2005, 26:217-222.
    • (2005) Int J Remote Sens , vol.26 , pp. 217-222
    • Pal, M.1


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.