Volume 9, 2017, Pages 28-46

Random Forests for Big Data

Author keywords

Bag of little bootstraps; Big Data; On line learning; Parallel computing; R; Random forest

Indexed keywords

DECISION TREES; E-LEARNING; REGRESSION ANALYSIS;

EID: 85028618822     PISSN: None     EISSN: 22145796     Source Type: Journal    
DOI: 10.1016/j.bdr.2017.07.003     Document Type: Article
Times cited: 268

References (50)
  • 1
    • 84919389078
    • Challenges of big data analysis
    • Fan, J., Han, F., Liu, H., Challenges of big data analysis. Nat. Sci. Rev. 1:2 (2014), 293–314, 10.1093/nsr/nwt032.
    • (2014) Nat. Sci. Rev. , vol.1 , Issue.2 , pp. 293-314
    • Fan, J.1    Han, F.2    Liu, H.3
  • 2
    • 84903270767
    • Applying statistical thinking to ‘Big Data’ problems
    • Hoerl, R., Snee, R., De Veaux, R., Applying statistical thinking to ‘Big Data’ problems. Wiley Interdiscip. Rev.: Comput. Stat. 6:4 (2014), 222–232, 10.1002/wics.1306.
    • (2014) Wiley Interdiscip. Rev.: Comput. Stat. , vol.6 , Issue.4 , pp. 222-232
    • Hoerl, R.1    Snee, R.2    De Veaux, R.3
  • 3
    • 84885041315
    • On statistics, computation and scalability
    • Jordan, M., On statistics, computation and scalability. Bernoulli 19:4 (2013), 1378–1390, 10.3150/12-BEJSP17.
    • (2013) Bernoulli , vol.19 , Issue.4 , pp. 1378-1390
    • Jordan, M.1
  • 5
    • 85029404715
    • Big data – Retour vers le futur 3. De statisticien à data scientist
    • arXiv:1403.3758 arXiv preprint
    • Besse, P., Garivier, A., Loubes, J., Big data – Retour vers le futur 3. De statisticien à data scientist. arXiv preprint arXiv:1403.3758, 2014.
    • (2014)
    • Besse, P.1    Garivier, A.2    Loubes, J.3
  • 6
    • 84926319635
    • Big data for modern industry: challenges and trends
    • Yin, S., Kaynak, O., Big data for modern industry: challenges and trends. Proceedings of the IEEE, vol. 103, 2015, 143–146.
    • (2015) Proceedings of the IEEE , vol.103 , pp. 143-146
    • Yin, S.1    Kaynak, O.2
  • 7
    • 84888087862
    • Scalable strategies for computing with massive data
    • Kane, M., Emerson, J., Weston, S., Scalable strategies for computing with massive data. J. Stat. Softw., 55, 2013 http://www.jstatsoft.org/v55/i14.
    • (2013) J. Stat. Softw. , vol.55
    • Kane, M.1    Emerson, J.2    Weston, S.3
  • 8
    • 84975753826
    • R: A Language and Environment for Statistical Computing
    • R Foundation for Statistical Computing Vienna, Austria
    • R Core Team. R: A Language and Environment for Statistical Computing. 2016, R Foundation for Statistical Computing, Vienna, Austria http://www.R-project.org.
    • (2016)
  • 9
    • 85029433230
    • Statistique et big data analytics. Volumétrie, l'attaque des clones
    • arXiv:1405.6676 arXiv preprint
    • Besse, P., Villa-Vialaneix, N., Statistique et big data analytics. Volumétrie, l'attaque des clones. arXiv preprint arXiv:1405.6676, 2014.
    • (2014)
    • Besse, P.1    Villa-Vialaneix, N.2
  • 10
    • 84979940943
    • A survey of statistical methods and computing for big data
    • arXiv:1502.07989 arXiv preprint
    • Wang, C., Chen, M., Schifano, E., Wu, J., Yan, J., A survey of statistical methods and computing for big data. arXiv preprint arXiv:1502.07989, 2015.
    • (2015)
    • Wang, C.1    Chen, M.2    Schifano, E.3    Wu, J.4    Yan, J.5
  • 16
    • 56049109090
    • Map-Reduce for machine learning on multicore
    • J. Lafferty C. Williams J. Shawe-Taylor R. Zemel A. Culotta Hyatt Regency, Vancouver, Canada
    • Chu, C., Kim, S., Lin, Y., Yu, Y., Bradski, G., Ng, A., Olukotun, K., Map-Reduce for machine learning on multicore. Lafferty, J., Williams, C., Shawe-Taylor, J., Zemel, R., Culotta, A., (eds.) Advances in Neural Information Processing Systems (NIPS 2010), Hyatt Regency, Vancouver, Canada, vol. 23, 2010, 281–288.
    • (2010) Advances in Neural Information Processing Systems (NIPS 2010) , vol.23 , pp. 281-288
    • Chu, C.1    Kim, S.2    Lin, Y.3    Yu, Y.4    Bradski, G.5    Ng, A.6    Olukotun, K.7
  • 17
    • 84946566592
    • A split-and-conquer approach for analysis of extraordinarily large data
    • Chen, X., Xie, M., A split-and-conquer approach for analysis of extraordinarily large data. Stat. Sin. 24 (2014), 1655–1684.
    • (2014) Stat. Sin. , vol.24 , pp. 1655-1684
    • Chen, X.1    Xie, M.2
  • 18
    • 84875499797
    • Computational and statistical tradeoffs via convex relaxation
    • Chandrasekaran, V., Jordan, M., Computational and statistical tradeoffs via convex relaxation. Proc. Natl. Acad. Sci. USA 13 (2013), E1181–E1190.
    • (2013) Proc. Natl. Acad. Sci. USA , vol.13 , pp. E1181-E1190
    • Chandrasekaran, V.1    Jordan, M.2
  • 19
    • 33745777639
    • Incremental support vector learning: analysis, implementation and application
    • Laskov, P., Gehl, C., Krüger, S., Müller, K., Incremental support vector learning: analysis, implementation and application. J. Mach. Learn. Res. 7 (2006), 1909–1936.
    • (2006) J. Mach. Learn. Res. , vol.7 , pp. 1909-1936
    • Laskov, P.1    Gehl, C.2    Krüger, S.3    Müller, K.4
  • 21
    • 0035478854
    • Random forests
    • Breiman, L., Random forests. Mach. Learn. 45:1 (2001), 5–32 http://www.springerlink.com/content/u0p06167n6173512/fulltext.pdf.
    • (2001) Mach. Learn. , vol.45 , Issue.1 , pp. 5-32
    • Breiman, L.1
  • 22
    • 84933565370
    • Consistency of random forests
    • Scornet, E., Biau, G., Vert, J., Consistency of random forests. Ann. Stat. 43:4 (2015), 1716–1741, 10.1214/15-AOS1321.
    • (2015) Ann. Stat. , vol.43 , Issue.4 , pp. 1716-1741
    • Scornet, E.1    Biau, G.2    Vert, J.3
  • 23
    • 77958064179
    • Mining data with random forests: a survey and results of new tests
    • Verikas, A., Gelzinis, A., Bacauskiene, M., Mining data with random forests: a survey and results of new tests. Pattern Recognit. 44:2 (2011), 330–349, 10.1016/j.patcog.2010.08.011.
    • (2011) Pattern Recognit. , vol.44 , Issue.2 , pp. 330-349
    • Verikas, A.1    Gelzinis, A.2    Bacauskiene, M.3
  • 24
    • 84890868650
    • Mining data with random forests: current options for real-world applications
    • Ziegler, A., König, I., Mining data with random forests: current options for real-world applications. Wiley Interdiscip. Rev. Data Min. Knowl. Discov. 4:1 (2014), 55–63, 10.1002/widm.1114.
    • (2014) Wiley Interdiscip. Rev. Data Min. Knowl. Discov. , vol.4 , Issue.1 , pp. 55-63
    • Ziegler, A.1    König, I.2
  • 25
    • 33846516584
    • Pattern Recognition and Machine Learning
    • Springer-Verlag New York, NY, USA
    • Bishop, C., Pattern Recognition and Machine Learning. 2006, Springer-Verlag, New York, NY, USA.
    • (2006)
    • Bishop, C.1
  • 26
    • 0003684449
    • The Elements of Statistical Learning
    • 2nd edition Springer-Verlag New York, NY, USA
    • Hastie, T., Tibshirani, R., Friedman, J., The Elements of Statistical Learning. 2nd edition, 2009, Springer-Verlag, New York, NY, USA.
    • (2009)
    • Hastie, T.1    Tibshirani, R.2    Friedman, J.3
  • 27
    • 0003802343
    • Classification and Regression Trees
    • Chapman and Hall New York, USA
    • Breiman, L., Friedman, J., Olshen, R., Stone, C., Classification and Regression Trees. 1984, Chapman and Hall, New York, USA.
    • (1984)
    • Breiman, L.1    Friedman, J.2    Olshen, R.3    Stone, C.4
  • 28
    • 77957922514
    • Variable selection using random forests
    • Genuer, R., Poggi, J., Tuleau-Malot, C., Variable selection using random forests. Pattern Recognit. Lett. 31:14 (2010), 2225–2236, 10.1016/j.patrec.2010.03.014.
    • (2010) Pattern Recognit. Lett. , vol.31 , Issue.14 , pp. 2225-2236
    • Genuer, R.1    Poggi, J.2    Tuleau-Malot, C.3
  • 29
    • 0642310183
    • Resampling fewer than n observations: gains, losses and remedies for losses
    • Bickel, P., Götze, F., van Zwet, W., Resampling fewer than n observations: gains, losses and remedies for losses. Stat. Sin. 7:1 (1997), 1–31.
    • (1997) Stat. Sin. , vol.7 , Issue.1 , pp. 1-31
    • Bickel, P.1    Götze, F.2    van Zwet, W.3
  • 30
    • 53349123556
    • On the choice of m in the m out of n bootstrap and confidence bounds for extrema
    • Bickel, P., Sakov, A., On the choice of m in the m out of n bootstrap and confidence bounds for extrema. Stat. Sin. 18:3 (2008), 967–985 http://www3.stat.sinica.edu.tw/statistica/J18N3/J18N38/J18N38.html.
    • (2008) Stat. Sin. , vol.18 , Issue.3 , pp. 967-985
    • Bickel, P.1    Sakov, A.2
  • 31
    • 84906873734
    • On the use of MapReduce for imbalanced big data using random forest
    • del Rio, S., López, V., Benítez, J., Herrera, F., On the use of MapReduce for imbalanced big data using random forest. Inf. Sci. 285 (2014), 112–137, 10.1016/j.ins.2014.03.043.
    • (2014) Inf. Sci. , vol.285 , pp. 112-137
    • del Rio, S.1    López, V.2    Benítez, J.3    Herrera, F.4
  • 33
    • 27944503134
    • Online Bayesian bagging
    • Lee, H., Clyde, M., Online Bayesian bagging. J. Mach. Learn. Res. 5 (2004), 143–151.
    • (2004) J. Mach. Learn. Res. , vol.5 , pp. 143-151
    • Lee, H.1    Clyde, M.2
  • 34
    • 33745697989
    • Creating non-parametric bootstrap samples using Poisson frequencies
    • Hanley, J., MacGibbon, B., Creating non-parametric bootstrap samples using Poisson frequencies. Comput. Methods Programs Biomed. 83 (2006), 57–62.
    • (2006) Comput. Methods Programs Biomed. , vol.83 , pp. 57-62
    • Hanley, J.1    MacGibbon, B.2
  • 35
    • 33646430006
    • Extremely randomized trees
    • Geurts, P., Ernst, D., Wehenkel, L., Extremely randomized trees. Mach. Learn. 63:1 (2006), 3–42, 10.1007/s10994-006-6226-1.
    • (2006) Mach. Learn. , vol.63 , Issue.1 , pp. 3-42
    • Geurts, P.1    Ernst, D.2    Wehenkel, L.3
  • 37
    • 84971637693
    • readr: Read Tabular Data
    • R package version 0.2.2
    • Wickham, H., François, R., readr: Read Tabular Data. R package version 0.2.2 http://CRAN.R-project.org/package=readr, 2015.
    • (2015)
    • Wickham, H.1    François, R.2
  • 38
    • 0345040873
    • Classification and regression by randomForest
    • Liaw, A., Wiener, M., Classification and regression by randomForest. R News 2:3 (2002), 18–22 http://CRAN.R-project.org/doc/Rnews.
    • (2002) R News , vol.2 , Issue.3 , pp. 18-22
    • Liaw, A.1    Wiener, M.2
  • 40
    • 84953731160
    • foreach: Foreach looping construct for R
    • R package version 1.4.2
    • Revolution Analytics, Weston, S., foreach: Foreach looping construct for R. R package version 1.4.2 http://CRAN.R-project.org/package=foreach, 2014.
    • (2014)
    • Revolution Analytics1    Weston, S.2
  • 41
    • 85029452867
    • An outlier detection-based tree selection approach to extreme pruning of random forests
    • arXiv:1503.05187 arXiv preprint
    • Fawagreh, K., Gaber, M., Elyan, E., An outlier detection-based tree selection approach to extreme pruning of random forests. arXiv preprint arXiv:1503.05187, 2015.
    • (2015)
    • Fawagreh, K.1    Gaber, M.2    Elyan, E.3
  • 42
    • 33749018252
    • An analysis of diversity measures
    • Tang, E., Suganthan, P., Yao, X., An analysis of diversity measures. Mach. Learn. 65 (2006), 247–271.
    • (2006) Mach. Learn. , vol.65 , pp. 247-271
    • Tang, E.1    Suganthan, P.2    Yao, X.3
  • 44
    • 0031211090
    • A decision-theoretic generalization of on-line learning and an application to boosting
    • Freund, Y., Schapire, R., A decision-theoretic generalization of on-line learning and an application to boosting. J. Comput. Syst. Sci. 55:1 (1997), 119–139.
    • (1997) J. Comput. Syst. Sci. , vol.55 , Issue.1 , pp. 119-139
    • Freund, Y.1    Schapire, R.2
  • 45
    • 33646403804
    • Pert-perfect random tree ensembles
    • Cutler, A., Zhao, G., Pert-perfect random tree ensembles. Comput. Sci. Stat. 33 (2001), 490–497.
    • (2001) Comput. Sci. Stat. , vol.33 , pp. 490-497
    • Cutler, A.1    Zhao, G.2
  • 46
    • 54249099241
    • Consistency of random forests and other averaging classifiers
    • Biau, G., Devroye, L., Lugosi, G., Consistency of random forests and other averaging classifiers. J. Mach. Learn. Res. 9 (2008), 2015–2033.
    • (2008) J. Mach. Learn. Res. , vol.9 , pp. 2015-2033
    • Biau, G.1    Devroye, L.2    Lugosi, G.3
  • 47
    • 84961993377
    • Analysis of purely random forests bias
    • arXiv:1407.3939 arXiv preprint
    • Arlot, S., Genuer, R., Analysis of purely random forests bias. arXiv preprint arXiv:1407.3939, 2014.
    • (2014)
    • Arlot, S.1    Genuer, R.2
  • 48
    • 37349116573
    • Data Stream Management: Processing High-Speed Data Streams, Data-Centric Systems and Applications
    • Springer-Verlag Berlin, Heidelberg
    • Garofalakis, M., Gehrke, J., Rastogi, R., Data Stream Management: Processing High-Speed Data Streams, Data-Centric Systems and Applications. 2016, Springer-Verlag, Berlin, Heidelberg.
    • (2016)
    • Garofalakis, M.1    Gehrke, J.2    Rastogi, R.3


* This information was extracted by KISTI through analysis of Elsevier's SCOPUS database.