Volume 14, 2013

An AUC-based permutation variable importance measure for random forests

Author keywords

Area under the curve; Class imbalance; Conditional inference trees; Feature selection; Random forest; Unbalanced data; Variable importance measure

Indexed keywords

AREA UNDER THE CURVES; CLASS IMBALANCE; CONDITIONAL INFERENCE; RANDOM FORESTS; UNBALANCED DATA; VARIABLE IMPORTANCES

EID: 84875818068     PISSN: None     EISSN: 14712105     Source Type: Journal    
DOI: 10.1186/1471-2105-14-119     Document Type: Article
Times cited: 195

References (31)
  • 1
  • 4
    • 53349100175
    • Pathway analysis of single-nucleotide polymorphisms potentially associated with glioblastoma multiforme susceptibility using random forests
    • 10.1158/1055-9965.EPI-07-2830, 18559551
    • Chang J, Yeh R, Wiencke J, Wiemels J, Smirnov I, Pico A, Tihan T, Patoka J, Miike R, Sison J. Pathway analysis of single-nucleotide polymorphisms potentially associated with glioblastoma multiforme susceptibility using random forests. Cancer Epidemiol Biomarkers Prev 2008, 17(6):1368-1373. 10.1158/1055-9965.EPI-07-2830, 18559551.
    • (2008) Cancer Epidemiol Biomarkers Prev , vol.17 , Issue.6 , pp. 1368-1373
    • Chang, J.1    Yeh, R.2    Wiencke, J.3    Wiemels, J.4    Smirnov, I.5    Pico, A.6    Tihan, T.7    Patoka, J.8    Miike, R.9    Sison, J.10
  • 5
    • 79955562075
    • A genome-wide screen of gene-gene interactions for rheumatoid arthritis susceptibility
    • 10.1007/s00439-010-0943-z, 21210282
    • Liu C, Ackerman H, Carulli J. A genome-wide screen of gene-gene interactions for rheumatoid arthritis susceptibility. Hum Genet 2011, 129(5):473-485. 10.1007/s00439-010-0943-z, 21210282.
    • (2011) Hum Genet , vol.129 , Issue.5 , pp. 473-485
    • Liu, C.1    Ackerman, H.2    Carulli, J.3
  • 6
    • 77952095729
    • Evidence of statistical epistasis between DISC1, CIT and NDEL1 impacting risk for schizophrenia: biological validation with functional neuroimaging
    • 10.1007/s00439-009-0782-y, 20084519
    • Nicodemus K, Callicott J, Higier R, Luna A, Nixon D, Lipska B, Vakkalanka R, Giegling I, Rujescu D, Clair D. Evidence of statistical epistasis between DISC1, CIT and NDEL1 impacting risk for schizophrenia: biological validation with functional neuroimaging. Hum Genet 2010, 127(4):441-452. 10.1007/s00439-009-0782-y, 20084519.
    • (2010) Hum Genet , vol.127 , Issue.4 , pp. 441-452
    • Nicodemus, K.1    Callicott, J.2    Higier, R.3    Luna, A.4    Nixon, D.5    Lipska, B.6    Vakkalanka, R.7    Giegling, I.8    Rujescu, D.9    Clair, D.10
  • 7
    • 38049048625
    • Classification of rheumatoid arthritis status with candidate gene and genome-wide single-nucleotide polymorphisms using random forests
    • 10.1186/1753-6561-1-s1-s62, 2367463, 18466563
    • Sun Y, Cai Z, Desai K, Lawrance R, Leff R, Jawaid A, Kardia S, Yang H. Classification of rheumatoid arthritis status with candidate gene and genome-wide single-nucleotide polymorphisms using random forests. BMC Proceedings 2007, 1(Suppl 1):S62. 10.1186/1753-6561-1-s1-s62, 2367463, 18466563.
    • (2007) BMC Proceedings , vol.1 , Issue.SUPPL. 1
    • Sun, Y.1    Cai, Z.2    Desai, K.3    Lawrance, R.4    Leff, R.5    Jawaid, A.6    Kardia, S.7    Yang, H.8
  • 8
    • 77957988489
    • Class prediction for high-dimensional class-imbalanced data
    • 10.1186/1471-2105-11-523, 3098087, 20961420
    • Blagus R, Lusa L. Class prediction for high-dimensional class-imbalanced data. BMC Bioinformatics 2010, 11:523. 10.1186/1471-2105-11-523, 3098087, 20961420.
    • (2010) BMC Bioinformatics , vol.11 , pp. 523
    • Blagus, R.1    Lusa, L.2
  • 9
    • 84873193842
    • Class-imbalanced classifiers for high-dimensional data
    • Lin WJ, Chen J. Class-imbalanced classifiers for high-dimensional data. Brief Bioinform 2012,
    • (2012) Brief Bioinform
    • Lin, W.J.1    Chen, J.2
  • 10
    • 48649089002
    • An empirical study of learning from imbalanced data using random forest
    • ICTAI 2007: 19th IEEE International Conference on, Volume 2, IEEE
    • Khoshgoftaar T, Golawala M, Van Hulse J. An empirical study of learning from imbalanced data using random forest. 19th IEEE International Conference on Tools with Artificial Intelligence (ICTAI 2007), Volume 2, IEEE 2007, 310-317.
    • (2007) Tools with Artificial Intelligence, 2007 , pp. 310-317
    • Khoshgoftaar, T.1    Golawala, M.2    Van Hulse, J.3
  • 11
    • 33646142788
    • Evaluation of neural networks and data mining methods on a credit assessment task for class imbalance problem
    • Huang Y, Hung C, Jiau H. Evaluation of neural networks and data mining methods on a credit assessment task for class imbalance problem. Nonlinear Analysis: Real World Applications 2006, 7(4):720-747.
    • (2006) Nonlinear Analysis: Real World Applications , vol.7 , Issue.4 , pp. 720-747
    • Huang, Y.1    Hung, C.2    Jiau, H.3
  • 13
    • 0031998121
    • Machine learning for the detection of oil spills in satellite radar images
    • Kubat M, Holte R, Matwin S. Machine learning for the detection of oil spills in satellite radar images. Machine Learning 1998, 30(2):195-215.
    • (1998) Machine Learning , vol.30 , Issue.2 , pp. 195-215
    • Kubat, M.1    Holte, R.2    Matwin, S.3
  • 15
    • 58349116623
    • Customer churn prediction using improved balanced random forests
    • Xie Y, Li X, Ngai E, Ying W. Customer churn prediction using improved balanced random forests. Expert Systems with Applications 2009, 36(3):5445-5449.
    • (2009) Expert Systems with Applications , vol.36 , Issue.3 , pp. 5445-5449
    • Xie, Y.1    Li, X.2    Ngai, E.3    Ying, W.4
  • 16
    • 27144531570
    • A study of the behavior of several methods for balancing machine learning training data
    • Batista G, Prati R, Monard M. A study of the behavior of several methods for balancing machine learning training data. ACM SIGKDD Explorations Newsletter 2004, 6:20-29.
    • (2004) ACM SIGKDD Explorations Newsletter , vol.6 , pp. 20-29
    • Batista, G.1    Prati, R.2    Monard, M.3
  • 17
    • 1442356040
    • A multiple resampling method for learning from imbalanced data sets
    • Estabrooks A, Jo T, Japkowicz N. A multiple resampling method for learning from imbalanced data sets. Computational Intelligence 2004, 20:18-36.
    • (2004) Computational Intelligence , vol.20 , pp. 18-36
    • Estabrooks, A.1    Jo, T.2    Japkowicz, N.3
  • 19
    • 71749101234
    • Knowledge discovery from imbalanced and noisy data
    • 10.1016/j.datak.2009.08.005, 23573530
    • Van Hulse J, Khoshgoftaar T. Knowledge discovery from imbalanced and noisy data. Data & Knowledge Engineering 2009, 68(12):1513-1542. 10.1016/j.datak.2009.08.005, 23573530.
    • (2009) Data & Knowledge Engineering , vol.68 , Issue.12 , pp. 1513-1542
    • Van Hulse, J.1    Khoshgoftaar, T.2
  • 20
    • 33845536164
    • The class imbalance problem: A systematic study
    • Japkowicz N, Stephen S. The class imbalance problem: A systematic study. Intelligent Data Analysis 2002, 6(5):429-449.
    • (2002) Intelligent Data Analysis , vol.6 , Issue.5 , pp. 429-449
    • Japkowicz, N.1    Stephen, S.2
  • 21
    • 79960872876
    • Predicting disease risks from highly imbalanced data using random forest
    • 10.1186/1472-6947-11-51, 3163175, 21801360
    • Khalilia M, Chakraborty S, Popescu M. Predicting disease risks from highly imbalanced data using random forest. BMC Med Inform Decis Mak 2011, 11:51. 10.1186/1472-6947-11-51, 3163175, 21801360.
    • (2011) BMC Med Inform Decis Mak , vol.11 , pp. 51
    • Khalilia, M.1    Chakraborty, S.2    Popescu, M.3
  • 22
    • 33847096395
    • Bias in random forest variable importance measures: Illustrations, sources and a solution
    • 10.1186/1471-2105-8-25, 1796903, 17254353
    • Strobl C, Boulesteix AL, Zeileis A, Hothorn T. Bias in random forest variable importance measures: Illustrations, sources and a solution. BMC Bioinformatics 2007, 8:25. 10.1186/1471-2105-8-25, 1796903, 17254353.
    • (2007) BMC Bioinformatics , vol.8 , pp. 25
    • Strobl, C.1    Boulesteix, A.L.2    Zeileis, A.3    Hothorn, T.4
  • 23
    • 67650770061
    • Predictor correlation impacts machine learning algorithms: implications for genomic studies
    • 10.1093/bioinformatics/btp331, 19460890
    • Nicodemus KK, Malley JD. Predictor correlation impacts machine learning algorithms: implications for genomic studies. Bioinformatics 2009, 25(15):1884-1890. 10.1093/bioinformatics/btp331, 19460890.
    • (2009) Bioinformatics , vol.25 , Issue.15 , pp. 1884-1890
    • Nicodemus, K.K.1    Malley, J.D.2
  • 24
    • 82255174148
    • Letter to the editor: On the stability and ranking of predictors from random forest variable importance measures
    • 10.1093/bib/bbr016, 3137934, 21498552
    • Nicodemus KK. Letter to the editor: On the stability and ranking of predictors from random forest variable importance measures. Brief Bioinform 2011, 12(4):369-373. 10.1093/bib/bbr016, 3137934, 21498552.
    • (2011) Brief Bioinform , vol.12 , Issue.4 , pp. 369-373
    • Nicodemus, K.K.1
  • 25
    • 84861813244
    • Random forest Gini importance favours SNPs with large minor allele frequency: assessment, sources and recommendations
    • 10.1093/bib/bbr053, 21908865
    • Boulesteix AL, Bender A, Bermejo JL, Strobl C. Random forest Gini importance favours SNPs with large minor allele frequency: assessment, sources and recommendations. Brief Bioinform 2012, 13:292-304. 10.1093/bib/bbr053, 21908865.
    • (2012) Brief Bioinform , vol.13 , pp. 292-304
    • Boulesteix, A.L.1    Bender, A.2    Bermejo, J.L.3    Strobl, C.4
  • 26
    • 80053915297
    • AUC-RF: A new strategy for genomic profiling with random forest
    • 10.1159/000330778, 21996641
    • Calle M, Urrea V, Boulesteix AL, Malats N. AUC-RF: A new strategy for genomic profiling with random forest. Hum Hered 2011, 72(2):121-132. 10.1159/000330778, 21996641.
    • (2011) Hum Hered , vol.72 , Issue.2 , pp. 121-132
    • Calle, M.1    Urrea, V.2    Boulesteix, A.L.3    Malats, N.4
  • 27
    • 33749677657
    • Unbiased recursive partitioning: A conditional inference framework
    • Hothorn T, Hornik K, Zeileis A. Unbiased recursive partitioning: A conditional inference framework. J Comput Graph Stat 2006, 15(3):651-674.
    • (2006) J Comput Graph Stat , vol.15 , Issue.3 , pp. 651-674
    • Hothorn, T.1    Hornik, K.2    Zeileis, A.3
  • 30
    • 13244255317
    • Simple statistical models predict C-to-U edited sites in plant mitochondrial RNA
    • 10.1186/1471-2105-5-132, 521485, 15373947
    • Cummings M, Myers D. Simple statistical models predict C-to-U edited sites in plant mitochondrial RNA. BMC Bioinformatics 2004, 5:132. 10.1186/1471-2105-5-132, 521485, 15373947.
    • (2004) BMC Bioinformatics , vol.5 , pp. 132
    • Cummings, M.1    Myers, D.2
  • 31
    • 77949388276
    • The behavior of random forest permutation-based variable importance measures under predictor correlation
    • 10.1186/1471-2105-11-110, 2848005, 20187966
    • Nicodemus KK, Malley J, Strobl C, Ziegler A. The behavior of random forest permutation-based variable importance measures under predictor correlation. BMC Bioinformatics 2010, 11:110. 10.1186/1471-2105-11-110, 2848005, 20187966.
    • (2010) BMC Bioinformatics , vol.11 , pp. 110
    • Nicodemus, K.K.1    Malley, J.2    Strobl, C.3    Ziegler, A.4


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.