Volume 28, Issue 20, 2012, Pages 2615-2623

An integrated approach to reduce the impact of minor allele frequency and linkage disequilibrium on variable importance measures for genome-wide data

Author keywords

[No Author keywords available]

Indexed keywords

ALGORITHM; ARTICLE; EVALUATION; GENE FREQUENCY; GENE LINKAGE DISEQUILIBRIUM; GENETIC ASSOCIATION; HUMAN; HUMAN GENOME; SINGLE NUCLEOTIDE POLYMORPHISM;

EID: 84870416715     PISSN: 13674803     EISSN: 14602059     Source Type: Journal    
DOI: 10.1093/bioinformatics/bts483     Document Type: Article
Times cited : (11)

References (51)
  • 1
    • 0000937686
    • Tests for linear trends in proportions and frequencies
    • Armitage, P. (1955) Tests for linear trends in proportions and frequencies. Biometrics, 11, 375-386.
    • (1955) Biometrics , vol.11 , pp. 375-386
    • Armitage, P.1
  • 2
    • 78649523088
    • SNP selection in genome-wide and candidate gene studies via penalized logistic regression
    • Ayers, K.L. and Cordell, H.J. (2010) SNP selection in genome-wide and candidate gene studies via penalized logistic regression. Genet. Epidemiol, 34, 879-891.
    • (2010) Genet. Epidemiol , vol.34 , pp. 879-891
    • Ayers, K.L.1    Cordell, H.J.2
  • 3
    • 13444269543
    • Haploview: Analysis and visualization of LD and haplotype maps
    • Barrett, J.C. et al. (2005) Haploview: analysis and visualization of LD and haplotype maps. Bioinformatics, 21, 263-265.
    • (2005) Bioinformatics , vol.21 , pp. 263-265
    • Barrett, J.C.1
  • 4
    • 0037111753
    • A new bivariate binomial distribution
    • Biswas, A. and Hwang, J. (2002) A new bivariate binomial distribution. Stat. Probab. Lett., 60, 231-240.
    • (2002) Stat. Probab. Lett. , vol.60 , pp. 231-240
    • Biswas, A.1    Hwang, J.2
  • 5
    • 84861813244
    • Random forest Gini importance favours SNPs with large minor allele frequency: Impact, sources and recommendations
    • Boulesteix, A. et al. (2011) Random forest Gini importance favours SNPs with large minor allele frequency: impact, sources and recommendations. Brief. Bioinform, 13, 292-304.
    • (2011) Brief. Bioinform , vol.13 , pp. 292-304
    • Boulesteix, A.1
  • 6
    • 0035478854
    • Random forests
    • Breiman, L. (2001) Random forests. Mach. Learn., 45, 5-32.
    • (2001) Mach. Learn. , vol.45 , pp. 5-32
    • Breiman, L.1
  • 8
    • 12744259874
    • Identifying SNPs predictive of phenotype using random forests
    • Bureau, A. et al. (2005) Identifying SNPs predictive of phenotype using random forests. Genet. Epidemiol, 28, 171-182.
    • (2005) Genet. Epidemiol , vol.28 , pp. 171-182
    • Bureau, A.1
  • 10
    • 0001072895
    • The use of confidence or fiducial limits illustrated in the case of the binomial
    • Clopper, C. and Pearson, E. (1934) The use of confidence or fiducial limits illustrated in the case of the binomial. Biometrika, 26, 404-413.
    • (1934) Biometrika , vol.26 , pp. 404-413
    • Clopper, C.1    Pearson, E.2
  • 11
    • 62549085618
    • Human genetic variation and its contribution to complex traits
    • Frazer, K.A. et al. (2009) Human genetic variation and its contribution to complex traits. Nat. Rev. Genet., 10, 241-251.
    • (2009) Nat. Rev. Genet. , vol.10 , pp. 241-251
    • Frazer, K.A.1
  • 12
    • 0035470889
    • Greedy function approximation: A gradient boosting machine
    • Friedman, J.H. (2001) Greedy function approximation: a gradient boosting machine. Ann. Statist., 29(5), 1189-1232.
    • (2001) Ann. Statist. , vol.29 , Issue.5 , pp. 1189-1232
    • Friedman, J.H.1
  • 13
    • 84944811700
    • The use of ranks to avoid the assumption of normality implicit in the analysis of variance
    • Friedman, M. (1937) The use of ranks to avoid the assumption of normality implicit in the analysis of variance. J. Am. Stat. Ass., 32(200), 675-701.
    • (1937) J. Am. Stat. Ass. , vol.32 , Issue.200 , pp. 675-701
    • Friedman, M.1
  • 14
    • 65449132029
    • Evaluating the ability of tree-based methods and logistic regression for the detection of SNP-SNP interaction
    • Garcia-Magarinos, M. et al. (2009) Evaluating the ability of tree-based methods and logistic regression for the detection of SNP-SNP interaction. Ann. Hum. Genet., 73, 360-369.
    • (2009) Ann. Hum. Genet. , vol.73 , pp. 360-369
    • Garcia-Magarinos, M.1
  • 15
    • 77953351710
    • An application of random forests to a genome-wide association dataset: Methodological considerations & new findings
    • Goldstein, B.A. et al. (2010) An application of random forests to a genome-wide association dataset: methodological considerations & new findings. BMC Genet., 11, 49.
    • (2010) BMC Genet. , vol.11 , pp. 49
    • Goldstein, B.A.1
  • 17
    • 78650540144
    • A variable selection method for genome-wide association studies
    • He, Q. and Lin, D. (2011) A variable selection method for genome-wide association studies. Bioinformatics, 27(1), 1-8.
    • (2011) Bioinformatics , vol.27 , Issue.1 , pp. 1-8
    • He, Q.1    Lin, D.2
  • 18
    • 84943709252
    • Use of ranks in one-criterion variance analysis
    • Kruskal, W.H. and Wallis, W.A. (1952) Use of ranks in one-criterion variance analysis. J. Am. Stat. Ass., 47, 583-621.
    • (1952) J. Am. Stat. Ass. , vol.47 , pp. 583-621
    • Kruskal, W.H.1    Wallis, W.A.2
  • 19
    • 79951530319
    • The Bayesian lasso for genome-wide association studies
    • Li, J. et al. (2011a) The Bayesian lasso for genome-wide association studies. Bioinformatics, 27, 516-523.
    • (2011) Bioinformatics , vol.27 , pp. 516-523
    • Li, J.1
  • 20
    • 79959442576
    • Detecting epistatic effects in association studies at a genomic level based on an ensemble approach
    • Li, J. et al. (2011b) Detecting epistatic effects in association studies at a genomic level based on an ensemble approach. Bioinformatics, 27, i222-i229.
    • (2011) Bioinformatics , vol.27
    • Li, J.1
  • 21
    • 0345040873
    • Classification and regression by randomForest
    • Liaw, A. and Wiener, M. (2002) Classification and regression by randomForest. R News, 2, 18-22.
    • (2002) R News , vol.2 , pp. 18-22
    • Liaw, A.1    Wiener, M.2
  • 22
    • 25444453244
    • Screening large-scale association study data: Exploiting interactions using random forests
    • Lunetta, K.L. et al. (2004) Screening large-scale association study data: exploiting interactions using random forests. BMC Genet., 5, 32.
    • (2004) BMC Genet. , vol.5 , pp. 32
    • Lunetta, K.L.1
  • 23
    • 55549147191
    • Personal genomes: The case of the missing heritability
    • Maher, B. (2008) Personal genomes: the case of the missing heritability. Nature, 456, 18-21.
    • (2008) Nature , vol.456 , pp. 18-21
    • Maher, B.1
  • 24
    • 70349956433
    • Finding the missing heritability of complex diseases
    • Manolio, T.A. et al. (2009) Finding the missing heritability of complex diseases. Nature, 461, 747-753.
    • (2009) Nature , vol.461 , pp. 747-753
    • Manolio, T.A.1
  • 25
    • 42349112088
    • Genome-wide association studies for complex traits: Consensus, uncertainty and challenges
    • McCarthy, M.I. et al. (2008) Genome-wide association studies for complex traits: consensus, uncertainty and challenges. Nat. Rev. Genet., 9, 356-369.
    • (2008) Nat. Rev. Genet. , vol.9 , pp. 356-369
    • McCarthy, M.I.1
  • 26
    • 71849088051
    • Common variants in the Trichohyalin gene are associated with straight hair in Europeans
    • Medland, S. et al. (2009) Common variants in the Trichohyalin gene are associated with straight hair in Europeans. Am. J. Hum. Genet., 85, 750-755.
    • (2009) Am. J. Hum. Genet. , vol.85 , pp. 750-755
    • Medland, S.1
  • 27
    • 64549095229
    • Performance of random forest when SNPs are in linkage disequilibrium
    • Meng, Y.A. et al. (2009) Performance of random forest when SNPs are in linkage disequilibrium. BMC Bioinformatics, 10, 78.
    • (2009) BMC Bioinformatics , vol.10 , pp. 78
    • Meng, Y.A.1
  • 28
    • 0345411335
    • The ubiquitous nature of epistasis in determining susceptibility to common human diseases
    • Moore, J.H. (2003) The ubiquitous nature of epistasis in determining susceptibility to common human diseases. Hum. Hered., 56, 73-82.
    • (2003) Hum. Hered. , vol.56 , pp. 73-82
    • Moore, J.H.1
  • 29
    • 67650770061
    • Predictor correlation impacts machine learning algorithms: Implications for genomic studies
    • Nicodemus, K.K. and Malley, J.D. (2009) Predictor correlation impacts machine learning algorithms: implications for genomic studies. Bioinformatics, 25, 1884-1890.
    • (2009) Bioinformatics , vol.25 , pp. 1884-1890
    • Nicodemus, K.K.1    Malley, J.D.2
  • 30
    • 58149343568
    • Application of two machine learning algorithms to genetic association studies in the presence of covariates
    • Nonyane, B. and Foulkes, A.S. (2008) Application of two machine learning algorithms to genetic association studies in the presence of covariates. BMC Genet., 9, 71.
    • (2008) BMC Genet. , vol.9 , pp. 71
    • Nonyane, B.1    Foulkes, A.S.2
  • 31
    • 84859786369
    • A comparison of random forests, boosting and support vector machines for genomic selection
    • Ogutu, J.O. et al. (2011) A comparison of random forests, boosting and support vector machines for genomic selection. BMC Proc, 5(Suppl 3), S11.
    • (2011) BMC Proc , vol.5 , Issue.SUPPL. 3
    • Ogutu, J.O.1
  • 32
    • 77954133026
    • Estimation of effect size distribution from genome-wide association studies and implications for future discoveries
    • Park, J. et al. (2010) Estimation of effect size distribution from genome-wide association studies and implications for future discoveries. Nat. Genet., 42, 570-575.
    • (2010) Nat. Genet. , vol.42 , pp. 570-575
    • Park, J.1
  • 33
    • 81055145433
    • Distribution of allele frequencies and effect sizes and their interrelationships for common genetic susceptibility variants
    • Park, J. et al. (2011) Distribution of allele frequencies and effect sizes and their interrelationships for common genetic susceptibility variants. Proc. Natl Acad. Sci. U.S.A., 108, 18026-18031.
    • (2011) Proc. Natl Acad. Sci. U.S.A. , vol.108 , pp. 18026-18031
    • Park, J.1
  • 34
    • 79961135005
    • R Development Core Team. R Foundation for Statistical Computing, Vienna, Austria
    • R Development Core Team. (2011) R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria.
    • (2011) R: A Language and Environment for Statistical Computing
  • 36
    • 0034973569
    • Multifactor-dimensionality reduction reveals high-order interactions among estrogen-metabolism genes in sporadic breast cancer
    • Ritchie, M.D. et al. (2001) Multifactor-dimensionality reduction reveals high-order interactions among estrogen-metabolism genes in sporadic breast cancer. Am. J. Hum. Genet., 69, 138-147.
    • (2001) Am. J. Hum. Genet. , vol.69 , pp. 138-147
    • Ritchie, M.D.1
  • 37
    • 79955984463
    • Ranking causal variants and associated regions in genome-wide association studies by the support vector machine and random forest
    • Roshan, U. et al. (2011) Ranking causal variants and associated regions in genome-wide association studies by the support vector machine and random forest. Nucleic Acids Res., 39, e62.
    • (2011) Nucleic Acids Res. , vol.39
    • Roshan, U.1
  • 38
    • 53549131556
    • A bias correction algorithm for the Gini variable importance measure in classification trees
    • Sandri, M. and Zuccolotto, P. (2008) A bias correction algorithm for the Gini variable importance measure in classification trees. J. Comp. Graph. Stat., 17, 611-628.
    • (2008) J. Comp. Graph. Stat. , vol.17 , pp. 611-628
    • Sandri, M.1    Zuccolotto, P.2
  • 39
    • 77956880851
    • Analysis and correction of bias in total decrease in node impurity measures for tree-based algorithms
    • Sandri, M. and Zuccolotto, P. (2010) Analysis and correction of bias in total decrease in node impurity measures for tree-based algorithms. Stat Comput, 20, 393-407.
    • (2010) Stat Comput , vol.20 , pp. 393-407
    • Sandri, M.1    Zuccolotto, P.2
  • 40
    • 84861153094
    • Matrix eQTL: Ultra fast eQTL analysis via large matrix operations
    • Shabalin, A.A. (2012) Matrix eQTL: ultra fast eQTL analysis via large matrix operations. Bioinformatics, 28, 1353-1358.
    • (2012) Bioinformatics , vol.28 , pp. 1353-1358
    • Shabalin, A.A.1
  • 41
    • 80051807776
    • Uncovering the total heritability explained by all true susceptibility variants in a genome-wide association study
    • So, H. et al. (2011) Uncovering the total heritability explained by all true susceptibility variants in a genome-wide association study. Genet. Epidemiol, 35, 447-456.
    • (2011) Genet. Epidemiol , vol.35 , pp. 447-456
    • So, H.1
  • 42
    • 33847096395
    • Bias in random forest variable importance measures: Illustrations, sources and a solution
    • Strobl, C. et al. (2007) Bias in random forest variable importance measures: illustrations, sources and a solution. BMC Bioinform., 8, 25.
    • (2007) BMC Bioinform. , vol.8 , pp. 25
    • Strobl, C.1
  • 43
    • 48549095457
    • Conditional variable importance for random forests
    • Strobl, C. et al. (2008) Conditional variable importance for random forests. BMC Bioinform., 9, 307.
    • (2008) BMC Bioinform. , vol.9 , pp. 307
    • Strobl, C.1
  • 44
    • 71249151977
    • Machine learning in genome-wide association studies
    • Szymczak, S. et al. (2009) Machine learning in genome-wide association studies. Genet. Epidemiol, 33(Suppl 1), S51-S57.
    • (2009) Genet. Epidemiol , vol.33 , Issue.SUPPL. 1
    • Szymczak, S.1
  • 45
    • 85194972808
    • Regression shrinkage and selection via the lasso
    • Tibshirani, R. (1996) Regression shrinkage and selection via the lasso. J. R. Stat. Soc. Ser. B., 58, 267-288.
    • (1996) J. R. Stat. Soc. Ser. B. , vol.58 , pp. 267-288
    • Tibshirani, R.1
  • 46
    • 71249133809
    • Detecting significant single-nucleotide polymorphisms in a rheumatoid arthritis study using random forests
    • Wang, M. et al. (2009) Detecting significant single-nucleotide polymorphisms in a rheumatoid arthritis study using random forests. BMC Proc, 3(Suppl 7), S69.
    • (2009) BMC Proc , vol.3 , Issue.SUPPL. 7
    • Wang, M.1
  • 47
    • 13144265739
    • Genome-wide association studies: Theoretical and practical concerns
    • Wang, W.Y. et al. (2005) Genome-wide association studies: theoretical and practical concerns. Nat. Rev. Genet., 6, 109-118.
    • (2005) Nat. Rev. Genet. , vol.6 , pp. 109-118
    • Wang, W.Y.1
  • 48
    • 80054892937
    • An empirical comparison of several recent epistatic interaction detection methods
    • Wang, Y. et al. (2011) An empirical comparison of several recent epistatic interaction detection methods. Bioinformatics, 27, 2936-2943.
    • (2011) Bioinformatics , vol.27 , pp. 2936-2943
    • Wang, Y.1
  • 49
    • 77949865384
    • Screen and clean: A tool for identifying interactions in genome-wide association studies
    • Wu, J. et al. (2010) Screen and clean: a tool for identifying interactions in genome-wide association studies. Genet. Epidemiol, 34, 275-285.
    • (2010) Genet. Epidemiol , vol.34 , pp. 275-285
    • Wu, J.1
  • 50
    • 77954140531
    • Common SNPs explain a large proportion of the heritability for human height
    • Yang, J. et al. (2010) Common SNPs explain a large proportion of the heritability for human height. Nat. Genet., 42, 565-571.
    • (2010) Nat. Genet. , vol.42 , pp. 565-571
    • Yang, J.1


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.