Volume 4, Issue 3, 2005, Pages 195-203

Feature selection and the class imbalance problem in predicting protein function from sequence

Author keywords

[No Author keywords available]

Indexed keywords

ALGORITHM; AMINO ACID SEQUENCE; ARTICLE; COMPARATIVE STUDY; PERFORMANCE; PREDICTION; PROTEIN FUNCTION; SEQUENCE HOMOLOGY; TRAINING

EID: 27344448597     PISSN: 11755636     EISSN: 11755636     Source Type: Journal    
DOI: 10.2165/00822942-200504030-00004     Document Type: Article
Times cited: 95

References (31)
  • 1
    • 0030801002
    • Gapped BLAST and PSI-BLAST: A new generation of protein database search programs
    • Altschul SF, Madden TL, Schaffer AA, et al. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 1997; 25: 3389-402
    • (1997) Nucleic Acids Res , vol.25 , pp. 3389-3402
    • Altschul, S.F.1    Madden, T.L.2    Schaffer, A.A.3
  • 2
    • 0346799108
    • Prediction of protein function from protein sequence and structure
    • Whisstock JC, Lesk AM. Prediction of protein function from protein sequence and structure. Q Rev Biophys 2003; 36 (3): 307-40
    • (2003) Q Rev Biophys , vol.36 , Issue.3 , pp. 307-340
    • Whisstock, J.C.1    Lesk, A.M.2
  • 3
    • 0000875752
    • Accurate prediction of protein functional class in the Mycobacterium tuberculosis and Escherichia coli genomes using data mining
    • King RD, Karwath A, Clare A, et al. Accurate prediction of protein functional class in the Mycobacterium tuberculosis and Escherichia coli genomes using data mining. Yeast 2000; 17: 283-93
    • (2000) Yeast , vol.17 , pp. 283-293
    • King, R.D.1    Karwath, A.2    Clare, A.3
  • 4
    • 0037460964
    • Prediction of human protein function according to Gene Ontology categories
    • Jensen LJ, Gupta R, Staerfeldt HH, et al. Prediction of human protein function according to Gene Ontology categories. Bioinformatics 2003; 19: 635-42
    • (2003) Bioinformatics , vol.19 , pp. 635-642
    • Jensen, L.J.1    Gupta, R.2    Staerfeldt, H.H.3
  • 5
    • 33845536164
    • The class imbalance problem: A systematic study
    • Japkowicz N, Stephen S. The class imbalance problem: a systematic study. Intell Data Anal J 2002; 6 (5): 429-49
    • (2002) Intell Data Anal J , vol.6 , Issue.5 , pp. 429-449
    • Japkowicz, N.1    Stephen, S.2
  • 6
    • 27344432474
    • C4.5, class imbalance, and cost sensitivity: Why under-sampling beats over-sampling
    • Drummond C, Holte RC. C4.5, class imbalance, and cost sensitivity: why under-sampling beats over-sampling. Workshop on Learning from Imbalanced Datasets II; 2003 Jul 21; Washington, DC
    • (2003) Workshop on Learning from Imbalanced Datasets II
    • Drummond, C.1    Holte, R.C.2
  • 7
    • 1442356040
    • A multiple resampling method for learning from imbalanced data sets
    • Estabrooks A, Jo T, Japkowicz N. A multiple resampling method for learning from imbalanced data sets. Comput Intell 2004; 20 (1): 18-36
    • (2004) Comput Intell , vol.20 , Issue.1 , pp. 18-36
    • Estabrooks, A.1    Jo, T.2    Japkowicz, N.3
  • 12
    • 0032942501
    • Neisseria gonorrhoeae PilA is an FtsY homolog
    • Arvidson CG, Powers T, Walter P, et al. Neisseria gonorrhoeae PilA is an FtsY homolog. J Bacteriol 1999; 181: 731-9
    • (1999) J Bacteriol , vol.181 , pp. 731-739
    • Arvidson, C.G.1    Powers, T.2    Walter, P.3
  • 13
    • 0027133165
    • Functions of the gene products of Escherichia coli
    • Riley M. Functions of the gene products of Escherichia coli. Microbiol Rev 1993; 57: 862-952
    • (1993) Microbiol Rev , vol.57 , pp. 862-952
    • Riley, M.1
  • 15
    • 0030884601
    • A comparative study of duplications in bacteria and eukaryotes: The importance of telomeres
    • Coissac E, Maillier E, Netter P. A comparative study of duplications in bacteria and eukaryotes: the importance of telomeres. Mol Biol Evol 1997; 14 (10): 1062-74
    • (1997) Mol Biol Evol , vol.14 , Issue.10 , pp. 1062-1074
    • Coissac, E.1    Maillier, E.2    Netter, P.3
  • 17
    • 0003120218
    • Fast training of support vector machines using sequential minimal optimisation
    • Platt JC. Fast training of support vector machines using sequential minimal optimisation. In: Scholkopf B, Burges C, Smola A, editors. Advances in kernel methods: support vector learning. Cambridge (MA): MIT Press, 1999: 185-208
    • (1999) Advances in Kernel Methods: Support Vector Learning , pp. 185-208
    • Platt, J.C.1
  • 19
    • 0013326060
    • Feature selection for classification
    • Dash M, Liu H. Feature selection for classification. Intell Data Anal J 1997; 1 (3): 131-56
    • (1997) Intell Data Anal J , vol.1 , Issue.3 , pp. 131-156
    • Dash, M.1    Liu, H.2
  • 20
    • 0031381525
    • Wrappers for feature subset selection
    • Kohavi R, John GH. Wrappers for feature subset selection. Artif Intell 1997; 97 (1-2): 273-324
    • (1997) Artif Intell , vol.97 , Issue.1-2 , pp. 273-324
    • Kohavi, R.1    John, G.H.2
  • 24
    • 27344449159
    • An assessment of feature relevance in predicting protein function from sequence
    • Al-Shahib A, He C, Tan AC, et al. An assessment of feature relevance in predicting protein function from sequence. In: Proceedings of the Fifth International Conference on Intelligent Data Engineering and Automated Learning (IDEAL' 04). Lecture Notes in Computer Science. Volume 3177. Exeter: Springer-Verlag, 2004: 52-7
    • (2004) Lecture Notes in Computer Science , vol.3177 , pp. 52-57
    • Al-Shahib, A.1    He, C.2    Tan, A.C.3
  • 25
    • 49549139345
    • The area above the ordinal dominance graph and the area below the receiver operating characteristic graph
    • Bamber D. The area above the ordinal dominance graph and the area below the receiver operating characteristic graph. J Math Psychol 1975; 12: 387-415
    • (1975) J Math Psychol , vol.12 , pp. 387-415
    • Bamber, D.1
  • 26
    • 0001969211
    • Use of receiver operating characteristic (ROC) analysis to evaluate sequence matching
    • Gribskov M, Robinson NL. Use of receiver operating characteristic (ROC) analysis to evaluate sequence matching. Comput Chem 1996; 20: 25-33
    • (1996) Comput Chem , vol.20 , pp. 25-33
    • Gribskov, M.1    Robinson, N.L.2
  • 28
    • 33745561205
    • An introduction to variable and feature selection
    • Guyon I, Elisseeff A. An introduction to variable and feature selection. J Mach Learn Res 2003; 3: 1157-82
    • (2003) J Mach Learn Res , vol.3 , pp. 1157-1182
    • Guyon, I.1    Elisseeff, A.2
  • 29
    • 2942594395
    • Fast feature selection using a simple estimation of distribution algorithm: A case study on splice site prediction
    • Saeys Y, Degroeve S, Aeyels D, et al. Fast feature selection using a simple estimation of distribution algorithm: a case study on splice site prediction. Bioinformatics 2003; 19 Suppl. 2: ii179-88
    • (2003) Bioinformatics , vol.19 , Issue.2 SUPPL. , pp. ii179-ii188
    • Saeys, Y.1    Degroeve, S.2    Aeyels, D.3


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.