



Volume 422, Issue 1, 2012, Pages 36-41

The elusive short gene - an ensemble method for recognition for prokaryotic genome

Author keywords

Adaboost.M1; Computational gene finding; Ensemble classifier; Feature selection; Random forests; Short gene prediction

Indexed keywords

ARTICLE; BACTERIAL GENOME; CLASSIFIER; CODON USAGE; DNA BASE COMPOSITION; ENTEROBACTER; ESCHERICHIA COLI; FAST CORRELATION BASED FEATURE SELECTION; GENE STRUCTURE; GENETIC CODE; KLEBSIELLA PNEUMONIAE; MOLECULAR RECOGNITION; NONHUMAN; PRIORITY JOURNAL; SHORT GENE; YERSINIA PESTIS;

EID: 84861460183     PISSN: 0006291X     EISSN: 10902104     Source Type: Journal    
DOI: 10.1016/j.bbrc.2012.04.090     Document Type: Article
Times cited : (10)

References (59)
  • 1
    • 0033986936
    • Transcription attenuation: once viewed as a novel regulatory strategy
    • Yanofsky C. Transcription attenuation: once viewed as a novel regulatory strategy. Journal of Bacteriology 2000, 182:1-8.
    • (2000) Journal of Bacteriology , vol.182 , pp. 1-8
    • Yanofsky, C.1
  • 2
    • 0142058024
    • Endogenous production of antimicrobial peptides in innate immunity and human disease
    • Gallo R.L., Nizet V. Endogenous production of antimicrobial peptides in innate immunity and human disease. Current Allergy and Asthma Reports 2003, 3:402-409.
    • (2003) Current Allergy and Asthma Reports , vol.3 , pp. 402-409
    • Gallo, R.L.1    Nizet, V.2
  • 3
    • 33846796241
    • PetG and PetN, but not PetL, are essential subunits of the cytochrome b6f complex from Synechocystis PCC 6803
    • Schneider D., Volkmer T., Rogner M. PetG and PetN, but not PetL, are essential subunits of the cytochrome b6f complex from Synechocystis PCC 6803. Research in Microbiology 2007, 158:45-50.
    • (2007) Research in Microbiology , vol.158 , pp. 45-50
    • Schneider, D.1    Volkmer, T.2    Rogner, M.3
  • 4
    • 48949098526
    • Hepcidin the discovery of a small protein with a pivotal role in iron homeostasis
    • Sela B.A. Hepcidin the discovery of a small protein with a pivotal role in iron homeostasis. Harefuah 2008, 147:261-266, 276.
    • (2008) Harefuah , vol.147 , pp. 261-266, 276
    • Sela, B.A.1
  • 5
    • 56749151013
    • Small membrane proteins found by comparative genomics and ribosome binding site models
    • Hemm M.R., Paul B.J., Schneider T.D., Storz G., Rudd K.E. Small membrane proteins found by comparative genomics and ribosome binding site models. Molecular Microbiology 2008, 70:1487-1501.
    • (2008) Molecular Microbiology , vol.70 , pp. 1487-1501
    • Hemm, M.R.1    Paul, B.J.2    Schneider, T.D.3    Storz, G.4    Rudd, K.E.5
  • 8
    • 0019542835
    • Method to determine the reading frame of a protein from the purine/pyrimidine genome sequence and its possible evolutionary justification
    • Shepherd J.C. Method to determine the reading frame of a protein from the purine/pyrimidine genome sequence and its possible evolutionary justification. Proceedings of the National Academy of Sciences of the United States of America 1981, 78:1596-1600.
    • (1981) Proceedings of the National Academy of Sciences of the United States of America , vol.78 , pp. 1596-1600
    • Shepherd, J.C.1
  • 9
    • 0020039567
    • Codon preference and its use in identifying protein coding regions in long DNA sequences
    • Staden R., McLachlan A.D. Codon preference and its use in identifying protein coding regions in long DNA sequences. Nucleic Acids Research 1982, 10:141-156.
    • (1982) Nucleic Acids Research , vol.10 , pp. 141-156
    • Staden, R.1    McLachlan, A.D.2
  • 10
    • 0020480512
    • Recognition of protein coding regions in DNA sequences
    • Fickett J.W. Recognition of protein coding regions in DNA sequences. Nucleic Acids Research 1982, 10:5303-5318.
    • (1982) Nucleic Acids Research , vol.10 , pp. 5303-5318
    • Fickett, J.W.1
  • 11
    • 0022578582
    • Probability of coding of a DNA sequence: an algorithm to predict translated reading frames from their thermodynamic characteristics
    • Tramontano A., Macchiato M.F. Probability of coding of a DNA sequence: an algorithm to predict translated reading frames from their thermodynamic characteristics. Nucleic Acids Research 1986, 14:127-135.
    • (1986) Nucleic Acids Research , vol.14 , pp. 127-135
    • Tramontano, A.1    Macchiato, M.F.2
  • 12
    • 0023723196
    • Oligopeptide biases in protein sequences and their use in predicting protein coding regions in nucleotide sequences
    • McCaldon P., Argos P. Oligopeptide biases in protein sequences and their use in predicting protein coding regions in nucleotide sequences. Proteins 1988, 4:99-122.
    • (1988) Proteins , vol.4 , pp. 99-122
    • McCaldon, P.1    Argos, P.2
  • 13
    • 0034662286
    • Recognition of protein coding genes in the yeast genome at better than 95% accuracy based on the Z curve
    • Zhang C.T., Wang J. Recognition of protein coding genes in the yeast genome at better than 95% accuracy based on the Z curve. Nucleic Acids Research 2000, 28:2804-2814.
    • (2000) Nucleic Acids Research , vol.28 , pp. 2804-2814
    • Zhang, C.T.1    Wang, J.2
  • 14
    • 0034830866
    • Identification of protein-coding genes in the genome of Vibrio cholerae with more than 98% accuracy using occurrence frequencies of single nucleotides
    • Wang J., Zhang C.T. Identification of protein-coding genes in the genome of Vibrio cholerae with more than 98% accuracy using occurrence frequencies of single nucleotides. European Journal of Biochemistry 2001, 268:4261-4268.
    • (2001) European Journal of Biochemistry , vol.268 , pp. 4261-4268
    • Wang, J.1    Zhang, C.T.2
  • 15
    • 9644303060
    • A fractal method to distinguish coding and non-coding sequences in a complete genome based on a number sequence representation
    • Zhou L.Q., Yu Z.G., Deng J.Q., Anh V., Long S.C. A fractal method to distinguish coding and non-coding sequences in a complete genome based on a number sequence representation. Journal of Theoretical Biology 2005, 232:559-567.
    • (2005) Journal of Theoretical Biology , vol.232 , pp. 559-567
    • Zhou, L.Q.1    Yu, Z.G.2    Deng, J.Q.3    Anh, V.4    Long, S.C.5
  • 16
    • 27544514380
    • Identification of coding and non-coding sequences using local Holder exponent formalism
    • Kulkarni O.C., Vigneshwar R., Jayaraman V.K., Kulkarni B.D. Identification of coding and non-coding sequences using local Holder exponent formalism. Bioinformatics 2005, 21:3818-3823.
    • (2005) Bioinformatics , vol.21 , pp. 3818-3823
    • Kulkarni, O.C.1    Vigneshwar, R.2    Jayaraman, V.K.3    Kulkarni, B.D.4
  • 17
    • 0032519353
    • GeneMark.hmm: new solutions for gene finding
    • Lukashin A.V., Borodovsky M. GeneMark.hmm: new solutions for gene finding. Nucleic Acids Research 1998, 26:1107-1115.
    • (1998) Nucleic Acids Research , vol.26 , pp. 1107-1115
    • Lukashin, A.V.1    Borodovsky, M.2
  • 18
    • 0032415865
    • How to interpret an anonymous bacterial genome: machine learning approach to gene identification
    • Hayes W.S., Borodovsky M. How to interpret an anonymous bacterial genome: machine learning approach to gene identification. Genome Research 1998, 8:1154-1171.
    • (1998) Genome Research , vol.8 , pp. 1154-1171
    • Hayes, W.S.1    Borodovsky, M.2
  • 19
    • 0032526322
    • Combining diverse evidence for gene recognition in completely sequenced bacterial genomes
    • Frishman D., Mironov A., Mewes H.W., Gelfand M. Combining diverse evidence for gene recognition in completely sequenced bacterial genomes. Nucleic Acids Research 1998, 26:2941-2947.
    • (1998) Nucleic Acids Research , vol.26 , pp. 2941-2947
    • Frishman, D.1    Mironov, A.2    Mewes, H.W.3    Gelfand, M.4
  • 20
    • 0035875343
    • GeneMarkS: a self-training method for prediction of gene starts in microbial genomes. Implications for finding sequence motifs in regulatory regions
    • Besemer J., Lomsadze A., Borodovsky M. GeneMarkS: a self-training method for prediction of gene starts in microbial genomes. Implications for finding sequence motifs in regulatory regions. Nucleic Acids Research 2001, 29:2607-2618.
    • (2001) Nucleic Acids Research , vol.29 , pp. 2607-2618
    • Besemer, J.1    Lomsadze, A.2    Borodovsky, M.3
  • 21
    • 0026623434
    • The prediction of exons through an analysis of spliceable open reading frames
    • Hutchinson G.B., Hayden M.R. The prediction of exons through an analysis of spliceable open reading frames. Nucleic Acids Research 1992, 20:3453-3462.
    • (1992) Nucleic Acids Research , vol.20 , pp. 3453-3462
    • Hutchinson, G.B.1    Hayden, M.R.2
  • 23
    • 34147174475
    • MED: a new non-supervised gene prediction algorithm for bacterial and archaeal genomes
    • Zhu H., Hu G.Q., Yang Y.F., Wang J., She Z.S. MED: a new non-supervised gene prediction algorithm for bacterial and archaeal genomes. BMC Bioinformatics 2007, 8:97.
    • (2007) BMC Bioinformatics , vol.8 , pp. 97
    • Zhu, H.1    Hu, G.Q.2    Yang, Y.F.3    Wang, J.4    She, Z.S.5
  • 27
    • 0030905954
    • Base-base and deoxyribose-base stacking interactions in B-DNA and Z-DNA: a quantum-chemical study
    • Sponer J., Gabb H.A., Leszczynski J., Hobza P. Base-base and deoxyribose-base stacking interactions in B-DNA and Z-DNA: a quantum-chemical study. Biophysical Journal 1997, 73:76-87.
    • (1997) Biophysical Journal , vol.73 , pp. 76-87
    • Sponer, J.1    Gabb, H.A.2    Leszczynski, J.3    Hobza, P.4
  • 28
    • 84985667623
    • Stabilities of nearest-neighbor doublets in double-helical DNA determined by fitting calculated melting profiles to observed profiles
    • Gotoh O., Tagashira Y. Stabilities of nearest-neighbor doublets in double-helical DNA determined by fitting calculated melting profiles to observed profiles. Biopolymers 1981, 20:1033-1042.
    • (1981) Biopolymers , vol.20 , pp. 1033-1042
    • Gotoh, O.1    Tagashira, Y.2
  • 29
    • 0023204437
    • Importance of DNA stiffness in protein-DNA binding specificity
    • Hogan M.E., Austin R.H. Importance of DNA stiffness in protein-DNA binding specificity. Nature 1987, 329:263-266.
    • (1987) Nature , vol.329 , pp. 263-266
    • Hogan, M.E.1    Austin, R.H.2
  • 30
    • 0024300174
    • DNA sequence determinants of CAP-induced bending and protein binding affinity
    • Gartenberg M.R., Crothers D.M. DNA sequence determinants of CAP-induced bending and protein binding affinity. Nature 1988, 333:824-829.
    • (1988) Nature , vol.333 , pp. 824-829
    • Gartenberg, M.R.1    Crothers, D.M.2
  • 31
    • 0029859713
    • Improved thermodynamic parameters and helix initiation factor to predict stability of DNA duplexes
    • Sugimoto N., Nakano S., Yoneyama M., Honda K. Improved thermodynamic parameters and helix initiation factor to predict stability of DNA duplexes. Nucleic Acids Research 1996, 24:4501-4505.
    • (1996) Nucleic Acids Research , vol.24 , pp. 4501-4505
    • Sugimoto, N.1    Nakano, S.2    Yoneyama, M.3    Honda, K.4
  • 33
    • 40449135396
    • Determining promoter location based on DNA structure first-principles calculations
    • Goni J.R., Perez A., Torrents D., Orozco M. Determining promoter location based on DNA structure first-principles calculations. Genome Biology 2007, 8:R263.
    • (2007) Genome Biology , vol.8 , pp. R263
    • Goni, J.R.1    Perez, A.2    Torrents, D.3    Orozco, M.4
  • 34
  • 35
    • 0030448031
    • Combining structural analysis of DNA with search routines for the detection of transcription regulatory elements
    • Karas H., Knuppel R., Schulz W., Sklenar H., Wingender E. Combining structural analysis of DNA with search routines for the detection of transcription regulatory elements. Computer Applications in the Biosciences 1996, 12:441-446.
    • (1996) Computer Applications in the Biosciences , vol.12 , pp. 441-446
    • Karas, H.1    Knuppel, R.2    Schulz, W.3    Sklenar, H.4    Wingender, E.5
  • 37
    • 0030048596
    • Role of base-backbone and base-base interactions in alternating DNA conformations
    • Suzuki M., Yagi N., Finch J.T. Role of base-backbone and base-base interactions in alternating DNA conformations. FEBS Letters 1996, 379:148-152.
    • (1996) FEBS Letters , vol.379 , pp. 148-152
    • Suzuki, M.1    Yagi, N.2    Finch, J.T.3
  • 38
    • 0242290881
    • DNA basepair step deformability inferred from molecular dynamics simulations
    • Lankas F., Sponer J., Langowski J., Cheatham T.E. DNA basepair step deformability inferred from molecular dynamics simulations. Biophysical Journal 2003, 85:2872-2883.
    • (2003) Biophysical Journal , vol.85 , pp. 2872-2883
    • Lankas, F.1    Sponer, J.2    Langowski, J.3    Cheatham, T.E.4
  • 41
    • 33847329566
    • In search of the small ones: improved prediction of short exons in vertebrates, plants, fungi and protists
    • Saeys Y., Rouze P., Van De Peer Y. In search of the small ones: improved prediction of short exons in vertebrates, plants, fungi and protists. Bioinformatics 2006, 23:414-420.
    • (2006) Bioinformatics , vol.23 , pp. 414-420
    • Saeys, Y.1    Rouze, P.2    Van De Peer, Y.3
  • 44
    • 13544254591
    • Relationship between gene expression and GC-content in mammals: statistical significance and biological relevance
    • Semon M., Mouchiroud D., Duret L. Relationship between gene expression and GC-content in mammals: statistical significance and biological relevance. Human Molecular Genetics 2005, 14:421-427.
    • (2005) Human Molecular Genetics , vol.14 , pp. 421-427
    • Semon, M.1    Mouchiroud, D.2    Duret, L.3
  • 45
    • 0031586003
    • Prediction of complete gene structures in human genomic DNA
    • Burge C., Karlin S. Prediction of complete gene structures in human genomic DNA. Journal of Molecular Biology 1997, 268:78-94.
    • (1997) Journal of Molecular Biology , vol.268 , pp. 78-94
    • Burge, C.1    Karlin, S.2
  • 47
    • 0019824131
    • Correlation between the abundance of Escherichia coli transfer RNAs and the occurrence of the respective codons in its protein genes: a proposal for a synonymous codon choice that is optimal for the E. coli translational system
    • Ikemura T. Correlation between the abundance of Escherichia coli transfer RNAs and the occurrence of the respective codons in its protein genes: a proposal for a synonymous codon choice that is optimal for the E. coli translational system. Journal of Molecular Biology 1981, 151:389-409.
    • (1981) Journal of Molecular Biology , vol.151 , pp. 389-409
    • Ikemura, T.1
  • 48
    • 0023650543
    • The codon Adaptation Index-a measure of directional synonymous codon usage bias, and its potential applications
    • Sharp P.M., Li W.H. The codon Adaptation Index-a measure of directional synonymous codon usage bias, and its potential applications. Nucleic Acids Research 1987, 15:1281-1295.
    • (1987) Nucleic Acids Research , vol.15 , pp. 1281-1295
    • Sharp, P.M.1    Li, W.H.2
  • 49
    • 0242391986
    • Codon adaptation index as a measure of dominating codon bias
    • Carbone A., Zinovyev A., Kepes F. Codon adaptation index as a measure of dominating codon bias. Bioinformatics 2003, 19:2005-2015.
    • (2003) Bioinformatics , vol.19 , pp. 2005-2015
    • Carbone, A.1    Zinovyev, A.2    Kepes, F.3
  • 50
    • 0020475449
    • A simple method for displaying the hydropathic character of a protein
    • Kyte J., Doolittle R.F. A simple method for displaying the hydropathic character of a protein. Journal of Molecular Biology 1982, 157:105-132.
    • (1982) Journal of Molecular Biology , vol.157 , pp. 105-132
    • Kyte, J.1    Doolittle, R.F.2
  • 52
  • 53
    • 2942541354
    • Feature selection for splice site prediction: a new method using EDA-based feature ranking
    • Saeys Y., Degroeve S., Aeyels D., Rouze P., Van de Peer Y. Feature selection for splice site prediction: a new method using EDA-based feature ranking. BMC Bioinformatics 2004, 5:64.
    • (2004) BMC Bioinformatics , vol.5 , pp. 64
    • Saeys, Y.1    Degroeve, S.2    Aeyels, D.3    Rouze, P.4    Van de Peer, Y.5
  • 54
    • 1942451938
    • Feature selection for high-dimensional data: a fast correlation-based filter solution
    • Yu L., Liu H. Feature selection for high-dimensional data: a fast correlation-based filter solution. ICML 2003, 856-863.
    • (2003) ICML , pp. 856-863
    • Yu, L.1    Liu, H.2
  • 55
    • 0031361611
    • Machine-learning research: four current directions
    • Dietterich T. Machine-learning research: four current directions. AI Magazine 1997, 18:97-136.
    • (1997) AI Magazine , vol.18 , pp. 97-136
    • Dietterich, T.1
  • 56
    • 0030211964
    • Bagging Predictors
    • Breiman L. Bagging Predictors. Machine Learning 1996, 24:123-140.
    • (1996) Machine Learning , vol.24 , pp. 123-140
    • Breiman, L.1


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.