Volume 15, Issue 1, 2014, Pages

Assessment and refinement of eukaryotic gene structure prediction with gene-structure-aware multiple protein sequence alignment

Author keywords

Cytochrome P450; Gene prediction; Gene structure; Genome annotation; Multiple sequence alignment; Ribosomal proteins; Spliced alignment

Indexed keywords

AMINO ACIDS; FORECASTING; ITERATIVE METHODS; PROTEINS;

EID: 84902051183     PISSN: None     EISSN: 14712105     Source Type: Journal    
DOI: 10.1186/1471-2105-15-189     Document Type: Article
Times cited: 33

References (58)
  • 1
    • 71049176982
    • Genome 10K: a proposal to obtain whole-genome sequence for 10,000 vertebrate species
    • 2877544, 19892720
    • Haussler D, O'Brien SJ, Ryder OA. Genome 10K: a proposal to obtain whole-genome sequence for 10,000 vertebrate species. J Hered 2009, 100:659-674. 2877544, 19892720.
    • (2009) J Hered , vol.100 , pp. 659-674
    • Haussler, D.1    O'Brien, S.J.2    Ryder, O.A.3
  • 3
    • 84904728421
    • 3-Million Genomes Project
    • 3-Million Genomes Project. http://www.nationalgenebank.org/en/research.html.
  • 5
    • 42949086676
    • Automated eukaryotic gene structure annotation using EVidenceModeler and the Program to Assemble Spliced Alignments
    • 10.1186/gb-2008-9-1-r7, 2395244, 18190707
    • Haas BJ, Salzberg SL, Zhu W, Pertea M, Allen JE, Orvis J, White O, Buell CR, Wortman JR. Automated eukaryotic gene structure annotation using EVidenceModeler and the Program to Assemble Spliced Alignments. Genome biology 2008, 9:R7. 10.1186/gb-2008-9-1-r7, 2395244, 18190707.
    • (2008) Genome biology , vol.9
    • Haas, B.J.1    Salzberg, S.L.2    Zhu, W.3    Pertea, M.4    Allen, J.E.5    Orvis, J.6    White, O.7    Buell, C.R.8    Wortman, J.R.9
  • 8
    • 84856090271
    • PSICOV: precise structural contact prediction using sparse inverse covariance estimation on large multiple sequence alignments
    • 10.1093/bioinformatics/btr638, 22101153
    • Jones DT, Buchan DW, Cozzetto D, Pontil M. PSICOV: precise structural contact prediction using sparse inverse covariance estimation on large multiple sequence alignments. Bioinformatics 2012, 28:184-190. 10.1093/bioinformatics/btr638, 22101153.
    • (2012) Bioinformatics , vol.28 , pp. 184-190
    • Jones, D.T.1    Buchan, D.W.2    Cozzetto, D.3    Pontil, M.4
  • 9
    • 84884603324
    • Assessing the utility of coevolution-based residue-residue contact predictions in a sequence- and structure-rich era
    • 10.1073/pnas.1314045110, 3785744, 24009338
    • Kamisetty H, Ovchinnikov S, Baker D. Assessing the utility of coevolution-based residue-residue contact predictions in a sequence- and structure-rich era. Proc Natl Acad Sci U S A 2013, 110:15674-15679. 10.1073/pnas.1314045110, 3785744, 24009338.
    • (2013) Proc Natl Acad Sci U S A , vol.110 , pp. 15674-15679
    • Kamisetty, H.1    Ovchinnikov, S.2    Baker, D.3
  • 10
    • 0027136282
    • Comparative protein modelling by satisfaction of spatial restraints
    • 10.1006/jmbi.1993.1626, 8254673
    • Sali A, Blundell TL. Comparative protein modelling by satisfaction of spatial restraints. J Mol Biol 1993, 234:779-815. 10.1006/jmbi.1993.1626, 8254673.
    • (1993) J Mol Biol , vol.234 , pp. 779-815
    • Sali, A.1    Blundell, T.L.2
  • 11
  • 12
    • 84859898660
    • A beginner's guide to eukaryotic genome annotation
    • 10.1038/nrg3174, 22510764
    • Yandell M, Ence D. A beginner's guide to eukaryotic genome annotation. Nat Rev Genet 2012, 13:329-342. 10.1038/nrg3174, 22510764.
    • (2012) Nat Rev Genet , vol.13 , pp. 329-342
    • Yandell, M.1    Ence, D.2
  • 14
    • 68149134694
    • The DAWGPAWS pipeline for the annotation of genes and transposable elements in plant genomes
    • 10.1186/1746-4811-5-8, 2705364, 19545381
    • Estill JC, Bennetzen JL. The DAWGPAWS pipeline for the annotation of genes and transposable elements in plant genomes. Plant methods 2009, 5:8. 10.1186/1746-4811-5-8, 2705364, 19545381.
    • (2009) Plant methods , vol.5 , pp. 8
    • Estill, J.C.1    Bennetzen, J.L.2
  • 15
    • 84859729407
    • Origin and evolution of spliceosomal introns
    • 10.1186/1745-6150-7-11, 3488318, 22507701
    • Rogozin IB, Carmel L, Csuros M, Koonin EV. Origin and evolution of spliceosomal introns. Biology direct 2012, 7:11. 10.1186/1745-6150-7-11, 3488318, 22507701.
    • (2012) Biology direct , vol.7 , pp. 11
    • Rogozin, I.B.1    Carmel, L.2    Csuros, M.3    Koonin, E.V.4
  • 16
    • 84904767186
    • New York: Humana Press - Springer, Methods in Molecular Biology, Volume 1079
    • Russell DJ. Multiple sequence alignment methods 2013, New York: Humana Press - Springer, Methods in Molecular Biology, Volume 1079.
    • (2013) Multiple sequence alignment methods
    • Russell, D.J.1
  • 17
    • 0030582739
    • Significant improvement in accuracy of multiple protein sequence alignments by iterative refinement as assessed by reference to structural alignments
    • 10.1006/jmbi.1996.0679, 8980688
    • Gotoh O. Significant improvement in accuracy of multiple protein sequence alignments by iterative refinement as assessed by reference to structural alignments. J Mol Biol 1996, 264:823-838. 10.1006/jmbi.1996.0679, 8980688.
    • (1996) J Mol Biol , vol.264 , pp. 823-838
    • Gotoh, O.1
  • 18
    • 0032789448
    • Multiple sequence alignment: algorithms and applications
    • Gotoh O. Multiple sequence alignment: algorithms and applications. Adv Biophys 1999, 36:159-206.
    • (1999) Adv Biophys , vol.36 , pp. 159-206
    • Gotoh, O.1
  • 19
    • 0034129662
    • Homology-based gene structure prediction: simplified matching algorithm using a translated codon (tron) and improved accuracy by allowing for long gaps
    • 10.1093/bioinformatics/16.3.190, 10869012
    • Gotoh O. Homology-based gene structure prediction: simplified matching algorithm using a translated codon (tron) and improved accuracy by allowing for long gaps. Bioinformatics 2000, 16:190-202. 10.1093/bioinformatics/16.3.190, 10869012.
    • (2000) Bioinformatics , vol.16 , pp. 190-202
    • Gotoh, O.1
  • 20
    • 0028041177
    • Further improvement in methods of group-to-group sequence alignment with generalized profile operations
    • Gotoh O. Further improvement in methods of group-to-group sequence alignment with generalized profile operations. Comput Appl Biosci 1994, 10:379-387.
    • (1994) Comput Appl Biosci , vol.10 , pp. 379-387
    • Gotoh, O.1
  • 21
    • 0023375315
    • Profile analysis: detection of distantly related proteins
    • 10.1073/pnas.84.13.4355, 305087, 3474607
    • Gribskov M, McLachlan AD, Eisenberg D. Profile analysis: detection of distantly related proteins. Proc Natl Acad Sci U S A 1987, 84:4355-4358. 10.1073/pnas.84.13.4355, 305087, 3474607.
    • (1987) Proc Natl Acad Sci U S A , vol.84 , pp. 4355-4358
    • Gribskov, M.1    McLachlan, A.D.2    Eisenberg, D.3
  • 23
    • 0030801002
    • Gapped BLAST and PSI-BLAST: a new generation of protein database search programs
    • 10.1093/nar/25.17.3389, 146917, 9254694
    • Altschul SF, Madden TL, Schaffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 1997, 25:3389-3402. 10.1093/nar/25.17.3389, 146917, 9254694.
    • (1997) Nucleic Acids Res , vol.25 , pp. 3389-3402
    • Altschul, S.F.1    Madden, T.L.2    Schaffer, A.A.3    Zhang, J.4    Zhang, Z.5    Miller, W.6    Lipman, D.J.7
  • 24
    • 0033578684
    • Protein secondary structure prediction based on position-specific scoring matrices
    • 10.1006/jmbi.1999.3091, 10493868
    • Jones DT. Protein secondary structure prediction based on position-specific scoring matrices. J Mol Biol 1999, 292:195-202. 10.1006/jmbi.1999.3091, 10493868.
    • (1999) J Mol Biol , vol.292 , pp. 195-202
    • Jones, D.T.1
  • 25
    • 79958058080
    • Protein sequence comparison and fold recognition: progress and good-practice benchmarking
    • 10.1016/j.sbi.2011.03.005, 21458982
    • Soding J, Remmert M. Protein sequence comparison and fold recognition: progress and good-practice benchmarking. Curr Opin Struct Biol 2011, 21:404-411. 10.1016/j.sbi.2011.03.005, 21458982.
    • (2011) Curr Opin Struct Biol , vol.21 , pp. 404-411
    • Soding, J.1    Remmert, M.2
  • 26
    • 13744252890
    • MAFFT version 5: improvement in accuracy of multiple sequence alignment
    • 10.1093/nar/gki198, 548345, 15661851
    • Katoh K, Kuma K, Toh H, Miyata T. MAFFT version 5: improvement in accuracy of multiple sequence alignment. Nucleic Acids Res 2005, 33:511-518. 10.1093/nar/gki198, 548345, 15661851.
    • (2005) Nucleic Acids Res , vol.33 , pp. 511-518
    • Katoh, K.1    Kuma, K.2    Toh, H.3    Miyata, T.4
  • 27
    • 84869023660
    • Benchmarking spliced alignment programs including Spaln2, an extended version of Spaln that incorporates additional species-specific features
    • 10.1093/nar/gks708, 3488211, 22848105
    • Iwata H, Gotoh O. Benchmarking spliced alignment programs including Spaln2, an extended version of Spaln that incorporates additional species-specific features. Nucleic Acids Res 2012, 40:e161. 10.1093/nar/gks708, 3488211, 22848105.
    • (2012) Nucleic Acids Res , vol.40
    • Iwata, H.1    Gotoh, O.2
  • 28
    • 2442713832
    • GeneWise and Genomewise
    • 10.1101/gr.1865504, 479130, 15123596
    • Birney E, Clamp M, Durbin R. GeneWise and Genomewise. Genome Res 2004, 14:988-995. 10.1101/gr.1865504, 479130, 15123596.
    • (2004) Genome Res , vol.14 , pp. 988-995
    • Birney, E.1    Clamp, M.2    Durbin, R.3
  • 29
    • 1342263801
    • Gene structure conservation aids similarity based gene prediction
    • 10.1093/nar/gkh211, 373336, 14764925
    • Meyer IM, Durbin R. Gene structure conservation aids similarity based gene prediction. Nucleic Acids Res 2004, 32:776-783. 10.1093/nar/gkh211, 373336, 14764925.
    • (2004) Nucleic Acids Res , vol.32 , pp. 776-783
    • Meyer, I.M.1    Durbin, R.2
  • 30
    • 54949090429
    • Direct mapping and alignment of protein sequences onto genomic sequence
    • 10.1093/bioinformatics/btn460, 18728043
    • Gotoh O. Direct mapping and alignment of protein sequences onto genomic sequence. Bioinformatics 2008, 24:2438-2444. 10.1093/bioinformatics/btn460, 18728043.
    • (2008) Bioinformatics , vol.24 , pp. 2438-2444
    • Gotoh, O.1
  • 31
    • 33748674197
    • AUGUSTUS at EGASP: using EST, protein and genomic alignments for improved gene prediction in the human genome
    • 11-18, 1810548, 16925833
    • Stanke M, Tzvetkova A, Morgenstern B. AUGUSTUS at EGASP: using EST, protein and genomic alignments for improved gene prediction in the human genome. Genome Biol 2006, 7(1):S11. 11-18, 1810548, 16925833.
    • (2006) Genome Biol , vol.7 , Issue.1
    • Stanke, M.1    Tzvetkova, A.2    Morgenstern, B.3
  • 32
    • 24644485994
    • JIGSAW: integration of multiple sources of evidence for gene prediction
    • 10.1093/bioinformatics/bti609, 16076884
    • Allen JE, Salzberg SL. JIGSAW: integration of multiple sources of evidence for gene prediction. Bioinformatics 2005, 21:3596-3603. 10.1093/bioinformatics/bti609, 16076884.
    • (2005) Bioinformatics , vol.21 , pp. 3596-3603
    • Allen, J.E.1    Salzberg, S.L.2
  • 33
    • 28844443988
    • Species-specific variation of alternative splicing and transcriptional initiation in six eukaryotes
    • Nagasaki H, Arita M, Nishizawa T, Suwa M, Gotoh O. Species-specific variation of alternative splicing and transcriptional initiation in six eukaryotes. Gene 2005, 364:53-62.
    • (2005) Gene , vol.364 , pp. 53-62
    • Nagasaki, H.1    Arita, M.2    Nishizawa, T.3    Suwa, M.4    Gotoh, O.5
  • 34
    • 33646484347
    • Genomewide comparative analysis of alternative splicing in plants
    • 10.1073/pnas.0602039103, 1459036, 16632598
    • Wang BB, Brendel V. Genomewide comparative analysis of alternative splicing in plants. Proc Natl Acad Sci U S A 2006, 103:7175-7180. 10.1073/pnas.0602039103, 1459036, 16632598.
    • (2006) Proc Natl Acad Sci U S A , vol.103 , pp. 7175-7180
    • Wang, B.B.1    Brendel, V.2
  • 37
    • 70349645555
    • Evidence-based gene predictions in plant genomes
    • 10.1101/gr.088997.108, 2765265, 19541913
    • Liang C, Mao L, Ware D, Stein L. Evidence-based gene predictions in plant genomes. Genome Res 2009, 19:1912-1923. 10.1101/gr.088997.108, 2765265, 19541913.
    • (2009) Genome Res , vol.19 , pp. 1912-1923
    • Liang, C.1    Mao, L.2    Ware, D.3    Stein, L.4
  • 40
    • 84865790047
    • An integrated encyclopedia of DNA elements in the human genome
    • 10.1038/nature11247, 3439153, 22955616
    • Bernstein BE, Birney E, Dunham I, Green ED, Gunter C, Snyder M. An integrated encyclopedia of DNA elements in the human genome. Nature 2012, 489:57-74. 10.1038/nature11247, 3439153, 22955616.
    • (2012) Nature , vol.489 , pp. 57-74
    • Bernstein, B.E.1    Birney, E.2    Dunham, I.3    Green, E.D.4    Gunter, C.5    Snyder, M.6
  • 41
    • 52449084495
    • Identification and correction of abnormal, incomplete and mispredicted proteins in public databases
    • 10.1186/1471-2105-9-353, 2542381, 18752676
    • Nagy A, Hegyi H, Farkas K, Tordai H, Kozma E, Banyai L, Patthy L. Identification and correction of abnormal, incomplete and mispredicted proteins in public databases. BMC Bioinformatics 2008, 9:353. 10.1186/1471-2105-9-353, 2542381, 18752676.
    • (2008) BMC Bioinformatics , vol.9 , pp. 353
    • Nagy, A.1    Hegyi, H.2    Farkas, K.3    Tordai, H.4    Kozma, E.5    Banyai, L.6    Patthy, L.7
  • 42
    • 84885938116
    • MisPred: a resource for identification of erroneous protein sequences in public databases
    • bat053, 3713709, 23864220
    • Nagy A, Patthy L. MisPred: a resource for identification of erroneous protein sequences in public databases. Database 2013, 2013:bat053. 3713709, 23864220.
    • (2013) Database , vol.2013
    • Nagy, A.1    Patthy, L.2
  • 43
    • 84875368659
    • The 1KP Project
    • The 1KP Project. http://onekp.com/project.html.
  • 44
    • 84904744730
    • Alignment Program
    • Alignment Program. http://www.genome.ist.i.kyoto-u.ac.jp/~aln_user/.
  • 45
    • 84904754784
    • UniGene
    • UniGene. http://www.ncbi.nlm.nih.gov/unigene.
  • 46
    • 84904740792
    • GenBank
    • GenBank. http://www.ncbi.nlm.nih.gov/genbank.
  • 47
    • 0347432399
    • PlantGDB, plant genome database and analysis tools
    • 10.1093/nar/gkh046, 308780, 14681433
    • Dong Q, Schlueter SD, Brendel V. PlantGDB, plant genome database and analysis tools. Nucleic Acids Res 2004, 32:D354-D359. 10.1093/nar/gkh046, 308780, 14681433.
    • (2004) Nucleic Acids Res , vol.32
    • Dong, Q.1    Schlueter, S.D.2    Brendel, V.3
  • 48
    • 43349092271
    • A space-efficient and accurate method for mapping and aligning cDNA sequences onto genomic sequence
    • 10.1093/nar/gkn105, 2377433, 18344523
    • Gotoh O. A space-efficient and accurate method for mapping and aligning cDNA sequences onto genomic sequence. Nucleic Acids Res 2008, 36:2630-2638. 10.1093/nar/gkn105, 2377433, 18344523.
    • (2008) Nucleic Acids Res , vol.36 , pp. 2630-2638
    • Gotoh, O.1
  • 49
    • 0026691182
    • The rapid generation of mutation data matrices from protein sequences
    • Jones DT, Taylor WR, Thornton JM. The rapid generation of mutation data matrices from protein sequences. Comp Appl Biosci 1992, 8:275-282.
    • (1992) Comp Appl Biosci , vol.8 , pp. 275-282
    • Jones, D.T.1    Taylor, W.R.2    Thornton, J.M.3
  • 50
    • 0000228203
    • A model of evolutionary change in proteins
    • Silver Spring, MD: National Biomedical Research Foundation, Dayhoff MO, 5
    • Dayhoff MO, Schwartz RM, Orcutt BC. A model of evolutionary change in proteins. Atlas of protein sequence and structure, Volume 3 1978, 345-352. Silver Spring, MD: National Biomedical Research Foundation, Dayhoff MO, 5.
    • (1978) Atlas of protein sequence and structure, Volume 3 , pp. 345-352
    • Dayhoff, M.O.1    Schwartz, R.M.2    Orcutt, B.C.3
  • 52
    • 0027210418
    • Optimal alignment between groups of sequences and its application to multiple sequence alignment
    • Gotoh O. Optimal alignment between groups of sequences and its application to multiple sequence alignment. Comp Appl Biosci 1993, 9:361-370.
    • (1993) Comp Appl Biosci , vol.9 , pp. 361-370
    • Gotoh, O.1
  • 53
    • 33846581070
    • Improvement in accuracy of multiple sequence alignment using novel group-to-group sequence alignment algorithm with piecewise linear gap cost
    • 10.1186/1471-2105-7-524, 1769516, 17137519
    • Yamada S, Gotoh O, Yamana H. Improvement in accuracy of multiple sequence alignment using novel group-to-group sequence alignment algorithm with piecewise linear gap cost. BMC Bioinformatics 2006, 7:524. 10.1186/1471-2105-7-524, 1769516, 17137519.
    • (2006) BMC Bioinformatics , vol.7 , pp. 524
    • Yamada, S.1    Gotoh, O.2    Yamana, H.3
  • 54
    • 78651503221
    • Comparative analysis of information contents relevant to recognition of introns in many species
    • 10.1186/1471-2164-12-45, 3033335, 21247441
    • Iwata H, Gotoh O. Comparative analysis of information contents relevant to recognition of introns in many species. BMC Genomics 2011, 12:45. 10.1186/1471-2164-12-45, 3033335, 21247441.
    • (2011) BMC Genomics , vol.12 , pp. 45
    • Iwata, H.1    Gotoh, O.2
  • 55
    • 84891820737
    • Improvement in Speed and Accuracy of Multiple Sequence Alignment Program PRIME
    • Yamada S, Gotoh O, Yamana H. Improvement in Speed and Accuracy of Multiple Sequence Alignment Program PRIME. Inform Media Tech 2009, 4:317-327.
    • (2009) Inform Media Tech , vol.4 , pp. 317-327
    • Yamada, S.1    Gotoh, O.2    Yamana, H.3
  • 56
    • 33746173456
    • Critical values for six Dixon tests for outliers in normal samples up to sizes 100, and applications in science and engineering
    • Verma SP, Quiroz-Ruiz A. Critical values for six Dixon tests for outliers in normal samples up to sizes 100, and applications in science and engineering. Revista Mexicana de Ciencias Geológicas 2006, 23:133-161.
    • (2006) Revista Mexicana de Ciencias Geológicas , vol.23 , pp. 133-161
    • Verma, S.P.1    Quiroz-Ruiz, A.2
  • 57
    • 0024569996
    • Secondary structure prediction of 52 membrane-bound cytochromes P450 shows a strong structural similarity to P450cam
    • 10.1021/bi00428a036, 2713336
    • Nelson DR, Strobel HW. Secondary structure prediction of 52 membrane-bound cytochromes P450 shows a strong structural similarity to P450cam. Biochemistry 1989, 28:656-660. 10.1021/bi00428a036, 2713336.
    • (1989) Biochemistry , vol.28 , pp. 656-660
    • Nelson, D.R.1    Strobel, H.W.2
  • 58
    • 0026542989
    • Substrate recognition sites in cytochrome P450 family 2 (CYP2) proteins inferred from comparative analyses of amino acid and coding nucleotide sequences
    • Gotoh O. Substrate recognition sites in cytochrome P450 family 2 (CYP2) proteins inferred from comparative analyses of amino acid and coding nucleotide sequences. J Biol Chem 1992, 267:83-90.
    • (1992) J Biol Chem , vol.267 , pp. 83-90
    • Gotoh, O.1


* This information was extracted and analyzed by KISTI from Elsevier's SCOPUS database.