Volume 13, Issue 1, 2012, Pages

Short-read reading-frame predictors are not created equal: Sequence error causes loss of signal

Author keywords

Gene callers; Ab initio gene prediction; Gene prediction; Reading frames; Sequence errors; Short reads

Indexed keywords

AB INITIO; ANNOTATION SYSTEMS; ARTIFICIAL DATA; CODING REGION; COMPUTATIONAL BURDEN; CONTIGS; GENE PREDICTION; HIDDEN MARKOV; LOSS-OF-SIGNAL; MOST LIKELY; READING FRAMES; SEQUENCE ANALYSIS; SEQUENCING ERRORS; SHORT READS;

EID: 84864201343     PISSN: None     EISSN: 14712105     Source Type: Journal    
DOI: 10.1186/1471-2105-13-183     Document Type: Article
Times cited: 34

References (49)
  • 1
    • 53649106195
    • Next-generation DNA sequencing
    • 10.1038/nbt1486, 18846087
    • Shendure J, Ji H. Next-generation DNA sequencing. Nat Biotechnol 2008, 26(10):1135-1145. 10.1038/nbt1486, 18846087.
    • (2008) Nat Biotechnol , vol.26 , Issue.10 , pp. 1135-1145
    • Shendure, J.1    Ji, H.2
  • 3
    • 70449698364
    • Next-generation gap
    • 10.1038/nmeth.f.268, 19844227
    • McPherson J. Next-generation gap. Nat Methods 2009, 6(11s):S2-S5. 10.1038/nmeth.f.268, 19844227.
    • (2009) Nat Methods , vol.6 , Issue.11 S
    • McPherson, J.1
  • 4
    • 79955877212
    • RAPSearch: a fast protein similarity search tool for short reads
    • 10.1186/1471-2105-12-159, 3113943, 21575167
    • Ye Y, Choi J-H, Tang H. RAPSearch: a fast protein similarity search tool for short reads. BMC Bioinformatics 2011, 12(1):159. 10.1186/1471-2105-12-159, 3113943, 21575167.
    • (2011) BMC Bioinformatics , vol.12 , Issue.1 , pp. 159
    • Ye, Y.1    Choi, J.-H.2    Tang, H.3
  • 5
    • 80052121849
    • CloVR: a virtual machine for automated and portable sequence analysis from the desktop using cloud computing
    • 10.1186/1471-2105-12-356, 3228541, 21878105
    • Angiuoli S, Matalka M, Gussman A, Galens K, Vangala M, Riley D, Arze C, White J, White O, Fricke F. CloVR: a virtual machine for automated and portable sequence analysis from the desktop using cloud computing. BMC Bioinformatics 2011, 12(1):356. 10.1186/1471-2105-12-356, 3228541, 21878105.
    • (2011) BMC Bioinformatics , vol.12 , Issue.1 , pp. 356
    • Angiuoli, S.1    Matalka, M.2    Gussman, A.3    Galens, K.4    Vangala, M.5    Riley, D.6    Arze, C.7    White, J.8    White, O.9    Fricke, F.10
  • 6
    • 0032518163
    • Microbial gene identification using interpolated Markov models
    • 10.1093/nar/26.2.544, 147303, 9421513
    • Salzberg SL, Delcher AL, Kasif S, White O. Microbial gene identification using interpolated Markov models. Nucleic Acids Res 1998, 26(2):544-548. 10.1093/nar/26.2.544, 147303, 9421513.
    • (1998) Nucleic Acids Res , vol.26 , Issue.2 , pp. 544-548
    • Salzberg, S.L.1    Delcher, A.L.2    Kasif, S.3    White, O.4
  • 7
    • 0033214628
    • Heuristic approach to deriving models for gene finding
    • 10.1093/nar/27.19.3911, 148655, 10481031
    • Besemer J, Borodovsky M. Heuristic approach to deriving models for gene finding. Nucleic Acids Res 1999, 27(19):3911-3920. 10.1093/nar/27.19.3911, 148655, 10481031.
    • (1999) Nucleic Acids Res , vol.27 , Issue.19 , pp. 3911-3920
    • Besemer, J.1    Borodovsky, M.2
  • 8
    • 77955902981
    • Ab initio gene identification in metagenomic sequences
    • 10.1093/nar/gkq275, 2896542, 20403810
    • Zhu W, Lomsadze A, Borodovsky M. Ab initio gene identification in metagenomic sequences. Nucleic Acids Res 2010, 38(12):e132-e132. 10.1093/nar/gkq275, 2896542, 20403810.
    • (2010) Nucleic Acids Res , vol.38 , Issue.12
    • Zhu, W.1    Lomsadze, A.2    Borodovsky, M.3
  • 9
    • 59149090570
    • MetaGeneAnnotator: detecting species-specific patterns of ribosomal binding site for precise gene prediction in anonymous prokaryotic and phage genomes
    • 10.1093/dnares/dsn027, 2608843, 18940874
    • Noguchi H, Taniguchi T, Itoh T. MetaGeneAnnotator: detecting species-specific patterns of ribosomal binding site for precise gene prediction in anonymous prokaryotic and phage genomes. DNA Res 2008, 15(6):387-396. 10.1093/dnares/dsn027, 2608843, 18940874.
    • (2008) DNA Res , vol.15 , Issue.6 , pp. 387-396
    • Noguchi, H.1    Taniguchi, T.2    Itoh, T.3
  • 10
    • 67849095415
    • Predicting genes in metagenomic sequencing reads
    • 2703946, 19429689, Web Server issue
    • Hoff K, Lingner T, Meinicke P, Tech M O. Predicting genes in metagenomic sequencing reads. Nucleic Acids Res 2009, 37(Web Server issue):W101-105. 2703946, 19429689.
    • (2009) Nucleic Acids Res , vol.37
    • Hoff, K.1    Lingner, T.2    Meinicke, P.3    Tech, M.O.4
  • 11
    • 78651326786
    • FragGeneScan: predicting genes in short and error-prone reads
    • 10.1093/nar/gkq747, 2978382, 20805240
    • Rho M, Tang H, Ye Y. FragGeneScan: predicting genes in short and error-prone reads. Nucleic Acids Res 2010, 38(20):e191-e191. 10.1093/nar/gkq747, 2978382, 20805240.
    • (2010) Nucleic Acids Res , vol.38 , Issue.20
    • Rho, M.1    Tang, H.2    Ye, Y.3
  • 12
    • 77952299957
    • Prodigal: prokaryotic gene recognition and translation initiation site identification
    • 10.1186/1471-2105-11-119, 2848648, 20211023
    • Hyatt D, Chen G, LoCascio P, Land M, Larimer F, Hauser L. Prodigal: prokaryotic gene recognition and translation initiation site identification. BMC Bioinformatics 2010, 11(1):119. 10.1186/1471-2105-11-119, 2848648, 20211023.
    • (2010) BMC Bioinformatics , vol.11 , Issue.1 , pp. 119
    • Hyatt, D.1    Chen, G.2    LoCascio, P.3    Land, M.4    Larimer, F.5    Hauser, L.6
  • 13
    • 53549118607
    • The Metagenomics RAST server - a public resource for the automatic phylogenetic and functional analysis of metagenomes
    • 10.1186/1471-2105-9-386, 2563014, 18803844
    • Meyer F, Paarmann D, D'Souza M, Olson R, Glass EM, Kubal M, Paczian T, Rodriguez A, Stevens R, Wilke A, et al. The Metagenomics RAST server - a public resource for the automatic phylogenetic and functional analysis of metagenomes. BMC Bioinformatics 2008, 9(1):386-388. 10.1186/1471-2105-9-386, 2563014, 18803844.
    • (2008) BMC Bioinformatics , vol.9 , Issue.1 , pp. 386-388
    • Meyer, F.1    Paarmann, D.2    D'Souza, M.3    Olson, R.4    Glass, E.M.5    Kubal, M.6    Paczian, T.7    Rodriguez, A.8    Stevens, R.9    Wilke, A.10
  • 14
    • 84862503319
    • The M5nr: a novel non-redundant database containing protein sequences and annotations from multiple sources and associated tools
    • 10.1186/1471-2105-13-141, 3410781, 22720753
    • Wilke A, Harrison T, Wilkening J, Field D, Glass EM, Kyrpides N, Mavrommatis K, Meyer F. The M5nr: a novel non-redundant database containing protein sequences and annotations from multiple sources and associated tools. BMC Bioinformatics 2012, 13:141. 10.1186/1471-2105-13-141, 3410781, 22720753.
    • (2012) BMC Bioinformatics , vol.13 , pp. 141
    • Wilke, A.1    Harrison, T.2    Wilkening, J.3    Field, D.4    Glass, E.M.5    Kyrpides, N.6    Mavrommatis, K.7    Meyer, F.8
  • 15
    • 33947215475
    • CAMERA: a community resource for metagenomics
    • 10.1371/journal.pbio.0050075, 1821059, 17355175
    • Seshadri R, Kravitz S, Smarr L, Gilna P, Frazier M. CAMERA: a community resource for metagenomics. PLoS Biol 2007, 5(3):e75. 10.1371/journal.pbio.0050075, 1821059, 17355175.
    • (2007) PLoS Biol , vol.5 , Issue.3
    • Seshadri, R.1    Kravitz, S.2    Smarr, L.3    Gilna, P.4    Frazier, M.5
  • 16
    • 79956366635
    • The JCVI standard operating procedure for annotating prokaryotic metagenomic shotgun sequencing data
    • 10.4056/sigs.651139, 3035284, 21304707
    • Tanenbaum D, Goll J, Murphy S, Kumar P, Zafar N, Thiagarajan M, Madupu R, Davidsen T, Kagan L, Kravitz S, et al. The JCVI standard operating procedure for annotating prokaryotic metagenomic shotgun sequencing data. Stand Genomic Sci 2010, 2(2):229-237. 10.4056/sigs.651139, 3035284, 21304707.
    • (2010) Stand Genomic Sci , vol.2 , Issue.2 , pp. 229-237
    • Tanenbaum, D.1    Goll, J.2    Murphy, S.3    Kumar, P.4    Zafar, N.5    Thiagarajan, M.6    Madupu, R.7    Davidsen, T.8    Kagan, L.9    Kravitz, S.10
  • 17
    • 78449297680
    • SmashCommunity: a metagenomic annotation and analysis tool
    • 10.1093/bioinformatics/btq536, 20959381
    • Arumugam M, Harrington E, Foerstner K, Raes J, Bork P. SmashCommunity: a metagenomic annotation and analysis tool. Bioinformatics 2010, 26(23):2977-2978. 10.1093/bioinformatics/btq536, 20959381.
    • (2010) Bioinformatics , vol.26 , Issue.23 , pp. 2977-2978
    • Arumugam, M.1    Harrington, E.2    Foerstner, K.3    Raes, J.4    Bork, P.5
  • 18
    • 79960030549
    • CoMet--a web server for comparative functional profiling of metagenomes
    • 3125781, 21622656, Web Server issue
    • Lingner T, Aßhauer K, Schreiber F, Meinicke P. CoMet--a web server for comparative functional profiling of metagenomes. Nucleic Acids Res 2011, 39(Web Server issue):W518-W523. 3125781, 21622656.
    • (2011) Nucleic Acids Res , vol.39
    • Lingner, T.1    Aßhauer, K.2    Schreiber, F.3    Meinicke, P.4
  • 22
    • 0036226603
    • BLAT-the BLAST-like alignment tool
    • 187518, 11932250
    • Kent J. BLAT-the BLAST-like alignment tool. Genome Res 2002, 12(4):656-664. 187518, 11932250.
    • (2002) Genome Res , vol.12 , Issue.4 , pp. 656-664
    • Kent, J.1
  • 23
    • 34147132825
    • Identifying bacterial genes and endosymbiont DNA with Glimmer
    • 10.1093/bioinformatics/btm009, 2387122, 17237039
    • Delcher A, Bratke K, Powers E, Salzberg S. Identifying bacterial genes and endosymbiont DNA with Glimmer. Bioinformatics 2007, 23(6):673-679. 10.1093/bioinformatics/btm009, 2387122, 17237039.
    • (2007) Bioinformatics , vol.23 , Issue.6 , pp. 673-679
    • Delcher, A.1    Bratke, K.2    Powers, E.3    Salzberg, S.4
  • 25
    • 34948842866
    • Accuracy and quality of massively parallel DNA pyrosequencing
    • 10.1186/gb-2007-8-7-r143, 2323236, 17659080
    • Huse S, Huber J, Morrison H, Sogin M, Welch D. Accuracy and quality of massively parallel DNA pyrosequencing. Genome Biol 2007, 8(7):R143. 10.1186/gb-2007-8-7-r143, 2323236, 17659080.
    • (2007) Genome Biol , vol.8 , Issue.7
    • Huse, S.1    Huber, J.2    Morrison, H.3    Sogin, M.4    Welch, D.5
  • 26
    • 77950645212
    • Artificial and natural duplicates in pyrosequencing reads of metagenomic data
    • Niu B, Fu L, Sun S, Li W. Artificial and natural duplicates in pyrosequencing reads of metagenomic data. BMC Bioinforma 2010, 11(1):187.
    • (2010) BMC Bioinforma , vol.11 , Issue.1 , pp. 187
    • Niu, B.1    Fu, L.2    Sun, S.3    Li, W.4
  • 27
    • 77956837921
    • Model-based quality assessment and base-calling for second-generation sequencing data
    • 10.1111/j.1541-0420.2009.01353.x, 2888717, 19912177
    • Bravo H, Irizarry R. Model-based quality assessment and base-calling for second-generation sequencing data. Biometrics 2010, 66(3):665-674. 10.1111/j.1541-0420.2009.01353.x, 2888717, 19912177.
    • (2010) Biometrics , vol.66 , Issue.3 , pp. 665-674
    • Bravo, H.1    Irizarry, R.2
  • 28
    • 79956044149
    • Accuracy and quality assessment of 454 GS-FLX Titanium pyrosequencing
    • 10.1186/1471-2164-12-245, 3116506, 21592414
    • Gilles A, Meglécz E, Pech N, Ferreira S, Malausa T, Martin J-F. Accuracy and quality assessment of 454 GS-FLX Titanium pyrosequencing. BMC Genomics 2011, 12(1):245. 10.1186/1471-2164-12-245, 3116506, 21592414.
    • (2011) BMC Genomics , vol.12 , Issue.1 , pp. 245
    • Gilles, A.1    Meglécz, E.2    Pech, N.3    Ferreira, S.4    Malausa, T.5    Martin, J.-F.6
  • 29
    • 84864044436
    • A platform-independent method for detecting errors in metagenomic sequencing data: drisee
    • 10.1371/journal.pcbi.1002541, 3369934, 22685393
    • Keegan K, Trimble W, Wilkening J, Wilke A, Harrison T, D'Souza M, Meyer F. A platform-independent method for detecting errors in metagenomic sequencing data: drisee. PLoS Comput Biol 2012, 8(6):e1002541. 10.1371/journal.pcbi.1002541, 3369934, 22685393.
    • (2012) PLoS Comput Biol , vol.8 , Issue.6
    • Keegan, K.1    Trimble, W.2    Wilkening, J.3    Wilke, A.4    Harrison, T.5    D'Souza, M.6    Meyer, F.7
  • 30
    • 33750976398
    • MetaGene: prokaryotic gene finding from environmental genome shotgun sequences
    • 10.1093/nar/gkl723, 1636498, 17028096
    • Noguchi H, Park J, Takagi T. MetaGene: prokaryotic gene finding from environmental genome shotgun sequences. Nucleic Acids Res 2006, 34(19):5623-5630. 10.1093/nar/gkl723, 1636498, 17028096.
    • (2006) Nucleic Acids Res , vol.34 , Issue.19 , pp. 5623-5630
    • Noguchi, H.1    Park, J.2    Takagi, T.3
  • 31
    • 74049094093
    • The effect of sequencing errors on metagenomic gene prediction
    • 10.1186/1471-2164-10-520, 2781827, 19909532
    • Hoff K. The effect of sequencing errors on metagenomic gene prediction. BMC Genomics 2009, 10(1):520. 10.1186/1471-2164-10-520, 2781827, 19909532.
    • (2009) BMC Genomics , vol.10 , Issue.1 , pp. 520
    • Hoff, K.1
  • 32
    • 0027305701
    • Symmetry observations in long nucleotide sequences
    • 10.1093/nar/21.12.2797, 309655, 8332488
    • Prabhu VV. Symmetry observations in long nucleotide sequences. Nucleic Acids Res 1993, 21(12):2797-2800. 10.1093/nar/21.12.2797, 309655, 8332488.
    • (1993) Nucleic Acids Res , vol.21 , Issue.12 , pp. 2797-2800
    • Prabhu, V.V.1
  • 33
    • 0028828990
    • Relative roles of primary sequence and (G + C)% in determining the hierarchy of frequencies of complementary trinucleotide pairs in DNAs of different species
    • Forsdyke DR. Relative roles of primary sequence and (G + C)% in determining the hierarchy of frequencies of complementary trinucleotide pairs in DNAs of different species. J Mol Evol 1995, 41(5):573-581.
    • (1995) J Mol Evol , vol.41 , Issue.5 , pp. 573-581
    • Forsdyke, D.R.1
  • 34
    • 1842725797
    • Source and receiver behavior in the use of a criterion
    • Egan J, Clarke F. Source and receiver behavior in the use of a criterion. J Acoust Soc Am 1956, 28(6):1267-1269.
    • (1956) J Acoust Soc Am , vol.28 , Issue.6 , pp. 1267-1269
    • Egan, J.1    Clarke, F.2
  • 35
    • 77953676254
    • Genetack: frameshift identification in protein-coding sequences by the Viterbi algorithm
    • 10.1142/S0219720010004847, 20556861
    • Antonov I, Borodovsky M. Genetack: frameshift identification in protein-coding sequences by the Viterbi algorithm. J Bioinform Comput Biol 2010, 8(3):535-551. 10.1142/S0219720010004847, 20556861.
    • (2010) J Bioinform Comput Biol , vol.8 , Issue.3 , pp. 535-551
    • Antonov, I.1    Borodovsky, M.2
  • 37
    • 79251587455
    • Metagenomic discovery of biomass-degrading genes and genomes from cow rumen
    • 10.1126/science.1200387, 21273488
    • Hess M, Sczyrba A, Egan R, Kim T-W, Chokhawala H, Schroth G, Luo S, Clark D, Chen F, Zhang T, et al. Metagenomic discovery of biomass-degrading genes and genomes from cow rumen. Science 2011, 331(6016):463-467. 10.1126/science.1200387, 21273488.
    • (2011) Science , vol.331 , Issue.6016 , pp. 463-467
    • Hess, M.1    Sczyrba, A.2    Egan, R.3    Kim, T.-W.4    Chokhawala, H.5    Schroth, G.6    Luo, S.7    Clark, D.8    Chen, F.9    Zhang, T.10
  • 39
    • 78651266519
    • Combining gene prediction methods to improve metagenomic gene annotation
    • 10.1186/1471-2105-12-20, 3042383, 21232129
    • Yok N, Rosen G. Combining gene prediction methods to improve metagenomic gene annotation. BMC Bioinformatics 2011, 12(1):20. 10.1186/1471-2105-12-20, 3042383, 21232129.
    • (2011) BMC Bioinformatics , vol.12 , Issue.1 , pp. 20
    • Yok, N.1    Rosen, G.2
  • 40
    • 1542327668
    • Trends between gene content and genome size in prokaryotic species with larger genomes
    • 10.1073/pnas.0308653100, 365760, 14973198
    • Konstantinidis K, Tiedje J. Trends between gene content and genome size in prokaryotic species with larger genomes. Proc Natl Acad Sci USA 2004, 101(9):3160-3165. 10.1073/pnas.0308653100, 365760, 14973198.
    • (2004) Proc Natl Acad Sci USA , vol.101 , Issue.9 , pp. 3160-3165
    • Konstantinidis, K.1    Tiedje, J.2
  • 44
    • 84865102235
    • Measuring the microbiome: perspectives on advances in DNA-based techniques for exploring microbial life
    • 10.1093/bib/bbr080, 3404397, 22308073
    • Foster J, Bunge J, Gilbert J, Moore J. Measuring the microbiome: perspectives on advances in DNA-based techniques for exploring microbial life. Brief Bioinform 2012, 13(4):420-429. 10.1093/bib/bbr080, 3404397, 22308073.
    • (2012) Brief Bioinform , vol.13 , Issue.4 , pp. 420-429
    • Foster, J.1    Bunge, J.2    Gilbert, J.3    Moore, J.4
  • 45
    • 40549141418
    • Metagenomics: read length matters
    • 10.1128/AEM.02181-07, 2258652, 18192407
    • Wommack E, Bhavsar J, Ravel J. Metagenomics: read length matters. Appl Environ Microbiol 2008, 74(5):1453-1463. 10.1128/AEM.02181-07, 2258652, 18192407.
    • (2008) Appl Environ Microbiol , vol.74 , Issue.5 , pp. 1453-1463
    • Wommack, E.1    Bhavsar, J.2    Ravel, J.3
  • 46
    • 13444306641
    • NCBI Reference Sequence RefSeq: a curated non-redundant sequence database of genomes, transcripts and proteins
    • 539979, 15608248
    • Pruitt K, Tatusova T, Maglott D. NCBI Reference Sequence RefSeq: a curated non-redundant sequence database of genomes, transcripts and proteins. Nucleic Acids Res 2005, 33(suppl 1):D501-D504. 539979, 15608248.
    • (2005) Nucleic Acids Res , vol.33 , Issue.SUPPL. 1
    • Pruitt, K.1    Tatusova, T.2    Maglott, D.3
  • 47
    • 54949137701
    • MetaSim-a sequencing simulator for genomics and metagenomics
    • 10.1371/journal.pone.0003373, 2556396, 18841204
    • Richter D, Ott F, Auch A, Schmid R, Huson D. MetaSim-a sequencing simulator for genomics and metagenomics. PLoS One 2008, 3(10):e3373. 10.1371/journal.pone.0003373, 2556396, 18841204.
    • (2008) PLoS One , vol.3 , Issue.10
    • Richter, D.1    Ott, F.2    Auch, A.3    Schmid, R.4    Huson, D.5
  • 48
    • 0001585584
    • Operating characteristics determined by binary decisions and by ratings
    • Egan J, Schulman A, Greenberg G. Operating characteristics determined by binary decisions and by ratings. J Acoust Soc Am 1959, 31(6):768-773.
    • (1959) J Acoust Soc Am , vol.31 , Issue.6 , pp. 768-773
    • Egan, J.1    Schulman, A.2    Greenberg, G.3
  • 49
    • 44649090674
    • Gene prediction in metagenomic fragments: a large scale machine learning approach
    • 10.1186/1471-2105-9-217, 2409338, 18442389
    • Hoff K, Tech M, Lingner T, Daniel R, Morgenstern B, Meinicke P. Gene prediction in metagenomic fragments: a large scale machine learning approach. BMC Bioinformatics 2008, 9(1):217. 10.1186/1471-2105-9-217, 2409338, 18442389.
    • (2008) BMC Bioinformatics , vol.9 , Issue.1 , pp. 217
    • Hoff, K.1    Tech, M.2    Lingner, T.3    Daniel, R.4    Morgenstern, B.5    Meinicke, P.6


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.