Volume 13, Issue 1, 2012

Sifting through genomes with iterative-sequence clustering produces a large, phylogenetically diverse protein-family resource

Author keywords

[No Author keywords available]

Indexed keywords

BIOLOGICAL DATA; COMPUTATIONAL RESOURCES; FULLY AUTOMATED; GENOME SEQUENCING; GENOME SEQUENCING PROJECTS; GLOBAL ALIGNMENT; HIGH-THROUGHPUT; HOMOLOGOUS PROTEINS; IDENTIFICATION OF PROTEINS; ITERATIVE CLUSTERING; METAGENOMES; MULTIPLE SEQUENCE ALIGNMENTS; NEW APPROACHES; NOVEL PROTEINS; PHYLOGENETIC TREES; PROTEIN FAMILY; PROTEIN SEQUENCES; QUALITY METRICS; RAPID IDENTIFICATION; SOURCE CODES

EID: 84867294452     PISSN: None     EISSN: 14712105     Source Type: Journal    
DOI: 10.1186/1471-2105-13-264     Document Type: Article
Times cited: 16

References (32)
  • 1
    • 57149085868
    • Genomics of bacteria and archaea: the emerging dynamic view of the prokaryotic world
    • 10.1093/nar/gkn668, 2588523, 18948295
    • Koonin EV, Wolf YI. Genomics of bacteria and archaea: the emerging dynamic view of the prokaryotic world. Nucleic Acids Res 2008, 36(21):6688-6719. 10.1093/nar/gkn668, 2588523, 18948295.
    • (2008) Nucleic Acids Res , vol.36 , Issue.21 , pp. 6688-6719
    • Koonin, E.V.1    Wolf, Y.I.2
  • 2
    • 0035945586
    • Genome sequence of enterohaemorrhagic Escherichia coli O157:H7
    • 10.1038/35054089, 11206551
    • Perna NT, et al. Genome sequence of enterohaemorrhagic Escherichia coli O157:H7. Nature 2001, 409(6819):529-533. 10.1038/35054089, 11206551.
    • (2001) Nature , vol.409 , Issue.6819 , pp. 529-533
    • Perna, N.T.1
  • 3
    • 25444524604
    • Genome analysis of multiple pathogenic isolates of Streptococcus agalactiae: implications for the microbial "pan-genome"
    • 10.1073/pnas.0506758102, 1216834, 16172379
    • Tettelin H, et al. Genome analysis of multiple pathogenic isolates of Streptococcus agalactiae: implications for the microbial "pan-genome". Proc Natl Acad Sci USA 2005, 102(39):13950-13955. 10.1073/pnas.0506758102, 1216834, 16172379.
    • (2005) Proc Natl Acad Sci USA , vol.102 , Issue.39 , pp. 13950-13955
    • Tettelin, H.1
  • 4
    • 53849104729
    • The pangenome structure of Escherichia coli: comparative genomic analysis of E. coli commensal and pathogenic isolates
    • 10.1128/JB.00619-08, 2566221, 18676672
    • Rasko DA, et al. The pangenome structure of Escherichia coli: comparative genomic analysis of E. coli commensal and pathogenic isolates. J Bacteriol 2008, 190(20):6881-6893. 10.1128/JB.00619-08, 2566221, 18676672.
    • (2008) J Bacteriol , vol.190 , Issue.20 , pp. 6881-6893
    • Rasko, D.A.1
  • 5
    • 72949101201
    • A phylogeny-driven genomic encyclopaedia of Bacteria and Archaea
    • 10.1038/nature08656, 3073058, 20033048
    • Wu D, et al. A phylogeny-driven genomic encyclopaedia of Bacteria and Archaea. Nature 2009, 462(7276):1056-1060. 10.1038/nature08656, 3073058, 20033048.
    • (2009) Nature , vol.462 , Issue.7276 , pp. 1056-1060
    • Wu, D.1
  • 6
    • 33947235074
    • The Sorcerer II Global Ocean Sampling expedition: expanding the universe of protein families
    • 10.1371/journal.pbio.0050016, 1821046, 17355171
    • Yooseph S, et al. The Sorcerer II Global Ocean Sampling expedition: expanding the universe of protein families. PLoS Biol 2007, 5(3):e16. 10.1371/journal.pbio.0050016, 1821046, 17355171.
    • (2007) PLoS Biol , vol.5 , Issue.3
    • Yooseph, S.1
  • 7
    • 0030660581
    • A genomic perspective on protein families
    • 10.1126/science.278.5338.631, 9381173
    • Tatusov RL, Koonin EV, Lipman DJ. A genomic perspective on protein families. Science 1997, 278(5338):631-637. 10.1126/science.278.5338.631, 9381173.
    • (1997) Science , vol.278 , Issue.5338 , pp. 631-637
    • Tatusov, R.L.1    Koonin, E.V.2    Lipman, D.J.3
  • 8
    • 38549097071
    • The universal protein resource (UniProt)
    • 2238893, 18045787
    • Consortium TU. The universal protein resource (UniProt). Nucleic Acids Res 2008, 36(Database issue):D190-5. 2238893, 18045787.
    • (2008) Nucleic Acids Res , vol.36 , Issue.DATABASE ISSUE
    • Consortium, T.U.1
  • 9
    • 0033982936
    • KEGG: kyoto encyclopedia of genes and genomes
    • 10.1093/nar/28.1.27, 102409, 10592173
    • Kanehisa M, Goto S. KEGG: kyoto encyclopedia of genes and genomes. Nucleic Acids Res 2000, 28(1):27-30. 10.1093/nar/28.1.27, 102409, 10592173.
    • (2000) Nucleic Acids Res , vol.28 , Issue.1 , pp. 27-30
    • Kanehisa, M.1    Goto, S.2
  • 10
    • 58149202167
    • HAMAP: a database of completely sequenced microbial proteome sets and manually curated microbial protein families in UniProtKB/Swiss-Prot
    • 2686602, 18849571
    • Lima T, et al. HAMAP: a database of completely sequenced microbial proteome sets and manually curated microbial protein families in UniProtKB/Swiss-Prot. Nucleic Acids Res 2009, 37(Database issue):D471-D478. 2686602, 18849571.
    • (2009) Nucleic Acids Res , vol.37 , Issue.DATABASE ISSUE
    • Lima, T.1
  • 11
    • 73349120997
    • FIGfams: yet another set of protein families
    • 10.1093/nar/gkp698, 2777423, 19762480
    • Meyer F, Overbeek R, Rodriguez A. FIGfams: yet another set of protein families. Nucleic Acids Res 2009, 37(20):6643-6654. 10.1093/nar/gkp698, 2777423, 19762480.
    • (2009) Nucleic Acids Res , vol.37 , Issue.20 , pp. 6643-6654
    • Meyer, F.1    Overbeek, R.2    Rodriguez, A.3
  • 12
    • 84858077472
    • The Pfam protein families database
    • 3245129, 22127870
    • Punta M, et al. The Pfam protein families database. Nucleic Acids Res 2012, 40(Database issue):D290-D301. 3245129, 22127870.
    • (2012) Nucleic Acids Res , vol.40 , Issue.DATABASE ISSUE
    • Punta, M.1
  • 13
    • 0037249633
    • The TIGRFAMs database of protein families
    • 10.1093/nar/gkg128, 165575, 12520025
    • Haft DH, Selengut JD, White O. The TIGRFAMs database of protein families. Nucleic Acids Res 2003, 31(1):371-373. 10.1093/nar/gkg128, 165575, 12520025.
    • (2003) Nucleic Acids Res , vol.31 , Issue.1 , pp. 371-373
    • Haft, D.H.1    Selengut, J.D.2    White, O.3
  • 14
    • 0037249501
    • PANTHER: a browsable database of gene products organized by biological function, using curated protein family and subfamily classification
    • 10.1093/nar/gkg115, 165562, 12520017
    • Thomas PD, et al. PANTHER: a browsable database of gene products organized by biological function, using curated protein family and subfamily classification. Nucleic Acids Res 2003, 31(1):334-341. 10.1093/nar/gkg115, 165562, 12520017.
    • (2003) Nucleic Acids Res , vol.31 , Issue.1 , pp. 334-341
    • Thomas, P.D.1
  • 15
    • 33750474439
    • PhyloFacts: an online structural phylogenomic encyclopedia for protein functional and structural classification
    • 10.1186/gb-2006-7-9-r83, 1794543, 16973001
    • Krishnamurthy N, et al. PhyloFacts: an online structural phylogenomic encyclopedia for protein functional and structural classification. Genome Biol 2006, 7(9):R83. 10.1186/gb-2006-7-9-r83, 1794543, 16973001.
    • (2006) Genome Biol , vol.7 , Issue.9
    • Krishnamurthy, N.1
  • 16
    • 58149186488
    • The National Center for Biotechnology Information's Protein Clusters Database
    • 2686591, 18940865
    • Klimke W, et al. The National Center for Biotechnology Information's Protein Clusters Database. Nucleic Acids Res 2009, 37(Database issue):D216-D223. 2686591, 18940865.
    • (2009) Nucleic Acids Res , vol.37 , Issue.DATABASE ISSUE
    • Klimke, W.1
  • 17
    • 84861203603
    • eggNOG v3.0: orthologous groups covering 1133 organisms at 41 different taxonomic ranges
    • 3245133, 22096231
    • Powell S, et al. eggNOG v3.0: orthologous groups covering 1133 organisms at 41 different taxonomic ranges. Nucleic Acids Res 2012, 40(Database issue):D284-D289. 3245133, 22096231.
    • (2012) Nucleic Acids Res , vol.40 , Issue.DATABASE ISSUE
    • Powell, S.1
  • 18
    • 33748776559
    • Automated protein function prediction-the genomic challenge
    • 10.1093/bib/bbl004, 16772267
    • Friedberg I. Automated protein function prediction-the genomic challenge. Brief Bioinform 2006, 7(3):225-242. 10.1093/bib/bbl004, 16772267.
    • (2006) Brief Bioinform , vol.7 , Issue.3 , pp. 225-242
    • Friedberg, I.1
  • 19
    • 70349645483
    • Comparative genomic analyses of the human fungal pathogens Coccidioides and their relatives
    • 10.1101/gr.087551.108, 2765278, 19717792
    • Sharpton TJ, et al. Comparative genomic analyses of the human fungal pathogens Coccidioides and their relatives. Genome Res 2009, 19(10):1722-1731. 10.1101/gr.087551.108, 2765278, 19717792.
    • (2009) Genome Res , vol.19 , Issue.10 , pp. 1722-1731
    • Sharpton, T.J.1
  • 20
    • 78149423315
    • Metagenomes from high-temperature chemotrophic systems reveal geochemical controls on microbial community structure and function
    • 10.1371/journal.pone.0009773, 2841643, 20333304
    • Inskeep WP, et al. Metagenomes from high-temperature chemotrophic systems reveal geochemical controls on microbial community structure and function. PLoS One 2010, 5(3):e9773. 10.1371/journal.pone.0009773, 2841643, 20333304.
    • (2010) PLoS One , vol.5 , Issue.3
    • Inskeep, W.P.1
  • 21
    • 0036529479
    • An efficient algorithm for large-scale detection of protein families
    • 10.1093/nar/30.7.1575, 101833, 11917018
    • Enright AJ, Van Dongen S, Ouzounis CA. An efficient algorithm for large-scale detection of protein families. Nucleic Acids Res 2002, 30(7):1575-1584. 10.1093/nar/30.7.1575, 101833, 11917018.
    • (2002) Nucleic Acids Res , vol.30 , Issue.7 , pp. 1575-1584
    • Enright, A.J.1    Van Dongen, S.2    Ouzounis, C.A.3
  • 22
    • 80053194704
    • Multiple sequence alignment: a major challenge to large-scale phylogenetics
    • 2989897, 21113338
    • Liu K, Linder CR, Warnow T. Multiple sequence alignment: a major challenge to large-scale phylogenetics. PLoS Curr 2010, 2:RRN1198. 2989897, 21113338.
    • (2010) PLoS Curr , vol.2
    • Liu, K.1    Linder, C.R.2    Warnow, T.3
  • 23
    • 79952116426
    • InterPro protein classification
    • 10.1007/978-1-60761-977-2_3, 21082426
    • McDowall J, Hunter S. InterPro protein classification. Methods Mol Biol 2011, 694:37-47. 10.1007/978-1-60761-977-2_3, 21082426.
    • (2011) Methods Mol Biol , vol.694 , pp. 37-47
    • McDowall, J.1    Hunter, S.2
  • 24
    • 0033119399
    • Errors in genome annotation
    • 10.1016/S0168-9525(99)01706-0, 10203816
    • Brenner SE. Errors in genome annotation. Trends Genet 1999, 15(4):132-133. 10.1016/S0168-9525(99)01706-0, 10203816.
    • (1999) Trends Genet , vol.15 , Issue.4 , pp. 132-133
    • Brenner, S.E.1
  • 25
    • 78651266825
    • Community cyberinfrastructure for Advanced Microbial Ecology Research and Analysis: the CAMERA resource
    • Sun S, et al. Community cyberinfrastructure for Advanced Microbial Ecology Research and Analysis: the CAMERA resource. Nucleic Acids Res 2011, 39(Database issue):D546-D551.
    • (2011) Nucleic Acids Res , vol.39 , Issue.DATABASE ISSUE
    • Sun, S.1
  • 26
    • 0025183708
    • Basic local alignment search tool
    • Altschul SF, et al. Basic local alignment search tool. J Mol Biol 1990, 215(3):403-410.
    • (1990) J Mol Biol , vol.215 , Issue.3 , pp. 403-410
    • Altschul, S.F.1
  • 27
    • 3042666256
    • MUSCLE: multiple sequence alignment with high accuracy and high throughput
    • 10.1093/nar/gkh340, 390337, 15034147
    • Edgar RC. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res 2004, 32(5):1792-1797. 10.1093/nar/gkh340, 390337, 15034147.
    • (2004) Nucleic Acids Res , vol.32 , Issue.5 , pp. 1792-1797
    • Edgar, R.C.1
  • 28
    • 77950806408
    • New algorithms and methods to estimate maximum-likelihood phylogenies: assessing the performance of PhyML 3.0
    • 10.1093/sysbio/syq010, 20525638
    • Guindon S, et al. New algorithms and methods to estimate maximum-likelihood phylogenies: assessing the performance of PhyML 3.0. Syst Biol 2010, 59(3):307-321. 10.1093/sysbio/syq010, 20525638.
    • (2010) Syst Biol , vol.59 , Issue.3 , pp. 307-321
    • Guindon, S.1
  • 29
    • 80055082271
    • Accelerated Profile HMM Searches
    • 10.1371/journal.pcbi.1002195, 3197634, 22039361
    • Eddy SR. Accelerated Profile HMM Searches. PLoS Comput Biol 2011, 7(10):e1002195. 10.1371/journal.pcbi.1002195, 3197634, 22039361.
    • (2011) PLoS Comput Biol , vol.7 , Issue.10
    • Eddy, S.R.1
  • 30
    • 77949718257
    • FastTree 2-approximately maximum-likelihood trees for large alignments
    • 10.1371/journal.pone.0009490, 2835736, 20224823
    • Price MN, Dehal PS, Arkin AP. FastTree 2-approximately maximum-likelihood trees for large alignments. PLoS One 2010, 5(3):e9490. 10.1371/journal.pone.0009490, 2835736, 20224823.
    • (2010) PLoS One , vol.5 , Issue.3
    • Price, M.N.1    Dehal, P.S.2    Arkin, A.P.3
  • 31
    • 84861898310
    • IMG: the Integrated Microbial Genomes database and comparative analysis system
    • 3245086, 22194640
    • Markowitz VM, et al. IMG: the Integrated Microbial Genomes database and comparative analysis system. Nucleic Acids Res 2012, 40(Database issue):D115-D122. 3245086, 22194640.
    • (2012) Nucleic Acids Res , vol.40 , Issue.DATABASE ISSUE
    • Markowitz, V.M.1
  • 32
    • 43349094507
    • The igraph software package for complex network research
    • Csardi G, Nepusz T. The igraph software package for complex network research. InterJournal 2006, Complex Systems:1695.
    • (2006) InterJournal , vol.Complex Systems , pp. 1695
    • Csardi, G.1    Nepusz, T.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.