Volume 11, 2010

A grammar-based distance metric enables fast and accurate clustering of large sets of 16S sequences

Author keywords

[No Author keywords available]

Indexed keywords

BIOLOGICAL SEQUENCES; CPU EXECUTION TIME; GRAMMAR-BASED DISTANCES; SEQUENCE CLUSTERING; SEQUENCE CLUSTERING ALGORITHMS; STATISTICAL ACCURACY; STATISTICAL CLUSTERING; VALIDATION RESULTS

EID: 78650145196     PISSN: None     EISSN: 14712105     Source Type: Journal    
DOI: 10.1186/1471-2105-11-601     Document Type: Article
Times cited: 36

References (26)
  • 1
    • 0031829372
    • Removing Near-Neighbour Redundancy from Large Protein Sequence Collections
    • 10.1093/bioinformatics/14.5.423, 9682055
    • Holm L, Sander C. Removing Near-Neighbour Redundancy from Large Protein Sequence Collections. Bioinformatics 1998, 14(5):423-429. 10.1093/bioinformatics/14.5.423, 9682055.
    • (1998) Bioinformatics , vol.14 , Issue.5 , pp. 423-429
    • Holm, L.1    Sander, C.2
  • 2
    • 0035072551
    • Clustering of Highly Homologous Sequences to Reduce the Size of Large Protein Databases
    • 10.1093/bioinformatics/17.3.282, 11294794
    • Li W, Jaroszewski L, Godzik A. Clustering of Highly Homologous Sequences to Reduce the Size of Large Protein Databases. Bioinformatics 2001, 17(3):282-283. 10.1093/bioinformatics/17.3.282, 11294794.
    • (2001) Bioinformatics , vol.17 , Issue.3 , pp. 282-283
    • Li, W.1    Jaroszewski, L.2    Godzik, A.3
  • 3
    • 0036169928
    • Tolerating some Redundancy Significantly Speeds up Clustering of Large Protein Databases
    • 10.1093/bioinformatics/18.1.77, 11836214
    • Li W, Jaroszewski L, Godzik A. Tolerating some Redundancy Significantly Speeds up Clustering of Large Protein Databases. Bioinformatics 2002, 18:77-82. 10.1093/bioinformatics/18.1.77, 11836214.
    • (2002) Bioinformatics , vol.18 , pp. 77-82
    • Li, W.1    Jaroszewski, L.2    Godzik, A.3
  • 4
    • 0029565637
    • Improved Tools for DNA Comparison and Clustering
    • Parsons JD. Improved Tools for DNA Comparison and Clustering. Computer Applications in the Biosciences 1995, 11(6):603-613.
    • (1995) Computer Applications in the Biosciences , vol.11 , Issue.6 , pp. 603-613
    • Parsons, J.D.1
  • 5
    • 77952039847
    • Sequence Embedding for Fast Construction of Guide Trees for Multiple Sequence Alignment
    • 2893182, 20470396
    • Blackshields G, Sievers F, Shi W, Wilm A, Higgins DG. Sequence Embedding for Fast Construction of Guide Trees for Multiple Sequence Alignment. Algorithms for Molecular Biology 2010, 5(21). 2893182, 20470396.
    • (2010) Algorithms for Molecular Biology , vol.5 , Issue.21
    • Blackshields, G.1    Sievers, F.2    Shi, W.3    Wilm, A.4    Higgins, D.G.5
  • 6
    • 33745634395
    • Cd-hit: a Fast Program for Clustering and Comparing Large Sets of Protein or Nucleotide Sequences
    • 10.1093/bioinformatics/btl158, 16731699
    • Li W, Godzik A. Cd-hit: a Fast Program for Clustering and Comparing Large Sets of Protein or Nucleotide Sequences. Bioinformatics 2006, 22(13):1658-1659. 10.1093/bioinformatics/btl158, 16731699.
    • (2006) Bioinformatics , vol.22 , Issue.13 , pp. 1658-1659
    • Li, W.1    Godzik, A.2
  • 7
    • 72949091232
    • Bacterial Community Variation in Human Body Habitats Across Space and Time
    • 10.1126/science.1177486, 19892944
    • Costello EK, Lauber CL, Hamady M, Fierer N, Gordon JI, Knight R. Bacterial Community Variation in Human Body Habitats Across Space and Time. Science 2009, 326:1694-1697. 10.1126/science.1177486, 19892944.
    • (2009) Science , vol.326 , pp. 1694-1697
    • Costello, E.K.1    Lauber, C.L.2    Hamady, M.3    Fierer, N.4    Gordon, J.I.5    Knight, R.6
  • 8
    • 77957244650
    • Search and Clustering Orders of Magnitude Faster than BLAST
    • 10.1093/bioinformatics/btq461, 20709691
    • Edgar RC. Search and Clustering Orders of Magnitude Faster than BLAST. Bioinformatics 2010, 26(19):2460-2461. 10.1093/bioinformatics/btq461, 20709691.
    • (2010) Bioinformatics , vol.26 , Issue.19 , pp. 2460-2461
    • Edgar, R.C.1
  • 9
    • 0030801002
    • Gapped BLAST and PSI-BLAST: a New Generation of Protein Database Search Programs
    • 10.1093/nar/25.17.3389, 146917, 9254694
    • Altschul SF, Madden TL, Schaffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ. Gapped BLAST and PSI-BLAST: a New Generation of Protein Database Search Programs. Nucleic Acids Research 1997, 25(17):3389-3402. 10.1093/nar/25.17.3389, 146917, 9254694.
    • (1997) Nucleic Acids Research , vol.25 , Issue.17 , pp. 3389-3402
    • Altschul, S.F.1    Madden, T.L.2    Schaffer, A.A.3    Zhang, J.4    Zhang, Z.5    Miller, W.6    Lipman, D.J.7
  • 11
    • 0000523223
    • Compression and Explanation using Hierarchical Grammars
    • Nevill-Manning CG, Witten IH. Compression and Explanation using Hierarchical Grammars. The Computer Journal 1997, 40(2/3):103-116.
    • (1997) The Computer Journal , vol.40 , Issue.2-3 , pp. 103-116
    • Nevill-Manning, C.G.1    Witten, I.H.2
  • 12
    • 0017493286
    • A Universal Algorithm for Sequential Data Compression
    • Ziv J, Lempel A. A Universal Algorithm for Sequential Data Compression. IEEE Transactions on Information Theory 1977, 23(3):337-343.
    • (1977) IEEE Transactions on Information Theory , vol.23 , Issue.3 , pp. 337-343
    • Ziv, J.1    Lempel, A.2
  • 14
    • 0018019231
    • Compression of Individual Sequences via Variable-Rate Coding
    • Ziv J, Lempel A. Compression of Individual Sequences via Variable-Rate Coding. IEEE Transactions on Information Theory 1978, 24(5):530-536.
    • (1978) IEEE Transactions on Information Theory , vol.24 , Issue.5 , pp. 530-536
    • Ziv, J.1    Lempel, A.2
  • 15
    • 0037185399
    • Language Trees and Zipping
    • 10.1103/PhysRevLett.88.048702, 11801178
    • Benedetto D, Caglioti E, Loreto V. Language Trees and Zipping. Physical Review Letters 2002, 88(4). 10.1103/PhysRevLett.88.048702, 11801178.
    • (2002) Physical Review Letters , vol.88 , Issue.4
    • Benedetto, D.1    Caglioti, E.2    Loreto, V.3
  • 16
    • 0242643741
    • A New Sequence Distance Measure for Phylogenetic Tree Construction
    • 10.1093/bioinformatics/btg295, 14594718
    • Otu HH, Sayood K. A New Sequence Distance Measure for Phylogenetic Tree Construction. Bioinformatics 2003, 19(16):2122-2130. 10.1093/bioinformatics/btg295, 14594718.
    • (2003) Bioinformatics , vol.19 , Issue.16 , pp. 2122-2130
    • Otu, H.H.1    Sayood, K.2
  • 17
    • 47949119484
    • Grammar-Based Distance in Progressive Multiple Sequence Alignment
    • 2478692, 18616828
    • Russell DJ, Otu HH, Sayood K. Grammar-Based Distance in Progressive Multiple Sequence Alignment. BMC Bioinformatics 2008, 9(306). 2478692, 18616828.
    • (2008) BMC Bioinformatics , vol.9 , Issue.306
    • Russell, D.J.1    Otu, H.H.2    Sayood, K.3
  • 19
    • 1842535438
    • Utilization of the Relative Complexity Measure to Construct a Phylogenetic Tree for Fungi
    • 10.1017/S0953756203009079, 15119348
    • Bastola DR, Otu HH, Doukas SE, Sayood K, Hinrichs SH, Iwen PC. Utilization of the Relative Complexity Measure to Construct a Phylogenetic Tree for Fungi. Mycological Research 2004, 108(2):117-125. 10.1017/S0953756203009079, 15119348.
    • (2004) Mycological Research , vol.108 , Issue.2 , pp. 117-125
    • Bastola, D.R.1    Otu, H.H.2    Doukas, S.E.3    Sayood, K.4    Hinrichs, S.H.5    Iwen, P.C.6
  • 21
    • 0016942292
    • A Space-Economical Suffix Tree Construction Algorithm
    • McCreight EM. A Space-Economical Suffix Tree Construction Algorithm. Journal of the ACM 1976, 23(2):262-272.
    • (1976) Journal of the ACM , vol.23 , Issue.2 , pp. 262-272
    • McCreight, E.M.1
  • 22
    • 0001704377
    • On-Line Construction of Suffix Trees
    • Ukkonen E. On-Line Construction of Suffix Trees. Algorithmica 1995, 14(3):249-260.
    • (1995) Algorithmica , vol.14 , Issue.3 , pp. 249-260
    • Ukkonen, E.1
  • 24
    • 0027968068
    • CLUSTAL W: Improving the Sensitivity of Progressive Multiple Sequence Alignment Through Sequence Weighting, Position-Specific Gap Penalties and Weight Matrix Choice
    • 10.1093/nar/22.22.4673, 308517, 7984417
    • Thompson JD, Higgins DG, Gibson TJ. CLUSTAL W: Improving the Sensitivity of Progressive Multiple Sequence Alignment Through Sequence Weighting, Position-Specific Gap Penalties and Weight Matrix Choice. Nucleic Acids Research 1994, 22(22):4673-4680. 10.1093/nar/22.22.4673, 308517, 7984417.
    • (1994) Nucleic Acids Research , vol.22 , Issue.22 , pp. 4673-4680
    • Thompson, J.D.1    Higgins, D.G.2    Gibson, T.J.3
  • 26
    • 71749111168
    • Analysis and Comparison of Very Large Metagenomes with Fast Clustering and Functional Annotation
    • Li W. Analysis and Comparison of Very Large Metagenomes with Fast Clustering and Functional Annotation. BMC Bioinformatics 2009, 10(359).
    • (2009) BMC Bioinformatics , vol.10 , Issue.359
    • Li, W.1


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.