Volume 5, Issue 8, 2013, Pages 1470-1484

Alignment-free genome tree inference by learning group-specific distance metrics

Author keywords

Alignment; Alignment free; Distance metric learning; Genome signature; Genome tree; Sequence comparison; Taxonomy

Indexed keywords

ALGORITHM; BIOLOGY; MICROBIAL GENOME; MOLECULAR EVOLUTION; PHYLOGENY; PROCEDURES; SEQUENCE ALIGNMENT; ALIGNMENT; ALIGNMENT-FREE; ARTICLE; DISTANCE METRIC LEARNING; GENOME SIGNATURE; GENOME TREE; METHODOLOGY; SEQUENCE COMPARISON; TAXONOMY;

EID: 84891649191     PISSN: None     EISSN: 17596653     Source Type: Journal    
DOI: 10.1093/gbe/evt105     Document Type: Article
Times cited: (14)

References (60)
  • 2
    • 0022743812
    • A measure of the similarity of sets of sequences not requiring sequence alignment
    • Blaisdell BE. 1986. A measure of the similarity of sets of sequences not requiring sequence alignment. Proc Natl Acad Sci U S A. 83: 5155-5159.
    • (1986) Proc Natl Acad Sci U S A , vol.83 , pp. 5155-5159
    • Blaisdell, B.E.1
  • 3
    • 0026502711
    • Over- and under-representation of short oligonucleotides in DNA sequences
    • Burge C, Campbell AM, Karlin S. 1992. Over- and under-representation of short oligonucleotides in DNA sequences. Proc Natl Acad Sci U S A. 89: 1358-1362.
    • (1992) Proc Natl Acad Sci U S A , vol.89 , pp. 1358-1362
    • Burge, C.1    Campbell, A.M.2    Karlin, S.3
  • 4
    • 33644700003
    • Toward automatic reconstruction of a highly resolved tree of life
    • Ciccarelli FD, et al. 2006. Toward automatic reconstruction of a highly resolved tree of life. Science 311:1283-1287.
    • (2006) Science , vol.311 , pp. 1283-1287
    • Ciccarelli, F.D.1
  • 6
    • 0346882654
    • Extracting phylogenetic information from whole-genome sequencing projects: The lactic acid bacteria as a test case
    • Coenye T, Vandamme P. 2003. Extracting phylogenetic information from whole-genome sequencing projects: the lactic acid bacteria as a test case. Microbiology 149:3507-3517.
    • (2003) Microbiology , vol.149 , pp. 3507-3517
    • Coenye, T.1    Vandamme, P.2
  • 7
    • 17744394753
    • Phylogenomics and the reconstruction of the tree of life
    • Delsuc F, Brinkmann H, Philippe H. 2005. Phylogenomics and the reconstruction of the tree of life. Nat Rev Genet. 6:361-375.
    • (2005) Nat Rev Genet , vol.6 , pp. 361-375
    • Delsuc, F.1    Brinkmann, H.2    Philippe, H.3
  • 8
    • 33746061683
    • Greengenes, a chimera-checked 16S rRNA gene database and workbench compatible with ARB
    • DeSantis TZ, et al. 2006. Greengenes, a chimera-checked 16S rRNA gene database and workbench compatible with ARB. Appl Environ Microbiol. 72:5069-5072.
    • (2006) Appl Environ Microbiol , vol.72 , pp. 5069-5072
    • Desantis, T.Z.1
  • 9
    • 0032823783
    • Genomic signature: Characterization and classification of species assessed by chaos game representation of sequences
    • Deschavanne PJ, Giron A, Vilain J, Fagot G, Fertil B. 1999. Genomic signature: characterization and classification of species assessed by chaos game representation of sequences. Mol Biol Evol. 16:1391-1399.
    • (1999) Mol Biol Evol , vol.16 , pp. 1391-1399
    • Deschavanne, P.J.1    Giron, A.2    Vilain, J.3    Fagot, G.4    Fertil, B.5
  • 10
    • 0033603539
    • Phylogenetic classification and the universal tree
    • Doolittle WF. 1999. Phylogenetic classification and the universal tree. Science 284:2124-2129.
    • (1999) Science , vol.284 , pp. 2124-2129
    • Doolittle, W.F.1
  • 11
    • 84963036437
    • On the cophenetic correlation coefficient
    • Farris JS. 1969. On the cophenetic correlation coefficient. Syst Zool. 18: 279-285.
    • (1969) Syst Zool , vol.18 , pp. 279-285
    • Farris, J.S.1
  • 12
    • 34648815169
    • Prokaryote phylogeny meets taxonomy: An exhaustive comparison of composition vector trees with systematic bacteriology
    • Gao L, Qi J, Sun J, Hao B. 2007. Prokaryote phylogeny meets taxonomy: an exhaustive comparison of composition vector trees with systematic bacteriology. Sci China C Life Sci. 50:587-599.
    • (2007) Sci China C Life Sci , vol.50 , pp. 587-599
    • Gao, L.1    Qi, J.2    Sun, J.3    Hao, B.4
  • 14
    • 0042879997
    • Reducing the time complexity of the derandomized evolution strategy with covariance matrix adaptation (CMA-ES)
    • Hansen N, Muller SD, Koumoutsakos P. 2003. Reducing the time complexity of the derandomized evolution strategy with covariance matrix adaptation (CMA-ES). Evol Comput. 11:1-18.
    • (2003) Evol Comput , vol.11 , pp. 1-18
    • Hansen, N.1    Muller, S.D.2    Koumoutsakos, P.3
  • 15
    • 84960450653
    • Prokaryote phylogeny without sequence alignment: From avoidance signature to composition distance
    • Hao B, Qi J. 2003. Prokaryote phylogeny without sequence alignment: from avoidance signature to composition distance. Proc 2003 IEEE Bioinformatics Conf. 2:375-384.
    • (2003) Proc 2003 IEEE Bioinformatics Conf , vol.2 , pp. 375-384
    • Hao, B.1    Qi, J.2
  • 16
    • 0027463557
    • Ribosomal RNA trees misleading
    • Hasegawa M, Hashimoto T. 1993. Ribosomal RNA trees misleading. Nature 361:23.
    • (1993) Nature , vol.361 , pp. 23
    • Hasegawa, M.1    Hashimoto, T.2
  • 18
    • 45749132147
    • Habitat-Lite: A GSC case study based on free text terms for environmental metadata
    • Hirschman L, et al. 2008. Habitat-Lite: A GSC case study based on free text terms for environmental metadata. OMICS 12:129-136.
    • (2008) OMICS , vol.12 , pp. 129-136
    • Hirschman, L.1
  • 19
    • 34447260522
    • Is multiple-sequence alignment required for accurate inference of phylogeny?
    • Höhl M, Ragan MA. 2007. Is multiple-sequence alignment required for accurate inference of phylogeny? Syst Biol. 56:206-221.
    • (2007) Syst Biol , vol.56 , pp. 206-221
    • Höhl, M.1    Ragan, M.A.2
  • 20
    • 57149118661
    • Pattern-based phylogenetic distance estimation and tree reconstruction
    • Höhl M, Rigoutsos I, Ragan MA. 2006. Pattern-based phylogenetic distance estimation and tree reconstruction. Evol Bioinform Online. 2: 359-375.
    • (2006) Evol Bioinform Online , vol.2 , pp. 359-375
    • Höhl, M.1    Rigoutsos, I.2    Ragan, M.A.3
  • 21
  • 22
    • 4344647970
    • Pervasive properties of the genomic signature
    • Jernigan RW, Baran RH. 2002. Pervasive properties of the genomic signature. BMC Genomics 3:23.
    • (2002) BMC Genomics , vol.3 , pp. 23
    • Jernigan, R.W.1    Baran, R.H.2
  • 23
    • 0014129195
    • Hierarchical clustering schemes
    • Johnson SC. 1967. Hierarchical clustering schemes. Psychometrika 32: 241-254.
    • (1967) Psychometrika , vol.32 , pp. 241-254
    • Johnson, S.C.1
  • 24
    • 73049131457
    • Enzymatic synthesis of deoxyribonucleic acid. VIII. Frequencies of nearest neighbor base sequences in deoxyribonucleic acid
    • Josse J, Kaiser AD, Kornberg A. 1961. Enzymatic synthesis of deoxyribonucleic acid. VIII. Frequencies of nearest neighbor base sequences in deoxyribonucleic acid. J Biol Chem. 236:864-875.
    • (1961) J Biol Chem , vol.236 , pp. 864-875
    • Josse, J.1    Kaiser, A.D.2    Kornberg, A.3
  • 25
    • 0029060923
    • Dinucleotide relative abundance extremes: A genomic signature
    • Karlin S, Burge C. 1995. Dinucleotide relative abundance extremes: a genomic signature. Trends Genet. 11:283-290.
    • (1995) Trends Genet , vol.11 , pp. 283-290
    • Karlin, S.1    Burge, C.2
  • 26
    • 0028028028
    • Computational DNA-sequence analysis
    • Karlin S, Cardon LR. 1994. Computational DNA-sequence analysis. Annu Rev Microbiol. 48:619-654.
    • (1994) Annu Rev Microbiol , vol.48 , pp. 619-654
    • Karlin, S.1    Cardon, L.R.2
  • 27
    • 1842332701
    • Compositional biases of bacterial genomes and evolutionary implications
    • Karlin S, Mrazek J, Campbell AM. 1997. Compositional biases of bacterial genomes and evolutionary implications. J Bacteriol. 179: 3899-3913.
    • (1997) J Bacteriol , vol.179 , pp. 3899-3913
    • Karlin, S.1    Mrazek, J.2    Campbell, A.M.3
  • 28
    • 0037105996
    • Compositional spectrum - revealing patterns for genomic sequence characterization and comparison
    • Kirzhner V, Korol A, Bolshoy A, Nevo E. 2002. Compositional spectrum - revealing patterns for genomic sequence characterization and comparison. Physica A. 312:447-457.
    • (2002) Physica A , vol.312 , pp. 447-457
    • Kirzhner, V.1    Korol, A.2    Bolshoy, A.3    Nevo, E.4
  • 29
    • 34247844518
    • Different clustering of genomes across life using the A-T-C-G and degenerate R-Y alphabets: Early and late signaling on genome evolution?
    • Kirzhner V, Paz A, Volkovich Z, Nevo E, Korol A. 2007. Different clustering of genomes across life using the A-T-C-G and degenerate R-Y alphabets: early and late signaling on genome evolution? J Mol Evol. 64: 448-456.
    • (2007) J Mol Evol , vol.64 , pp. 448-456
    • Kirzhner, V.1    Paz, A.2    Volkovich, Z.3    Nevo, E.4    Korol, A.5
  • 30
    • 35648972178
    • Cophenetic correlation analysis as a strategy to select phylogenetically informative proteins: An example from the fungal kingdom
    • Kuramae EE, Robert V, Echavarri-Erasun C, Boekhout T. 2007. Cophenetic correlation analysis as a strategy to select phylogenetically informative proteins: an example from the fungal kingdom. BMC Evol Biol. 7:134.
    • (2007) BMC Evol Biol , vol.7 , pp. 134
    • Kuramae, E.E.1    Robert, V.2    Echavarri-Erasun, C.3    Boekhout, T.4
  • 31
    • 11944272751
    • Statistical significance of the matrix correlation coefficient for comparing independent phylogenetic trees
    • Lapointe FJ, Legendre P. 1992. Statistical significance of the matrix correlation coefficient for comparing independent phylogenetic trees. Syst Biol. 41:378-384.
    • (1992) Syst Biol , vol.41 , pp. 378-384
    • Lapointe, F.J.1    Legendre, P.2
  • 32
    • 77956010453
    • Composition vector approach to whole-genome-based prokaryotic phylogeny: Success and foundations
    • Li Q, Xu Z, Hao B. 2010. Composition vector approach to whole-genome-based prokaryotic phylogeny: success and foundations. J Biotechnol. 149:115-119.
    • (2010) J Biotechnol , vol.149 , pp. 115-119
    • Li, Q.1    Xu, Z.2    Hao, B.3
  • 33
    • 35748959318
    • What's in the mix: Phylogenetic classification of metagenome sequence samples
    • McHardy AC, Rigoutsos I. 2007. What's in the mix: phylogenetic classification of metagenome sequence samples. Curr Opin Microbiol. 10: 499-503.
    • (2007) Curr Opin Microbiol , vol.10 , pp. 499-503
    • McHardy, A.C.1    Rigoutsos, I.2
  • 34
    • 65349189672
    • Phylogenetic signals in DNA composition: Limitations and prospects
    • Mrazek J. 2009. Phylogenetic signals in DNA composition: limitations and prospects. Mol Biol Evol. 26:1163-1169.
    • (2009) Mol Biol Evol , vol.26 , pp. 1163-1169
    • Mrazek, J.1
  • 35
    • 79957832529
    • A sub-cubic time algorithm for computing the quartet distance between two general trees
    • Nielsen J, Kristensen AK, Mailund T, Pedersen CNS. 2011. A sub-cubic time algorithm for computing the quartet distance between two general trees. Algorithms Mol Biol. 6:15.
    • (2011) Algorithms Mol Biol , vol.6 , pp. 15
    • Nielsen, J.1    Kristensen, A.K.2    Mailund, T.3    Pedersen, C.N.S.4
  • 36
    • 79952124952
    • Taxonomic metagenome sequence assignment with structured output models
    • Patil KR, et al. 2011. Taxonomic metagenome sequence assignment with structured output models. Nat Methods. 8:191-192.
    • (2011) Nat Methods , vol.8 , pp. 191-192
    • Patil, K.R.1
  • 37
    • 0035177259
    • Similarity of phylogenetic trees as indicator of protein-protein interaction
    • Pazos F, Valencia A. 2001. Similarity of phylogenetic trees as indicator of protein-protein interaction. Protein Eng. 14:609-614.
    • (2001) Protein Eng , vol.14 , pp. 609-614
    • Pazos, F.1    Valencia, A.2
  • 38
    • 0037315735
    • Evolutionary implications of microbial genome tetranucleotide frequency biases
    • Pride DT, Meinersmann RJ, Wassenaar TM, Blaser MJ. 2003. Evolutionary implications of microbial genome tetranucleotide frequency biases. Genome Res. 13:145-158.
    • (2003) Genome Res , vol.13 , pp. 145-158
    • Pride, D.T.1    Meinersmann, R.J.2    Wassenaar, T.M.3    Blaser, M.J.4
  • 39
    • 1242335920
    • Whole proteome prokaryote phylogeny without sequence alignment: A K-string composition approach
    • Qi J, Wang B, Hao B. 2004. Whole proteome prokaryote phylogeny without sequence alignment: a K-string composition approach. J Mol Evol. 58:1-11.
    • (2004) J Mol Evol , vol.58 , pp. 1-11
    • Qi, J.1    Wang, B.2    Hao, B.3
  • 40
    • 75149164526
    • Alignment-free sequence comparison (I): Statistics and power
    • Reinert G, Chew D, Sun F, Waterman MS. 2009. Alignment-free sequence comparison (I): statistics and power. J Comput Biol. 16:1615-1634.
    • (2009) J Comput Biol , vol.16 , pp. 1615-1634
    • Reinert, G.1    Chew, D.2    Sun, F.3    Waterman, M.S.4
  • 41
    • 0034887748
    • Capturing whole-genome characteristics in short sequences using a naive Bayesian classifier
    • Sandberg R, et al. 2001. Capturing whole-genome characteristics in short sequences using a naive Bayesian classifier. Genome Res. 11: 1404-1409.
    • (2001) Genome Res , vol.11 , pp. 1404-1409
    • Sandberg, R.1
  • 42
    • 58149200954
    • Database resources of the National Center for Biotechnology Information
    • Sayers EW, et al. 2009. Database resources of the National Center for Biotechnology Information. Nucleic Acids Res. 37:D5-15.
    • (2009) Nucleic Acids Res , vol.37 , pp. D5-15
    • Sayers, E.W.1
  • 43
    • 72949107142
    • Introducing mothur: Open-source, platform-independent, community-supported software for describing and comparing microbial communities
    • Schloss PD, et al. 2009. Introducing mothur: open-source, platform-independent, community-supported software for describing and comparing microbial communities. Appl Environ Microbiol. 75:7537-7541.
    • (2009) Appl Environ Microbiol , vol.75 , pp. 7537-7541
    • Schloss, P.D.1
  • 44
    • 62449130191
    • Alignment-free genome comparison with feature frequency profiles (FFP) and optimal resolutions
    • Sims GE, Jun SR, Wu GA, Kim SH. 2009. Alignment-free genome comparison with feature frequency profiles (FFP) and optimal resolutions. Proc Natl Acad Sci U S A. 106:2677-2682.
    • (2009) Proc Natl Acad Sci U S A , vol.106 , pp. 2677-2682
    • Sims, G.E.1    Jun, S.R.2    Wu, G.A.3    Kim, S.H.4
  • 45
    • 27144493223
    • Genome trees and the nature of genome evolution
    • Snel B, Huynen MA, Dutilh BE. 2005. Genome trees and the nature of genome evolution. Annu Rev Microbiol. 59:191-209.
    • (2005) Annu Rev Microbiol , vol.59 , pp. 191-209
    • Snel, B.1    Huynen, M.A.2    Dutilh, B.E.3
  • 46
    • 0002471149
    • The comparison of dendrograms by objective methods
    • Sokal R, Rohlf J. 1962. The comparison of dendrograms by objective methods. Taxon 11:33-40.
    • (1962) Taxon , vol.11 , pp. 33-40
    • Sokal, R.1    Rohlf, J.2
  • 47
    • 0000957052
    • Tests for comparing elements of a correlation matrix
    • Steiger JH. 1980. Tests for comparing elements of a correlation matrix. Psychol Bull. 87:245-251.
    • (1980) Psychol Bull , vol.87 , pp. 245-251
    • Steiger, J.H.1
  • 48
    • 78049418891
    • Predicting plasmid promiscuity based on genomic signature
    • Suzuki H, Yano H, Brown CJ, Top EM. 2010. Predicting plasmid promiscuity based on genomic signature. J Bacteriol. 192:6045-6055.
    • (2010) J Bacteriol , vol.192 , pp. 6045-6055
    • Suzuki, H.1    Yano, H.2    Brown, C.J.3    Top, E.M.4
  • 50
    • 67349084049
    • Estimation of bacterial species phylogeny through oligonucleotide frequency distances
    • Takahashi M, Kryukov K, Saitou N. 2009. Estimation of bacterial species phylogeny through oligonucleotide frequency distances. Genomics 93:525-533.
    • (2009) Genomics , vol.93 , pp. 525-533
    • Takahashi, M.1    Kryukov, K.2    Saitou, N.3
  • 51
    • 0029942146
    • Polyphasic taxonomy, a consensus approach to bacterial systematics
    • Vandamme P, et al. 1996. Polyphasic taxonomy, a consensus approach to bacterial systematics. Microbiol Rev. 60:407-438.
    • (1996) Microbiol Rev , vol.60 , pp. 407-438
    • Vandamme, P.1
  • 52
    • 0037342499
    • Alignment-free sequence comparison-A review
    • Vinga S, Almeida J. 2003. Alignment-free sequence comparison-a review. Bioinformatics 19:513-523.
    • (2003) Bioinformatics , vol.19 , pp. 513-523
    • Vinga, S.1    Almeida, J.2
  • 54
    • 0001271789
    • Phylogenetic structure of the prokaryotic domain: The primary kingdoms
    • Woese CR, Fox GE. 1977. Phylogenetic structure of the prokaryotic domain: the primary kingdoms. Proc Natl Acad Sci U S A. 74:5088-5090.
    • (1977) Proc Natl Acad Sci U S A , vol.74 , pp. 5088-5090
    • Woese, C.R.1    Fox, G.E.2
  • 55
    • 55649110049
    • A simple, fast, and accurate method of phylogenomic inference
    • Wu M, Eisen JA. 2008. A simple, fast, and accurate method of phylogenomic inference. Genome Biol. 9:R151.
    • (2008) Genome Biol , vol.9 , pp. R151
    • Wu, M.1    Eisen, J.A.2
  • 56
    • 0031437248
    • A measure of DNA sequence dissimilarity based on Mahalanobis distance between frequencies of words
    • Wu T-J, Burke JP, Davison DB. 1997. A measure of DNA sequence dissimilarity based on Mahalanobis distance between frequencies of words. Biometrics 53:1431-1439.
    • (1997) Biometrics , vol.53 , pp. 1431-1439
    • Wu, T.-J.1    Burke, J.P.2    Davison, D.B.3
  • 57
    • 27944434972
    • Optimal word sizes for dissimilarity measures and estimation of the degree of dissimilarity between DNA sequences
    • Wu TJ, Huang YH, Li LA. 2005. Optimal word sizes for dissimilarity measures and estimation of the degree of dissimilarity between DNA sequences. Bioinformatics 21:4125-4132.
    • (2005) Bioinformatics , vol.21 , pp. 4125-4132
    • Wu, T.J.1    Huang, Y.H.2    Li, L.A.3
  • 58
    • 85133386144
    • Distance metric learning, with application to clustering with side-information
    • Xing E, Ng A, Jordan M, Russell S. 2002. Distance metric learning, with application to clustering with side-information. Adv Neural Info Process Syst. 15:505-512.
    • (2002) Adv Neural Info Process Syst , vol.15 , pp. 505-512
    • Xing, E.1    Ng, A.2    Jordan, M.3    Russell, S.4
  • 59
    • 67849101838
    • CVTree update: A newly designed phylogenetic study platform using composition vectors and whole genomes
    • Xu Z, Hao B. 2009. CVTree update: a newly designed phylogenetic study platform using composition vectors and whole genomes. Nucleic Acids Res. 37:W174-W178.
    • (2009) Nucleic Acids Res , vol.37 , pp. W174-W178
    • Xu, Z.1    Hao, B.2
  • 60
    • 41149132297
    • Performance comparison between k-tuple distance and four model-based distances in phylogenetic tree reconstruction
    • Yang K, Zhang LQ. 2008. Performance comparison between k-tuple distance and four model-based distances in phylogenetic tree reconstruction. Nucleic Acids Res. 36:e33.
    • (2008) Nucleic Acids Res , vol.36 , pp. e33
    • Yang, K.1    Zhang, L.Q.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.