



Volume 30, Issue 14, 2014, Pages 1991-1999

Fast alignment-free sequence comparison using spaced-word frequencies

Author keywords

[No Author keywords available]

Indexed keywords

ALGORITHM; ANIMAL; ARTICLE; DNA SEQUENCE; GENOMICS; METHODOLOGY; MITOCHONDRIAL GENOME; PHYLOGENY; PLANT GENOME; PRIMATE; SEQUENCE ALIGNMENT; SEQUENCE ANALYSIS;

EID: 84904013371     PISSN: 13674803     EISSN: 14602059     Source Type: Journal    
DOI: 10.1093/bioinformatics/btu177     Document Type: Article
Times cited : (113)

References (52)
  • 1
    • 0025183708
    • Basic local alignment search tool
    • Altschul, S.F. et al. (1990) Basic local alignment search tool. J. Mol. Biol., 215, 403-410.
    • (1990) J. Mol. Biol. , vol.215 , pp. 403-410
    • Altschul, S.F.1
  • 2
    • 77952039847
    • Sequence embedding for fast construction of guide trees for multiple sequence alignment
    • Blackshields, G. et al. (2010) Sequence embedding for fast construction of guide trees for multiple sequence alignment. Algorithms Mol. Biol., 5, 21.
    • (2010) Algorithms Mol. Biol. , vol.5 , pp. 21
    • Blackshields, G.1
  • 4
    • 68249123831
    • A survey of seeding for sequence alignment
    • Mǎndoiu, I. and Zelikovsky, A. (eds) Wiley-Interscience, New York
    • Brown, D.G. (2008) A survey of seeding for sequence alignment. In: Mǎndoiu, I. and Zelikovsky, A. (eds) Bioinformatics Algorithms: Techniques and Applications. Wiley-Interscience, New York, pp. 126-152.
    • (2008) Bioinformatics Algorithms: Techniques and Applications , pp. 126-152
    • Brown, D.G.1
  • 5
    • 75349103706
    • Genomic DNA k-mer spectra: Models and modalities
    • Chor, B. et al. (2009) Genomic DNA k-mer spectra: models and modalities. Genome Biol., 10, R108.
    • (2009) Genome Biol. , vol.10
    • Chor, B.1
  • 6
    • 0031177375
    • Recursive hashing functions for n-grams
    • Cohen, J.D. (1997) Recursive hashing functions for n-grams. ACM Trans. Inf. Syst., 15, 291-320.
    • (1997) ACM Trans. Inf. Syst. , vol.15 , pp. 291-320
    • Cohen, J.D.1
  • 7
    • 84870417740
    • Alignment-free phylogeny of whole genomes using underlying subwords
    • Comin, M. and Verzotto, D. (2012) Alignment-free phylogeny of whole genomes using underlying subwords. Algorithms Mol. Biol., 7, 34.
    • (2012) Algorithms Mol. Biol. , vol.7 , pp. 34
    • Comin, M.1    Verzotto, D.2
  • 8
    • 77955082612
    • MS4-multi-scale selector of sequence signatures: An alignment-free method for classification of biological sequences
    • Corel, E. et al. (2010) MS4-multi-scale selector of sequence signatures: an alignment-free method for classification of biological sequences. BMC Bioinformatics, 11, 406.
    • (2010) BMC Bioinformatics , vol.11 , pp. 406
    • Corel, E.1
  • 9
    • 77956193448
    • Progressive Mauve: Multiple genome alignment with gene gain, loss and rearrangement
    • Darling, A.E. et al. (2010) progressive Mauve: multiple genome alignment with gene gain, loss and rearrangement. PLoS One, 5, e11147.
    • (2010) PLoS One , vol.5
    • Darling, A.E.1
  • 10
    • 0000228203
    • A model of evolutionary change in proteins
    • Dayhoff, M. et al. (1978) A model of evolutionary change in proteins. Atlas Protein Seq. Struct., 6, 345-362.
    • (1978) Atlas Protein Seq. Struct. , vol.6 , pp. 345-362
    • Dayhoff, M.1
  • 12
    • 33846639673
    • Comparing sequences without using alignments: Application to HIV/SIV subtyping
    • Didier, G. et al. (2007) Comparing sequences without using alignments: application to HIV/SIV subtyping. BMC Bioinformatics, 8, 1.
    • (2007) BMC Bioinformatics , vol.8 , pp. 1
    • Didier, G.1
  • 13
    • 84867846845
    • Variable length local decoding and alignment-free sequence comparison
    • Didier, G. et al. (2012) Variable length local decoding and alignment-free sequence comparison. Theor. Comput. Sci., 462, 1-11.
    • (2012) Theor. Comput. Sci. , vol.462 , pp. 1-11
    • Didier, G.1
  • 14
    • 68849131258
    • HaMStR: Profile hidden markov model based search for orthologs in ESTs
    • Ebersberger, I. et al. (2009) HaMStR: profile hidden markov model based search for orthologs in ESTs. BMC Evol. Biol., 9, 157.
    • (2009) BMC Evol. Biol. , vol.9 , pp. 157
    • Ebersberger, I.1
  • 15
    • 3042666256
    • MUSCLE: Multiple sequence alignment with high score accuracy and high throughput
    • Edgar, R.C. (2004) MUSCLE: Multiple sequence alignment with high score accuracy and high throughput. Nucleic Acids Res., 32, 1792-1797.
    • (2004) Nucleic Acids Res. , vol.32 , pp. 1792-1797
    • Edgar, R.C.1
  • 16
    • 0019797407
    • Evolutionary trees from DNA sequences: A maximum likelihood approach
    • Felsenstein, J. (1981) Evolutionary trees from DNA sequences: a maximum likelihood approach. J. Mol. Evol., 17, 368-376.
    • (1981) J. Mol. Evol. , vol.17 , pp. 368-376
    • Felsenstein, J.1
  • 17
    • 0000122573
    • PHYLIP-phylogeny inference package (Version 3.2)
    • Felsenstein, J. (1989) PHYLIP-Phylogeny Inference Package (Version 3.2). Cladistics, 5, 164-166.
    • (1989) Cladistics , vol.5 , pp. 164-166
    • Felsenstein, J.1
  • 19
    • 84857867828
    • Estimation of pairwise sequence similarity of mammalian enhancers with word neighbourhood counts
    • Göke, J. et al. (2012) Estimation of pairwise sequence similarity of mammalian enhancers with word neighbourhood counts. Bioinformatics, 28, 656-663.
    • (2012) Bioinformatics , vol.28 , pp. 656-663
    • Göke, J.1
  • 20
    • 84876551978
    • A phylogenetic analysis of the brassicales clade based on an alignment-free sequence comparison method
    • Hatje, K. and Kollmar, M. (2012) A phylogenetic analysis of the brassicales clade based on an alignment-free sequence comparison method. Front. Plant Sci., 3, 192.
    • (2012) Front. Plant Sci. , vol.3 , pp. 192
    • Hatje, K.1    Kollmar, M.2
  • 21
    • 25444513996
    • Genome comparison without alignment using shortest unique substrings
    • Haubold, B. et al. (2005) Genome comparison without alignment using shortest unique substrings. BMC Bioinformatics, 6, 123.
    • (2005) BMC Bioinformatics , vol.6 , pp. 123
    • Haubold, B.1
  • 22
    • 70349789710
    • Estimating mutation distances from unaligned genomes
    • Haubold, B. et al. (2009) Estimating mutation distances from unaligned genomes. J. Comput. Biol., 16, 1487-1500.
    • (2009) J. Comput. Biol. , vol.16 , pp. 1487-1500
    • Haubold, B.1
  • 23
    • 84883410459
    • kClust: Fast and sensitive clustering of large protein sequence databases
    • Hauser, M. et al. (2013) kClust: fast and sensitive clustering of large protein sequence databases. BMC Bioinformatics, 14, 248.
    • (2013) BMC Bioinformatics , vol.14 , pp. 248
    • Hauser, M.1
  • 24
    • 57149118661
    • Pattern-based phylogenetic distance estimation and tree reconstruction
    • Höhl, M. et al. (2006) Pattern-based phylogenetic distance estimation and tree reconstruction. Evol. Bioinform. Online, 2, 359-375.
    • (2006) Evol. Bioinform. Online , vol.2 , pp. 359-375
    • Höhl, M.1
  • 25
    • 84903956591
    • Spaced words and kmacs: Fast alignment-free sequence comparison based on inexact word matches
    • doi: 10.1093/nar/gku398
    • Horwege, S. et al. (2014) Spaced words and kmacs: fast alignment-free sequence comparison based on inexact word matches. Nucleic Acids Res., doi: 10.1093/nar/gku398.
    • (2014) Nucleic Acids Res.
    • Horwege, S.1
  • 26
    • 0022030599
    • Efficient randomized pattern-matching algorithms
    • Karp, R.M. and Rabin, M.O. (1987) Efficient randomized pattern-matching algorithms. IBM J. Res. Dev., 31, 249-260.
    • (1987) IBM J. Res. Dev. , vol.31 , pp. 249-260
    • Karp, R.M.1    Rabin, M.O.2
  • 27
    • 0037100671
    • MAFFT: A novel method for rapid multiple sequence alignment based on fast fourier transform
    • Katoh, K. et al. (2002) MAFFT: a novel method for rapid multiple sequence alignment based on fast fourier transform. Nucleic Acids Res., 30, 3059-3066.
    • (2002) Nucleic Acids Res. , vol.30 , pp. 3059-3066
    • Katoh, K.1
  • 28
    • 1642632854
    • On spaced seeds for similarity search
    • Keich, U. et al. (2004) On spaced seeds for similarity search. Discrete Appl. Math., 138, 253-263.
    • (2004) Discrete Appl. Math. , vol.138 , pp. 253-263
    • Keich, U.1
  • 29
    • 84866159705
    • Alignment-free distance measure based on return time distribution for sequence analysis: Applications to clustering, molecular phylogeny and subtyping
    • Kolekar, P. et al. (2012) Alignment-free distance measure based on return time distribution for sequence analysis: applications to clustering, molecular phylogeny and subtyping. Mol. Phylogenet. Evol., 65, 510-522.
    • (2012) Mol. Phylogenet. Evol. , vol.65 , pp. 510-522
    • Kolekar, P.1
  • 30
    • 0348015658
    • The Kullback-Leibler distance
    • Kullback, S. (1987) The Kullback-Leibler distance. Am. Stat., 41, 340-341.
    • (1987) Am. Stat. , vol.41 , pp. 340-341
    • Kullback, S.1
  • 31
    • 84903977588
    • kmacs: The k-mismatch average common substring approach to alignment-free sequence comparison
    • Leimeister, C.-A. and Morgenstern, B. (2014) kmacs: the k-mismatch average common substring approach to alignment-free sequence comparison. Bioinformatics, 30, 2000-2008.
    • (2014) Bioinformatics , vol.30 , pp. 2000-2008
    • Leimeister, C.-A.1    Morgenstern, B.2
  • 32
    • 14944363170
    • PatternHunter II: Highly sensitive and fast homology search
    • Li, M. et al. (2003) PatternHunter II: highly sensitive and fast homology search. Genome Inform., 14, 164-175.
    • (2003) Genome Inform. , vol.14 , pp. 164-175
    • Li, M.1
  • 33
    • 0025952277
    • Divergence measures based on the shannon entropy
    • Lin, J. (1991) Divergence measures based on the shannon entropy. IEEE Trans. Inf. Theory, 37, 145-151.
    • (1991) IEEE Trans. Inf. Theory , vol.37 , pp. 145-151
    • Lin, J.1
  • 34
    • 33748682730
    • Remote homology detection based on oligomer distances
    • Lingner, T. and Meinicke, P. (2006) Remote homology detection based on oligomer distances. Bioinformatics, 22, 2224-2231.
    • (2006) Bioinformatics , vol.22 , pp. 2224-2231
    • Lingner, T.1    Meinicke, P.2
  • 35
    • 0036202921
    • PatternHunter: Faster and more sensitive homology search
    • Ma, B. et al. (2002) PatternHunter: faster and more sensitive homology search. Bioinformatics, 18, 440-445.
    • (2002) Bioinformatics , vol.18 , pp. 440-445
    • Ma, B.1
  • 36
    • 31244432403
    • A simple and space-efficient fragment-chaining algorithm for alignment of DNA and protein sequences
    • Morgenstern, B. (2002) A simple and space-efficient fragment-chaining algorithm for alignment of DNA and protein sequences. Appl. Math. Lett., 15, 11-16.
    • (2002) Appl. Math. Lett. , vol.15 , pp. 11-16
    • Morgenstern, B.1
  • 37
    • 0014757386
    • A general method applicable to the search for similarities in the amino acid sequence of two proteins
    • Needleman, S.B. and Wunsch, C.D. (1970) A general method applicable to the search for similarities in the amino acid sequence of two proteins. J. Mol. Biol., 48, 443-453.
    • (1970) J. Mol. Biol. , vol.48 , pp. 443-453
    • Needleman, S.B.1    Wunsch, C.D.2
  • 39
    • 0019424782
    • Comparison of phylogenetic trees
    • Robinson, D. and Foulds, L. (1981) Comparison of phylogenetic trees. Math. Biosci., 53, 131-147.
    • (1981) Math. Biosci. , vol.53 , pp. 131-147
    • Robinson, D.1    Foulds, L.2
  • 40
    • 0023375195
    • The neighbor-joining method: A new method for reconstructing phylogenetic trees
    • Saitou, N. and Nei, M. (1987) The neighbor-joining method: a new method for reconstructing phylogenetic trees. Mol. Biol. Evol., 4, 406-425.
    • (1987) Mol. Biol. Evol. , vol.4 , pp. 406-425
    • Saitou, N.1    Nei, M.2
  • 41
    • 70350442763
    • Orthoselect: A protocol for selecting orthologous groups in phylogenomics
    • Schreiber, F. et al. (2009) Orthoselect: a protocol for selecting orthologous groups in phylogenomics. BMC Bioinformatics, 10, 219.
    • (2009) BMC Bioinformatics , vol.10 , pp. 219
    • Schreiber, F.1
  • 42
    • 80054078476
    • Fast, scalable generation of high-quality protein multiple sequence alignments using Clustal Omega
    • Sievers, F. et al. (2011) Fast, scalable generation of high-quality protein multiple sequence alignments using Clustal Omega. Mol. Syst. Biol., 7, 539.
    • (2011) Mol. Syst. Biol. , vol.7 , pp. 539
    • Sievers, F.1
  • 43
    • 62449130191
    • Alignment-free genome comparison with feature frequency profiles (FFP) and optimal resolutions
    • Sims, G.E. et al. (2009) Alignment-free genome comparison with feature frequency profiles (FFP) and optimal resolutions. Proc. Natl Acad. Sci. USA, 106, 2677-2682.
    • (2009) Proc. Natl Acad. Sci. USA , vol.106 , pp. 2677-2682
    • Sims, G.E.1
  • 44
    • 0000825481
    • A statistical method for evaluating systematic relationships
    • Sokal, R.R. and Michener, C.D. (1958) A Statistical Method for Evaluating Systematic Relationships. University of Kansas Science Bulletin, 38, 1409-1438.
    • (1958) University of Kansas Science Bulletin , vol.38 , pp. 1409-1438
    • Sokal, R.R.1    Michener, C.D.2
  • 45
    • 84873598282
    • Alignment-free sequence comparison based on next generation sequencing reads
    • Song, K. et al. (2013) Alignment-free sequence comparison based on next generation sequencing reads. J. Comput. Biol., 20, 64-79.
    • (2013) J. Comput. Biol. , vol.20 , pp. 64-79
    • Song, K.1
  • 46
    • 0031857684
    • Rose: Generating sequence families
    • Stoye, J. et al. (1998) Rose: generating sequence families. Bioinformatics, 14, 157-163.
    • (1998) Bioinformatics , vol.14 , pp. 157-163
    • Stoye, J.1
  • 47
    • 0027968068
    • CLUSTAL W: Improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice
    • Thompson, J.D. et al. (1994) CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res., 22, 4673-4680.
    • (1994) Nucleic Acids Res. , vol.22 , pp. 4673-4680
    • Thompson, J.D.1
  • 48
    • 24644457706
    • BAliBASE 3.0: Latest developments of the multiple sequence alignment benchmark
    • Thompson, J.D. et al. (2005) BAliBASE 3.0: latest developments of the multiple sequence alignment benchmark. Proteins, 61, 127-136.
    • (2005) Proteins , vol.61 , pp. 127-136
    • Thompson, J.D.1
  • 49
    • 33646005790
    • The average common substring approach to phylogenomic reconstruction
    • Ulitsky, I. et al. (2006) The average common substring approach to phylogenomic reconstruction. J. Comput. Biol., 13, 336-350.
    • (2006) J. Comput. Biol. , vol.13 , pp. 336-350
    • Ulitsky, I.1
  • 50
    • 84899981214
    • Hashing concepts and the java programming language
    • University of Auckland
    • Uzgalis, R. (1996) Hashing concepts and the java programming language. In: Technical report. University of Auckland.
    • (1996) Technical Report
    • Uzgalis, R.1
  • 51
    • 0037342499
    • Alignment-free sequence comparison-A review
    • Vinga, S. and Almeida, J. (2003) Alignment-free sequence comparison-a review. Bioinformatics, 19, 513-523.
    • (2003) Bioinformatics , vol.19 , pp. 513-523
    • Vinga, S.1    Almeida, J.2
  • 52
    • 84860351022
    • Pattern matching through Chaos Game Representation: Bridging numerical and discrete data structures for biological sequence analysis
    • Vinga, S. et al. (2012) Pattern matching through Chaos Game Representation: bridging numerical and discrete data structures for biological sequence analysis. Algorithms Mol. Biol., 7, 10.
    • (2012) Algorithms Mol. Biol. , vol.7 , pp. 10
    • Vinga, S.1


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.