Volume 358, 2014, Pages 31-51

Extraction of high quality k-words for alignment-free sequence comparison

Author keywords

Phylogenetic analysis; Similarity analysis; Word frequency

Indexed keywords

PHYLOGENETICS; SIMILARITY INDEX

EID: 84902139133     PISSN: 00225193     EISSN: 10958541     Source Type: Journal    
DOI: 10.1016/j.jtbi.2014.05.016     Document Type: Article
Times cited: 8

References (43)
  • 2
    • 0022743812
    • A measure of the similarity of sets of sequences not requiring sequence alignment
    • Blaisdell B. A measure of the similarity of sets of sequences not requiring sequence alignment. Proc. Natl. Acad. Sci. USA 1986, 5155-5159.
    • (1986) Proc. Natl. Acad. Sci. USA , pp. 5155-5159
    • Blaisdell, B.1
  • 3
    • 53349103352
    • Approximate word matches between two random sequences
    • Burden C., Kantorovitz M., Wilson S. Approximate word matches between two random sequences. Ann. Appl. Probab. 2008, 18(1):1-21.
    • (2008) Ann. Appl. Probab. , vol.18 , Issue.1 , pp. 1-21
    • Burden, C.1    Kantorovitz, M.2    Wilson, S.3
  • 4
    • 0032703998
    • d2_cluster: a validated method for clustering EST and full-length cDNA sequences
    • Burke J., Davison D., Hide W. d2_cluster: a validated method for clustering EST and full-length cDNA sequences. Genome Res. 1999, 9(11):1135-1142.
    • (1999) Genome Res. , vol.9 , Issue.11 , pp. 1135-1142
    • Burke, J.1    Davison, D.2    Hide, W.3
  • 7
    • 77955082612
    • MS4, multi-scale selector of sequence signatures: an alignment-free method for classification of biological sequences
    • Corel E., Pitschi F., Laprevotte I., Grasseau G., Didier G., Devauchelle C. MS4, multi-scale selector of sequence signatures: an alignment-free method for classification of biological sequences. BMC Bioinform. 2010, 11(1):406.
    • (2010) BMC Bioinform. , vol.11 , Issue.1 , pp. 406
    • Corel, E.1    Pitschi, F.2    Laprevotte, I.3    Grasseau, G.4    Didier, G.5    Devauchelle, C.6
  • 8
    • 79951945251
    • Numerical characteristics of word frequencies and their application to dissimilarity measure for sequence comparison
    • Dai Q., Liu X., Yao Y., Zhao F. Numerical characteristics of word frequencies and their application to dissimilarity measure for sequence comparison. J. Theor. Biol. 2011, 276(1):174-180.
    • (2011) J. Theor. Biol. , vol.276 , Issue.1 , pp. 174-180
    • Dai, Q.1    Liu, X.2    Yao, Y.3    Zhao, F.4
  • 9
    • 0035327182
    • Brute force estimation of the number of human genes using EST clustering as a measure
    • Davison D., Burke J. Brute force estimation of the number of human genes using EST clustering as a measure. IBM J. Res. Dev. 2001, 45(3-4):439-447.
    • (2001) IBM J. Res. Dev. , vol.45 , Issue.3-4 , pp. 439-447
    • Davison, D.1    Burke, J.2
  • 11
    • 84873195021
    • Improving evolutionary models for mitochondrial protein data with site-class specific amino acid exchangeability matrices
    • Dunn K.A., Jiang W., Field C., Bielawski J.P. Improving evolutionary models for mitochondrial protein data with site-class specific amino acid exchangeability matrices. PLoS One 2013, 8(1):e55816.
    • (2013) PloS One , vol.8 , Issue.1
    • Dunn, K.A.1    Jiang, W.2    Field, C.3    Bielawski, J.P.4
  • 13
    • 84902123979
    • PHYLIP (Phylogeny Inference Package), version 3.5c.
    • Felsenstein, J., 1993. PHYLIP (Phylogeny Inference Package), version 3.5c.
    • (1993)
    • Felsenstein, J.1
  • 17
    • 0028505328
    • Biological evaluation of d2, an algorithm for high-performance sequence comparison
    • Hide W., Burke J., Davison D. Biological evaluation of d2, an algorithm for high-performance sequence comparison. J. Comput. Biol. 1994, 1(3):199-215.
    • (1994) J. Comput. Biol. , vol.1 , Issue.3 , pp. 199-215
    • Hide, W.1    Burke, J.2    Davison, D.3
  • 18
    • 34447260522
    • Is multiple-sequence alignment required for accurate inference of phylogeny?
    • Hohl M., Ragan M. Is multiple-sequence alignment required for accurate inference of phylogeny?. Syst. Biol. 2007, 56(2):206-221.
    • (2007) Syst. Biol. , vol.56 , Issue.2 , pp. 206-221
    • Hohl, M.1    Ragan, M.2
  • 19
    • 57149118661
    • Pattern-based phylogenetic distance estimation and tree reconstruction
    • Hohl M., Rigoutsos I., Ragan M. Pattern-based phylogenetic distance estimation and tree reconstruction. Evolut. Bioinform. Online 2006, 2:359-375.
    • (2006) Evolut. Bioinform. Online , vol.2 , pp. 359-375
    • Hohl, M.1    Rigoutsos, I.2    Ragan, M.3
  • 20
    • 34547844142
    • A statistical method for alignment-free comparison of regulatory sequences
    • Kantorovitz M., Robinson G., Sinha S. A statistical method for alignment-free comparison of regulatory sequences. Bioinformatics 2007, 23(13):i249-i255.
    • (2007) Bioinformatics , vol.23 , Issue.13
    • Kantorovitz, M.1    Robinson, G.2    Sinha, S.3
  • 21
    • 0035102453
    • An information-based sequence distance and its application to whole mitochondrial genome phylogeny
    • Li M., Badger J.H., Chen X., Kwong S., Kearney P., Zhang H. An information-based sequence distance and its application to whole mitochondrial genome phylogeny. Bioinformatics 2001, 17(2):149-154.
    • (2001) Bioinformatics , vol.17 , Issue.2 , pp. 149-154
    • Li, M.1    Badger, J.H.2    Chen, X.3    Kwong, S.4    Kearney, P.5    Zhang, H.6
  • 22
    • 0037195172
    • Distributional regimes for the number of k-word matches between two random sequences
    • Lippert R., Huang H., Waterman M. Distributional regimes for the number of k-word matches between two random sequences. Proc. Natl. Acad. Sci. 2002, 99(22):13980-13989.
    • (2002) Proc. Natl. Acad. Sci. , vol.99 , Issue.22 , pp. 13980-13989
    • Lippert, R.1    Huang, H.2    Waterman, M.3
  • 23
    • 79959929795
    • New powerful statistics for alignment-free sequence comparison under a pattern transfer model
    • Liu X., Wan L., Li J., Reinert G., Waterman M., Sun F. New powerful statistics for alignment-free sequence comparison under a pattern transfer model. J. Theor. Biol. 2011, 284(1):106-116.
    • (2011) J. Theor. Biol. , vol.284 , Issue.1 , pp. 106-116
    • Liu, X.1    Wan, L.2    Li, J.3    Reinert, G.4    Waterman, M.5    Sun, F.6
  • 24
    • 31144477556
    • Phylogenetic analysis of global hepatitis E virus sequences: genetic diversity, subtypes and zoonosis
    • Lu L., Li C., Hagedorn C. Phylogenetic analysis of global hepatitis E virus sequences: genetic diversity, subtypes and zoonosis. Rev. Med. Virol. 2006, 16(1):5-36.
    • (2006) Rev. Med. Virol. , vol.16 , Issue.1 , pp. 5-36
    • Lu, L.1    Li, C.2    Hagedorn, C.3
  • 26
    • 84902200439
    • Phylogen v. 1.1, Manual. [Internet] Available from: http://tree.bio.ed.ac.uk/software/phylogen (last accessed July 2013).
    • Rambaut, A., 2011. Phylogen v. 1.1, Manual. [Internet] Available from: http://tree.bio.ed.ac.uk/software/phylogen (last accessed July 2013).
    • (2011)
    • Rambaut, A.1
  • 27
    • 0030928378
    • Seq-Gen: an application for the Monte Carlo simulation of DNA sequence evolution along phylogenetic trees
    • Rambaut A., Grassly N.C. Seq-Gen: an application for the Monte Carlo simulation of DNA sequence evolution along phylogenetic trees. Comput. Appl. Biosci. 1997, 13(3):235-238.
    • (1997) Comput. Appl. Biosci. , vol.13 , Issue.3 , pp. 235-238
    • Rambaut, A.1    Grassly, N.C.2
  • 28
    • 75149164526
    • Alignment-free sequence comparison (I): statistics and power
    • Reinert G., Chew D., Sun F., Waterman M. Alignment-free sequence comparison (I): statistics and power. J. Comput. Biol. 2010, 16(12):1615-1634.
    • (2010) J. Comput. Biol. , vol.16 , Issue.12 , pp. 1615-1634
    • Reinert, G.1    Chew, D.2    Sun, F.3    Waterman, M.4
  • 29
    • 0019424782
    • Comparison of phylogenetic trees
    • Robinson D., Foulds L.R. Comparison of phylogenetic trees. Math. Biosci. 1981, 53(1):131-147.
    • (1981) Math. Biosci. , vol.53 , Issue.1 , pp. 131-147
    • Robinson, D.1    Foulds, L.R.2
  • 31
    • 0003839430
    • Maximum Likelihood Methods in Molecular Phylogenetics
    • Ludwig Maximilian University of Munich, Germany
    • Strimmer K.S. Maximum Likelihood Methods in Molecular Phylogenetics. Ph.D. Thesis 1997, Ludwig Maximilian University of Munich, Germany.
    • (1997) Ph.D. Thesis
    • Strimmer, K.S.1
  • 33
    • 84902098992
    • Multivariate data analysis
    • Var I. Multivariate data analysis. Vectors 1998, 8:6.
    • (1998) Vectors , vol.8 , pp. 6
    • Var, I.1
  • 34
    • 0037342499
    • Alignment-free sequence comparison review
    • Vinga S., Almeida J. Alignment-free sequence comparison review. Bioinformatics 2003, 19(4):513-523.
    • (2003) Bioinformatics , vol.19 , Issue.4 , pp. 513-523
    • Vinga, S.1    Almeida, J.2
  • 35
    • 78349292948
    • Alignment-free sequence comparison (II): theoretical power of comparison statistics
    • Wan L., Reinert G., Sun F., Waterman M. Alignment-free sequence comparison (II): theoretical power of comparison statistics. J. Comput. Biol. 2010, 17(11):1467-1490.
    • (2010) J. Comput. Biol. , vol.17 , Issue.11 , pp. 1467-1490
    • Wan, L.1    Reinert, G.2    Sun, F.3    Waterman, M.4
  • 36
    • 49849083665
    • WSE, a new sequence distance measure based on word frequencies
    • Wang J., Zheng X. WSE, a new sequence distance measure based on word frequencies. Math. Biosci. 2008, 215(1):78-83.
    • (2008) Math. Biosci. , vol.215 , Issue.1 , pp. 78-83
    • Wang, J.1    Zheng, X.2
  • 38
    • 0031437248
    • A measure of DNA sequence dissimilarity based on Mahalanobis distance between frequencies of words
    • Wu T., Burke J., Davison D. A measure of DNA sequence dissimilarity based on Mahalanobis distance between frequencies of words. Biometrics 1997, 1431-1439.
    • (1997) Biometrics , pp. 1431-1439
    • Wu, T.1    Burke, J.2    Davison, D.3
  • 39
    • 0035013276
    • Statistical measures of DNA sequence dissimilarity under Markov chain models of base composition
    • Wu T., Hsieh Y., Li L. Statistical measures of DNA sequence dissimilarity under Markov chain models of base composition. Biometrics 2001, 441-448.
    • (2001) Biometrics , pp. 441-448
    • Wu, T.1    Hsieh, Y.2    Li, L.3
  • 40
    • 27944434972
    • Optimal word sizes for dissimilarity measures and estimation of the degree of dissimilarity between DNA sequences
    • Wu T., Huang Y., Li L. Optimal word sizes for dissimilarity measures and estimation of the degree of dissimilarity between DNA sequences. Bioinformatics 2005, 21(22):4125-4132.
    • (2005) Bioinformatics , vol.21 , Issue.22 , pp. 4125-4132
    • Wu, T.1    Huang, Y.2    Li, L.3
  • 41
    • 77649338338
    • The Burrows-Wheeler similarity distribution between biological sequences based on Burrows-Wheeler transform
    • Yang L., Zhang X., Wang T. The Burrows-Wheeler similarity distribution between biological sequences based on Burrows-Wheeler transform. J. Theor. Biol. 2010, 262:742-749.
    • (2010) J. Theor. Biol. , vol.262 , pp. 742-749
    • Yang, L.1    Zhang, X.2    Wang, T.3
  • 42
    • 84870524504
    • A novel statistical measure for sequence comparison on the basis of k-word counts
    • Yang X., Wang T. A novel statistical measure for sequence comparison on the basis of k-word counts. J. Theor. Biol. 2013, 318:91-100.
    • (2013) J. Theor. Biol. , vol.318 , pp. 91-100
    • Yang, X.1    Wang, T.2
  • 43
    • 83055163716
    • Alignment free comparison: similarity distribution between the DNA primary sequences based on the shortest absent word
    • H.Z.
    • Yang L., Zhang X., H Z. Alignment free comparison: similarity distribution between the DNA primary sequences based on the shortest absent word. J. Theor. Biol. 2012, 295:125-131.
    • (2012) J. Theor. Biol. , vol.295 , pp. 125-131
    • Yang, L.1    Zhang, X.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.