



Volume 81, Issue 2, 2006, Pages 137-153

n-Gram-based classification and unsupervised hierarchical clustering of genome sequences

Author keywords

Classification; Genome sequence; Hierarchical clustering; n-Gram

Indexed keywords

COMPUTER AIDED DIAGNOSIS; GENETIC ENGINEERING; PROBLEM SOLVING; STATISTICAL METHODS;

EID: 31344463462     PISSN: 01692607     EISSN: None     Source Type: Journal    
DOI: 10.1016/j.cmpb.2005.11.007     Document Type: Article
Times cited: 92

References (42)
  • 1
    • 0023375195
    • The neighbor-joining method: A new method for reconstructing phylogenetic trees
    • N. Saitou, and M. Nei The neighbor-joining method: a new method for reconstructing phylogenetic trees Mol. Biol. Evol. 4 1987 406 425
    • (1987) Mol. Biol. Evol. , vol.4 , pp. 406-425
    • Saitou, N.1    Nei, M.2
  • 3
    • 0000122573
    • PHYLIP: Phylogeny inference package (version 3.2)
    • J. Felsenstein PHYLIP: phylogeny inference package (version 3.2) Cladistics 5 1989 164 166
    • (1989) Cladistics , vol.5 , pp. 164-166
    • Felsenstein, J.1
  • 5
    • 0000116740
    • A method for deducing branching sequences in phylogeny
    • J.H. Camin, and R.R. Sokal A method for deducing branching sequences in phylogeny Evolution 19 1965 311 327
    • (1965) Evolution , vol.19 , pp. 311-327
    • Camin, J.H.1    Sokal, R.R.2
  • 6
    • 0031686899
    • New phylogenetic venues opened by a novel implementation of the dnaml algorithm
    • O. Trelles, C. Ceron, H.C. Wang, J. Dopazo, and J.M. Carazo New phylogenetic venues opened by a novel implementation of the dnaml algorithm Bioinformatics 14 1998 544 545
    • (1998) Bioinformatics , vol.14 , pp. 544-545
    • Trelles, O.1    Ceron, C.2    Wang, H.C.3    Dopazo, J.4    Carazo, J.M.5
  • 7
    • 0034818755
    • Taking variation of evolutionary rates between sites into account in inferring phylogenies
    • J. Felsenstein Taking variation of evolutionary rates between sites into account in inferring phylogenies J. Mol. Evol. 53 2001 447 455
    • (2001) J. Mol. Evol. , vol.53 , pp. 447-455
    • Felsenstein, J.1
  • 8
    • 0027957393
    • Fastdnaml: A tool for construction of phylogenetic trees of DNA sequences using maximum likelihood
    • G.J. Olsen, H. Matsuda, R. Hagstrom, and R. Overbeek Fastdnaml: a tool for construction of phylogenetic trees of DNA sequences using maximum likelihood Comput. Appl. Biosci. 10 1994 41 48
    • (1994) Comput. Appl. Biosci. , vol.10 , pp. 41-48
    • Olsen, G.J.1    Matsuda, H.2    Hagstrom, R.3    Overbeek, R.4
  • 9
    • 3242890106
    • CVTree: A phylogenetic tree reconstruction tool based on whole genomes
    • J. Qi, H. Luo, and B. Hao CVTree: a phylogenetic tree reconstruction tool based on whole genomes Nucleic Acids Res. 32 2004 45 47
    • (2004) Nucleic Acids Res. , vol.32 , pp. 45-47
    • Qi, J.1    Luo, H.2    Hao, B.3
  • 11
    • 11144231530
    • Department of Computer Science, University of Missouri-Rolla
    • D. Tauritz, Application of n-Grams, Department of Computer Science, University of Missouri-Rolla, 2002.
    • (2002) Application of n-Grams
    • Tauritz, D.1
  • 12
    • 0023255153
    • Effective text compression with simultaneous digram and trigram encoding
    • J.L. Wisniewski Effective text compression with simultaneous digram and trigram encoding J. Inform. Sci. 13 1997 159 164
    • (1997) J. Inform. Sci. , vol.13 , pp. 159-164
    • Wisniewski, J.L.1
  • 13
    • 0020685593
    • Automatic spelling correction using trigram similarity measure
    • R.C. Angell, G.E. Freund, and P. Willett Automatic spelling correction using trigram similarity measure Inform. Process. Manage. 19 1983 255 261
    • (1983) Inform. Process. Manage. , vol.19 , pp. 255-261
    • Angell, R.C.1    Freund, G.E.2    Willett, P.3
  • 14
  • 16
    • 3843083229
    • Experiments with syntactic traces in information retrieval
    • T. De Heer Experiments with syntactic traces in information retrieval Inform. Storage Retrieval 10 1974 133 144
    • (1974) Inform. Storage Retrieval , vol.10 , pp. 133-144
    • De Heer, T.1
  • 17
    • 31344472238
    • Trigram-based method of language identification, U.S. Patent 5,062,143 (October 1991).
    • J.C. Schmitt, Trigram-based method of language identification, U.S. Patent 5,062,143 (October 1991).
    • Schmitt, J.C.1
  • 20
    • 0034593307
    • Characterizing the behavior of a program using multiple-length n-grams
    • Ballycotton, County Cork, Ireland
    • C. Marceau Characterizing the behavior of a program using multiple-length n-grams Proceedings of the 2000 Workshop on New Security Paradigms Ballycotton, County Cork, Ireland 2001 101 110
    • (2001) Proceedings of the 2000 Workshop on New Security Paradigms , pp. 101-110
    • Marceau, C.1
  • 22
    • 0027512932
    • A novel method of protein sequence classification based on oligopeptide frequency analysis and its application to search for functional sites and to domain localization
    • V.V. Solovyev, and K.S. Makarova A novel method of protein sequence classification based on oligopeptide frequency analysis and its application to search for functional sites and to domain localization Comput. Appl. Biosci. 9 1993 17 24
    • (1993) Comput. Appl. Biosci. , vol.9 , pp. 17-24
    • Solovyev, V.V.1    Makarova, K.S.2
  • 23
    • 13944255457
    • Protein classification based on text document classification techniques
    • B.Y. Cheng, J.G. Carbonell, and J. Klein-Seetharaman Protein classification based on text document classification techniques Proteins 58 2005 955 970
    • (2005) Proteins , vol.58 , pp. 955-970
    • Cheng, B.Y.1    Carbonell, J.G.2    Klein-Seetharaman, J.3
  • 24
    • 0028911698
    • Gauging similarity with n-grams: Language-independent categorization of text
    • M. Damashek Gauging similarity with n-grams: language-independent categorization of text Science 267 1995 843 848
    • (1995) Science , vol.267 , pp. 843-848
    • Damashek, M.1
  • 25
    • 34250337525
    • National Cancer Institute
    • Dictionary of Cancer Terms, National Cancer Institute, on-line at (last accessed March 2005): http://www.cancer.gov/dictionary.
    • Dictionary of Cancer Terms
  • 26
    • 31344470736
    • Rockefeller University
    • A Glossary of Genetics, Rockefeller University, on-line at (last accessed March 2005): http://linkage.rockefeller.edu/wli/glossary/genetics.html.
    • A Glossary of Genetics
  • 27
    • 31344438452
    • R. Bowen, Molecular Toolkit Help, on-line at (last accessed March 2005): http://arbl.cvmbs.colostate.edu/molkit/help.html.
    • Molecular Toolkit Help
    • Bowen, R.1
  • 29
    • 31344453798
    • The Natural History Museum, Nature Navigator, on-line at (last accessed March 2005): http://internt.nhm.ac.uk/jdsml/naturenavigator/naturenamed/index.dsml.
    • Nature Navigator
  • 30
    • 31344459324
    • J.W. Kimball, Kimball's Biology Pages, on-line at (last accessed March 2005): http://users.rcn.com/jkimball.ma.ultranet/BiologyPages/T/Taxonomy.html.
    • Kimball's Biology Pages
    • Kimball, J.W.1
  • 32
    • 0036184783
    • Evolution of a human immunodeficiency virus type 1 variant with enhanced replication in pig-tailed macaque cells by DNA shuffling
    • K. Pekrun, R. Shibata, T. Igarashi, M. Reed, L. Sheppard, P.A. Patten, W.P.C. Stemmer, M.A. Martin, and N. Soong Evolution of a human immunodeficiency virus type 1 variant with enhanced replication in pig-tailed macaque cells by DNA shuffling J. Virol. 76 2002 2924 2935
    • (2002) J. Virol. , vol.76 , pp. 2924-2935
    • Pekrun, K.1    Shibata, R.2    Igarashi, T.3    Reed, M.4    Sheppard, L.5    Patten, P.A.6    Stemmer, W.P.C.7    Martin, M.A.8    Soong, N.9
  • 33
    • 0000825481
    • A statistical method for evaluating systematic relationships
    • R.R. Sokal, and C.D. Michener A statistical method for evaluating systematic relationships Univ. Kansas Sci. Bull. 28 1958 1409 1438
    • (1958) Univ. Kansas Sci. Bull. , vol.28 , pp. 1409-1438
    • Sokal, R.R.1    Michener, C.D.2
  • 35
    • 0035102453
    • An information based sequence distance and its application to whole mitochondrial genome phylogeny
    • M. Li, J.H. Badger, C. Xin, S. Kwong, P. Kearney, and H. Zhang An information based sequence distance and its application to whole mitochondrial genome phylogeny Bioinformatics 17 2001 149 154
    • (2001) Bioinformatics , vol.17 , pp. 149-154
    • Li, M.1    Badger, J.H.2    Xin, C.3    Kwong, S.4    Kearney, P.5    Zhang, H.6
  • 36
    • 0029060923
    • Dinucleotide relative abundance extremes: A genomic signature
    • S. Karlin, and C. Burge Dinucleotide relative abundance extremes: a genomic signature Trends Genet. 11 2000 283 290
    • (2000) Trends Genet. , vol.11 , pp. 283-290
    • Karlin, S.1    Burge, C.2
  • 37
    • 1242335920
    • Whole proteome prokaryote phylogeny without sequence alignment: A k-string composition approach
    • J. Qi, B. Wang, and B.I. Hao Whole proteome prokaryote phylogeny without sequence alignment: a k-string composition approach J. Mol. Evol. 58 2004 2924 2935
    • (2004) J. Mol. Evol. , vol.58 , pp. 2924-2935
    • Qi, J.1    Wang, B.2    Hao, B.I.3
  • 40
    • 4043104082
    • A whole genome perspective on the phylogeny of the plant virus family Tombusviridae
    • G. Stuart, K. Moffett, and R.F. Bozarth A whole genome perspective on the phylogeny of the plant virus family Tombusviridae Arch. Virol. 149 2004 1595 1610
    • (2004) Arch. Virol. , vol.149 , pp. 1595-1610
    • Stuart, G.1    Moffett, K.2    Bozarth, R.F.3
  • 42
    • 0040920369
    • OMIM (TM), McKusick-Nathans Institute for Genetic Medicine, Johns Hopkins University (Baltimore, MD) and National Center for Biotechnology Information, National Library of Medicine (Bethesda, MD)
    • Online Mendelian Inheritance in Man, OMIM (TM), McKusick-Nathans Institute for Genetic Medicine, Johns Hopkins University (Baltimore, MD) and National Center for Biotechnology Information, National Library of Medicine (Bethesda, MD), 2000. URL: http://www.ncbi.nlm.nih.gov/omim/.
    • (2000) Online Mendelian Inheritance in Man


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.