



Volume 9, 2008

Compressing DNA sequence databases with coil

Author keywords

[No Author keywords available]

Indexed keywords

DATA-COMMUNICATION; DNA SEQUENCE DATABASE; EXPONENTIAL RATES; EXPRESSED SEQUENCE TAGS; HIGH COMPRESSION RATIO; NARROW DISTRIBUTION; PORTABLE SOFTWARE; SEQUENCE DATABASE;

EID: 45249110222     PISSN: None     EISSN: 14712105     Source Type: Journal    
DOI: 10.1186/1471-2105-9-242     Document Type: Article
Times cited: 12

References (38)
  • 1
    • 84874668040
    • NCBI. NCBI-GenBank Flat File Release 159 Release Notes. ftp://ftp.ncbi.nih.gov/genbank/release.notes/gb159.release.notes
  • 2
    • 84874662915
    • Benson D, Wheeler D. GenBank Passes the 100 Gigabase Mark. NCBI News. http://www.ncbi.nlm.nih.gov/Web/Newsltr/V14N2/100gig.html
  • 4
    • 84886762610
    • Gailly J, Adler M. gzip (GNU zip) compression utility. http://www.gnu.org/software/gzip/
  • 5
    • 0034578442
    • Matsumoto T, Sadakane K, Imai H. Biological sequence compression algorithms. December 18-19; Tokyo. Universal Academy Press; 2000, pp. 43-52.
  • 6
    • 84925291641
    • Grumbach S, Tahi F. Compression of DNA sequences. 30 March-2 April; Snowbird, Utah. Edited by Storer JA, Cohn M. IEEE Computer Society Press; 1993, pp. 340-350.
  • 7
    • 0000100455
    • Grumbach S, Tahi F. A New Challenge for Compression Algorithms - Genetic Sequences. Inf Process Manage 1994, 30:875-886.
  • 9
    • 0036947893
    • Chen X, Li M, Ma B, Tromp J. DNACompress: Fast and effective DNA sequence compression. Bioinformatics 2002, 18:1696-1698. PubMed ID 12490460.
  • 10
    • 0035102453
    • Li M, Badger JH, Chen X, Kwong S, Kearney P, Zhang HY. An information-based sequence distance and its application to whole mitochondrial genome phylogeny. Bioinformatics 2001, 17:149-154. PubMed ID 11238070.
  • 11
    • 32544454688
    • Kocsor A, Kertesz-Farkas A, Kajan L, Pongor S. Application of compression-based distance measures to protein sequence classification: a methodological study. Bioinformatics 2006, 22:407-412. PubMed ID 16317070.
  • 12
    • 0036202921
    • Ma B, Tromp J, Li M. PatternHunter: Faster and more sensitive homology search. Bioinformatics 2002, 18:440-445. PubMed ID 11934743.
  • 13
    • 0028826043
    • Strelets VB, Lim HA. Compression of Protein-Sequence Databases. Comput Appl Biosci 1995, 11:557-561. PubMed ID 8590180.
  • 15
    • 84874651569
    • Katz P. PKZIP, version 1.1. Milwaukee, WI, USA: PKWARE, Inc.; 1990. http://www.pkware.com/
  • 16
    • 0035072551
    • Li WZ, Jaroszewski L, Godzik A. Clustering of highly homologous sequences to reduce the size of large protein databases. Bioinformatics 2001, 17:282-283. PubMed ID 11294794.
  • 17
    • 0036169928
    • Li WZ, Jaroszewski L, Godzik A. Tolerating some redundancy significantly speeds up clustering of large protein databases. Bioinformatics 2002, 18:77-82. PubMed ID 11836214.
  • 18
    • 33745634395
    • Li WZ, Godzik A. Cd-hit: A fast program for clustering and comparing large sets of protein or nucleotide sequences. Bioinformatics 2006, 22:1658-1659. PubMed ID 16731699.
  • 19
    • 84886758809
    • nrdb http://blast.wustl.edu/pub/nrdb/
  • 20
    • 0027968068
    • Thompson JD, Higgins DG, Gibson TJ. Clustal-W - Improving the Sensitivity of Progressive Multiple Sequence Alignment through Sequence Weighting, Position-Specific Gap Penalties and Weight Matrix Choice. Nucleic Acids Res 1994, 22:4673-4680. PubMed Central ID 308517; PubMed ID 7984417.
  • 22
    • 0012574730
    • Chazelle B. A minimum spanning tree algorithm with Inverse Ackermann type complexity. Journal of the ACM 2000, 47:1028-1047.
  • 23
    • 30544432152
    • Ferragina P, Manzini G. Indexing compressed text. J ACM 2005, 52:552-581.
  • 24
    • 33750283805
    • Russo LMS, Oliveira AL. A compressed self-index using a Ziv-Lempel dictionary. String Processing and Information Retrieval, Proceedings. Lecture Notes in Computer Science, vol. 4209. Berlin: Springer-Verlag; 2006, pp. 163-180.
  • 25
    • 33846576963
    • Foschini L, Grossi R, Gupta A, Vitter JS. When indexing equals compression: Experiments with compressing suffix arrays and applications. ACM Trans Algorithms 2006, 2:611-639.
  • 26
    • 0021919480
    • Lipman DJ, Pearson WR. Rapid and Sensitive Protein Similarity Searches. Science 1985, 227:1435-1441. PubMed ID 2983426.
  • 27
    • 77954935024
    • Seward J. bzip2 and libbzip2 - A program and library for data compression, version 1.0.3. 1997.
  • 28
    • 0017492836
    • Hunt JW, Szymanski TG. A Fast Algorithm for Computing Longest Common Subsequences. Communications of the ACM 1977, 20:350-353.
  • 29
    • 0034764307
    • Ning ZM, Cox AJ, Mullikin JC. SSAHA: A fast search method for large DNA databases. Genome Res 2001, 11:1725-1729. PubMed Central ID 311141; PubMed ID 11591649.
  • 30
    • 45249103790
    • Burkhardt S, Karkkainen J. One-gapped q-gram filters for Levenshtein distance. Combinatorial Pattern Matching. Lecture Notes in Computer Science, vol. 2373. Berlin: Springer-Verlag; 2002, pp. 225-234.
  • 31
    • 70350674995
    • Kruskal JB Jr. On the Shortest Spanning Subtree of a Graph and the Traveling Salesman Problem. Proceedings of the American Mathematical Society 1956, 7:48-50.
  • 32
    • 84911584312
    • Prim RC. Shortest Connection Networks and Some Generalizations. Bell System Technical Journal 1957, 36:1389-1401.
  • 33
    • 0003827295
    • Moret B, Shapiro H. Algorithms from P to NP: Design and Efficiency. Redwood City, CA: Benjamin/Cummings; 1991.
  • 34
    • 0016495233
    • Tarjan RE. Efficiency of a Good but Not Linear Set Union Algorithm. J ACM 1975, 22:215-225.
  • 35
    • 33745128489
    • Myers EW. An O(ND) Difference Algorithm and its Variations. Algorithmica 1986, 1:251-266.
  • 36
    • 84886755999
    • GenBank Sequence Database http://www.ncbi.nlm.nih.gov/Genbank/index.html
  • 38
    • 84886764769
    • 7-Zip http://www.7-zip.org


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.