2007, Pages 43-52

A simple statistical algorithm for biological sequence compression

Author keywords

[No Author keywords available]

Indexed keywords

DATA STRUCTURES; DNA SEQUENCES; ENCODING (SYMBOLS); PROBABILITY DISTRIBUTIONS; PROTEINS; STATISTICAL METHODS

EID: 34547630480
PISSN: 10680314
EISSN: None
Source Type: Conference Proceeding
DOI: 10.1109/DCC.2007.7
Document Type: Conference Paper
Times cited: 163

References (26)
  • 1
    • 34547639707
    • On compressibility of protein sequences
    • D. Adjeroh and F. Nan. On compressibility of protein sequences. DCC, pages 422-434, 2006.
    • (2006) DCC , pp. 422-434
    • Adjeroh, D.1    Nan, F.2
  • 2
    • 0031602647
    • Compression of strings with approximate repeats
    • L. Allison, T. Edgoose, and T. I. Dix. Compression of strings with approximate repeats. ISMB, pages 8-16, 1998.
    • (1998) ISMB , pp. 8-16
    • Allison, L.1    Edgoose, T.2    Dix, T.I.3
  • 3
    • 0033899414
    • Compression of biological sequences by greedy off-line textual substitution
    • A. Apostolico and S. Lonardi. Compression of biological sequences by greedy off-line textual substitution. DCC, pages 143-152, 2000.
    • (2000) DCC , pp. 143-152
    • Apostolico, A.1    Lonardi, S.2
  • 4
    • 26444479436
    • DNA compression challenge revisited: A dynamic programming approach
    • B. Behzadi and F. L. Fessant. DNA compression challenge revisited: A dynamic programming approach. CPM, pages 190-200, 2005.
  • 5
    • 0014516784
    • The information content of a multistate distribution
    • D. M. Boulton and C. S. Wallace. The information content of a multistate distribution. Theoretical Biology, 23(2):269-278, 1969.
    • (1969) Theoretical Biology , vol.23 , Issue.2 , pp. 269-278
    • Boulton, D.M.1    Wallace, C.S.2
  • 6
    • 0033691587
    • A compression algorithm for DNA sequences and its applications in genome comparison
    • X. Chen, S. Kwong, and M. Li. A compression algorithm for DNA sequences and its applications in genome comparison. RECOMB, page 107, 2000.
    • (2000) RECOMB , pp. 107
    • Chen, X.1    Kwong, S.2    Li, M.3
  • 7
    • 0036947893
    • DNACompress: Fast and effective DNA sequence compression
    • X. Chen, M. Li, B. Ma, and J. Tromp. DNACompress: Fast and effective DNA sequence compression. Bioinformatics, 18(12):1696-1698, Dec 2002.
    • (2002) Bioinformatics , vol.18 , Issue.12 , pp. 1696-1698
    • Chen, X.1    Li, M.2    Ma, B.3    Tromp, J.4
  • 8
    • 0021405335
    • Data compression using adaptive coding and partial string matching
    • J. G. Cleary and I. H. Witten. Data compression using adaptive coding and partial string matching. IEEE Trans. Comm., COM-32(4):396-402, April 1984.
    • (1984) IEEE Trans. Comm , vol.COM-32 , Issue.4 , pp. 396-402
    • Cleary, J.G.1    Witten, I.H.2
  • 10
    • 45149113022
    • Comparative analysis of long DNA sequences by per element information content using different contexts
    • T. I. Dix, D. R. Powell, L. Allison, S. Jaeger, J. Bernal, and L. Stern. Comparative analysis of long DNA sequences by per element information content using different contexts. BMC Bioinformatics, to appear, 2007.
    • (2007) BMC Bioinformatics
    • Dix, T.I.1    Powell, D.R.2    Allison, L.3    Jaeger, S.4    Bernal, J.5    Stern, L.6
  • 11
    • 84925291641
    • Compression of DNA sequences
    • S. Grumbach and F. Tahi. Compression of DNA sequences. DCC, pages 340-350, 1993.
    • (1993) DCC , pp. 340-350
    • Grumbach, S.1    Tahi, F.2
  • 12
    • 0000100455
    • A new challenge for compression algorithms: Genetic sequences
    • S. Grumbach and F. Tahi. A new challenge for compression algorithms: Genetic sequences. Inf. Process. Manage., 30(6):875-886, 1994.
    • (1994) Inf. Process. Manage , vol.30 , Issue.6 , pp. 875-886
    • Grumbach, S.1    Tahi, F.2
  • 13
    • 11844278466
    • Protein is compressible
    • A. Hategan and I. Tabus. Protein is compressible. NORSIG, pages 192-195, 2004.
    • (2004) NORSIG , pp. 192-195
    • Hategan, A.1    Tabus, I.2
  • 14
    • 13844281512
    • An efficient normalized maximum likelihood algorithm for DNA sequence compression
    • G. Korodi and I. Tabus. An efficient normalized maximum likelihood algorithm for DNA sequence compression. ACM Trans. Inf. Syst., 23(1):3-34, 2005.
    • (2005) ACM Trans. Inf. Syst , vol.23 , Issue.1 , pp. 3-34
    • Korodi, G.1    Tabus, I.2
  • 15
    • 0032919622
    • Significantly lower entropy estimates for natural DNA sequences
    • D. Loewenstern and P. N. Yianilos. Significantly lower entropy estimates for natural DNA sequences. Computational Biology, 6(1): 125-142, 1999.
    • (1999) Computational Biology , vol.6 , Issue.1 , pp. 125-142
    • Loewenstern, D.1    Yianilos, P.N.2
  • 16
  • 17
    • 34547644450
    • Personal communication
    • F. Nan. Personal communication, 2006.
    • (2006)
    • Nan, F.1
  • 18
    • 0032647604
    • Protein is incompressible
    • C. G. Nevill-Manning and I. H. Witten. Protein is incompressible. DCC, pages 257-266, 1999.
    • (1999) DCC , pp. 257-266
    • Nevill-Manning, C.G.1    Witten, I.H.2
  • 20
    • 26444510631
    • A guaranteed compression scheme for repetitive DNA sequences
    • E. Rivals, J.-P. Delahaye, M. Dauchet, and O. Delgrange. A guaranteed compression scheme for repetitive DNA sequences. DCC, page 453, 1996.
    • (1996) DCC , pp. 453
    • Rivals, E.1    Delahaye, J.-P.2    Dauchet, M.3    Delgrange, O.4
  • 22
    • 34547630306
    • DNA sequence compression using the normalized maximum likelihood model for discrete regression
    • I. Tabus, G. Korodi, and J. Rissanen. DNA sequence compression using the normalized maximum likelihood model for discrete regression. DCC, page 253, 2003.
    • (2003) DCC , pp. 253
    • Tabus, I.1    Korodi, G.2    Rissanen, J.3
  • 24
    • 0023364261
    • Arithmetic coding for data compression
    • I. H. Witten, R. M. Neal, and J. G. Cleary. Arithmetic coding for data compression. Comm ACM, 30(6):520-540, June 1987.
    • (1987) Comm ACM , vol.30 , Issue.6 , pp. 520-540
    • Witten, I.H.1    Neal, R.M.2    Cleary, J.G.3
  • 25
    • 0017493286
    • A universal algorithm for sequential data compression
    • J. Ziv and A. Lempel. A universal algorithm for sequential data compression. IEEE Trans. Inf. Theory, 23(3):337-342, May 1977.
    • (1977) IEEE Trans. Inf. Theory , vol.23 , Issue.3 , pp. 337-342
    • Ziv, J.1    Lempel, A.2
  • 26
    • 0018019231
    • Compression of individual sequences via variable-rate coding
    • J. Ziv and A. Lempel. Compression of individual sequences via variable-rate coding. IEEE Trans. Inf. Theory, 24(5):530-536, 1978.
    • (1978) IEEE Trans. Inf. Theory , vol.24 , Issue.5 , pp. 530-536
    • Ziv, J.1    Lempel, A.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.