Volume 25, Issue 13, 2009, Pages 1575-1586

Textual data compression in computational biology: A synopsis

Author keywords

[No Author keywords available]

Indexed keywords

DNA; MICRORNA;

EID: 67649170975     PISSN: 13674803     EISSN: 14602059     Source Type: Journal    
DOI: 10.1093/bioinformatics/btp117     Document Type: Review
Times cited : (81)

References (167)
  • 1
    • 3042683172
    • Information theory in molecular biology
    • Adami,C. (2004) Information theory in molecular biology. Phys. Life Rev., 1, 3-22.
    • (2004) Phys. Life Rev , vol.1 , pp. 3-22
    • Adami, C.1
  • 4
    • 38749149558
    • Identifying statistical dependence in genomic sequences via mutual information estimates
    • Aktulga,H.M. et al. (2007) Identifying statistical dependence in genomic sequences via mutual information estimates. EURASIP J. Bioinform. Syst. Biol., 2007, 1-11.
    • (2007) EURASIP J. Bioinform. Syst. Biol , vol.2007 , pp. 1-11
    • Aktulga, H.M.1
  • 5
    • 0025116731
    • Minimum message length encoding and the comparison of macromolecules
    • Allison,L. and Yee,C.N. (1990) Minimum message length encoding and the comparison of macromolecules. Bull. Math. Biol., 52, 431-453.
    • (1990) Bull. Math. Biol , vol.52 , pp. 431-453
    • Allison, L.1    Yee, C.N.2
  • 6
    • 0033630766
    • Sequence complexity for biological sequence analysis
    • Allison,L. et al. (2000) Sequence complexity for biological sequence analysis. Comput. Chem., 24, 43-55.
    • (2000) Comput. Chem , vol.24 , pp. 43-55
    • Allison, L.1
  • 8
    • 0025183708
    • Basic local alignment search tool
    • Altschul,S.F. et al. (1990) Basic local alignment search tool. J. Mol. Biol., 215, 403-410.
    • (1990) J. Mol. Biol , vol.215 , pp. 403-410
    • Altschul, S.F.1
  • 9
    • 0041664867
    • Finding haplotype block boundaries by using the minimum-description-length principle
    • Anderson,E.C. and Novembre,J. (2003) Finding haplotype block boundaries by using the minimum-description-length principle. Am. J. Hum. Genet., 73, 336-354.
    • (2003) Am. J. Hum. Genet , vol.73 , pp. 336-354
    • Anderson, E.C.1    Novembre, J.2
  • 13
    • 34248385154
    • Mining, compressing and classifying with extensible motifs
    • Apostolico,A. et al. (2006) Mining, compressing and classifying with extensible motifs. Alg. Mol. Biol., 1, 4.
    • (2006) Alg. Mol. Biol , vol.1 , pp. 4
    • Apostolico, A.1
  • 15
    • 67649141138
    • A DNA sequence compression algorithm based on LUT and LZ77
    • Bao,S. et al. (2005) A DNA sequence compression algorithm based on LUT and LZ77. CoRR, abs/cs/0504100.
    • (2005) CoRR
    • Bao, S.1
  • 16
    • 0032183995
    • The minimum description length principle in coding and modeling
    • Barron,A.R. et al. (1998) The minimum description length principle in coding and modeling. IEEE Trans. Inform. Theory, 44, 2743-2760.
    • (1998) IEEE Trans. Inform. Theory , vol.44 , pp. 2743-2760
    • Barron, A.R.1
  • 17
    • 16844376909
    • Reverse engineering of regulatory networks in human B cells
    • Basso,K. et al. (2005) Reverse engineering of regulatory networks in human B cells. Nat. Genet., 37, 382-390.
    • (2005) Nat. Genet , vol.37 , pp. 382-390
    • Basso, K.1
  • 18
    • 26444479436
    • DNA compression challenge revisited: A dynamic programming approach
    • Behzadi,B. and Fessant,F.L. (2005) DNA compression challenge revisited: a dynamic programming approach. In CPM, Springer, pp. 190-200.
  • 19
    • 0035109647
    • Variations on probabilistic suffix trees: Statistical modeling and prediction of protein families
    • Bejerano,G. and Yona,G. (2001) Variations on probabilistic suffix trees: statistical modeling and prediction of protein families. Bioinformatics, 17, 23-43.
    • (2001) Bioinformatics , vol.17 , pp. 23-43
    • Bejerano, G.1    Yona, G.2
  • 20
    • 4944246972
    • Dynamical systems and computable information
    • Benci,V. et al. (2004) Dynamical systems and computable information. Discrete Contin. Dyna. Syst. B, 4, 935-960.
    • (2004) Discrete Contin. Dyna. Syst. B , vol.4 , pp. 935-960
    • Benci, V.1
  • 21
    • 53649092768
    • Compressing proteomes: The relevance of medium range correlations
    • Benedetto,D. et al. (2007) Compressing proteomes: The relevance of medium range correlations. EURASIP J. Bioinform. Syst. Biol., 2007, 1-8.
    • (2007) EURASIP J. Bioinform. Syst. Biol , vol.2007 , pp. 1-8
    • Benedetto, D.1
  • 22
    • 4243764255
    • Compositional segmentation and long-range fractal correlations in DNA sequences
    • Bernaola-Galván,P. et al. (1996) Compositional segmentation and long-range fractal correlations in DNA sequences. Phys. Rev. E, 53, 5181-5189.
    • (1996) Phys. Rev. E , vol.53 , pp. 5181-5189
    • Bernaola-Galván, P.1
  • 23
    • 0000460109
    • Decomposition of DNA sequence complexity
    • Bernaola-Galván,P. et al. (1999) Decomposition of DNA sequence complexity. Phys. Rev. Lett., 83, 3336-3339.
    • (1999) Phys. Rev. Lett , vol.83 , pp. 3336-3339
    • Bernaola-Galván, P.1
  • 24
    • 0034238085
    • Finding borders between coding and noncoding DNA regions by an entropic segmentation method
    • Bernaola-Galván,P. et al. (2000) Finding borders between coding and noncoding DNA regions by an entropic segmentation method. Phys. Rev. Lett., 85, 1342-1345.
    • (2000) Phys. Rev. Lett , vol.85 , pp. 1342-1345
    • Bernaola-Galván, P.1
  • 25
    • 0023472826
    • CpG-rich islands as gene markers in the vertebrate nucleus
    • Bird,A.P. (1987) CpG-rich islands as gene markers in the vertebrate nucleus. Trends Genet., 3, 342-347.
    • (1987) Trends Genet , vol.3 , pp. 342-347
    • Bird, A.P.1
  • 27
    • 2642530436
    • DNA sequence analysis linguistic tools: Contrast vocabularies, compositional spectra and linguistic complexity
    • Bolshoy,A. (2003) DNA sequence analysis linguistic tools: Contrast vocabularies, compositional spectra and linguistic complexity. Appl. Bioinform., 2, 103-112.
    • (2003) Appl. Bioinform , vol.2 , pp. 103-112
    • Bolshoy, A.1
  • 29
    • 84988951426
    • Algorithmic aspects in speech recognition: An introduction
    • Buchsbaum,A.L. and Giancarlo,R. (1997) Algorithmic aspects in speech recognition: An introduction. ACM J. Exp. Alg., 2, 1.
    • (1997) ACM J. Exp. Alg , vol.2 , pp. 1
    • Buchsbaum, A.L.1    Giancarlo, R.2
  • 30
    • 0033906346
    • Engineering the compression of massive tables: An experimental approach
    • Buchsbaum,A.L. et al. (2000) Engineering the compression of massive tables: An experimental approach. In SODA 00: Proceedings of the Symposium on Discrete Algorithms. ACM-SIAM, pp. 175-184.
    • (2000) SODA 00: Proceedings of the Symposium on Discrete Algorithms , pp. 175-184
    • Buchsbaum, A.L.1
  • 31
    • 4243175869
    • Improving table compression with combinatorial optimization
    • Buchsbaum,A.L. et al. (2003) Improving table compression with combinatorial optimization. J. ACM, 50, 825-851.
    • (2003) J. ACM , vol.50 , pp. 825-851
    • Buchsbaum, A.L.1
  • 32
    • 0003573193
    • A block-sorting lossless data compression algorithm
    • Technical Report 124, Digital Equipment Corporation
    • Burrows,M. and Wheeler,D. (1994) A block-sorting lossless data compression algorithm. Technical Report 124, Digital Equipment Corporation.
    • (1994)
    • Burrows, M.1    Wheeler, D.2
  • 33
    • 0033258311
    • Unsupervised knowledge discovery in medical databases using relevance networks
    • Butte,A.J. and Kohane,I.S. (1999) Unsupervised knowledge discovery in medical databases using relevance networks. In Proceedings of the AMIA Symposium. Hanley and Belfus, pp. 711-715.
    • (1999) Proceedings of the AMIA Symposium , pp. 711-715
    • Butte, A.J.1    Kohane, I.S.2
  • 34
    • 0033655775
    • Mutual information relevance networks: Functional genomic clustering using pairwise entropy measurements
    • Butte,A.J. and Kohane,I.S. (2000) Mutual information relevance networks: functional genomic clustering using pairwise entropy measurements. In Proceedings of the Pacific Symposium on Biocomputing (PSB). World Scientific, pp. 415-426.
    • (2000) Proceedings of the Pacific Symposium on Biocomputing (PSB) , pp. 415-426
    • Butte, A.J.1    Kohane, I.S.2
  • 35
    • 0034710924
    • Discovering functional relationships between RNA expression and chemotherapeutic susceptibility using relevance networks
    • Butte,A.J. et al. (2000) Discovering functional relationships between RNA expression and chemotherapeutic susceptibility using relevance networks. In Proc. Natl Acad. Sci. USA, 12182-12186.
    • (2000) Proc. Natl Acad. Sci. USA , pp. 12182-12186
    • Butte, A.J.1
  • 36
    • 34547630480
    • A simple statistical algorithm for biological sequence compression
    • Cao,M.D. et al. (2007) A simple statistical algorithm for biological sequence compression. In Proceedings of the IEEE Data Compression Conference (DCC). IEEE Computer Society, pp. 43-52.
    • (2007) Proceedings of the IEEE Data Compression Conference (DCC) , pp. 43-52
    • Cao, M.D.1
  • 37
    • 1942489258
    • Informational complexity and functional activity of RNA structures
    • Carothers,J. et al. (2004) Informational complexity and functional activity of RNA structures. J. Am. Chem. Soc., 126, 5130-5137.
    • (2004) J. Am. Chem. Soc , vol.126 , pp. 5130-5137
    • Carothers, J.1
  • 39
    • 0036947893
    • DNACompress: Fast and effective DNA sequence compression
    • Chen,X. et al. (2002) DNACompress: Fast and effective DNA sequence compression. Bioinformatics, 18, 1696-1698.
    • (2002) Bioinformatics , vol.18 , pp. 1696-1698
    • Chen, X.1
  • 41
    • 34547666722
    • Biological networks: Comparison, conservation, and evolution via relative description length
    • Chor,B. and Tuller,T. (2007) Biological networks: Comparison, conservation, and evolution via relative description length. J. Comput. Biol., 14, 817-834.
    • (2007) J. Comput. Biol , vol.14 , pp. 817-834
    • Chor, B.1    Tuller, T.2
  • 44
    • 0033563426
    • Zones of low entropy in genomic sequence
    • Crochemore,M. and Vérin,R. (1999) Zones of low entropy in genomic sequence. Comput. Chem., 23, 275-282.
    • (1999) Comput. Chem , vol.23 , pp. 275-282
    • Crochemore, M.1    Vérin, R.2
  • 45
    • 0942266549
    • A sub-quadratic sequence alignment algorithm for unrestricted cost matrices
    • Crochemore,M. et al. (2003) A sub-quadratic sequence alignment algorithm for unrestricted cost matrices. SIAM J. Comput., 32, 1654-1673.
    • (2003) SIAM J. Comput , vol.32 , pp. 1654-1673
    • Crochemore, M.1
  • 46
    • 0034791035
    • High-resolution haplotype structure in the human genome
    • Daly,M.J. et al. (2001) High-resolution haplotype structure in the human genome. Nat. Genet., 29, 229-232.
    • (2001) Nat. Genet , vol.29 , pp. 229-232
    • Daly, M.J.1
  • 47
    • 45149113022
    • Comparative analysis of long DNA sequences by per element information content using different contexts
    • Dix,T.I. et al. (2007) Comparative analysis of long DNA sequences by per element information content using different contexts. BMC Bioinformatics, 8(Suppl. 2), s10.
    • (2007) BMC Bioinformatics , vol.8 , Issue.SUPPL. 2
    • Dix, T.I.1
  • 49
    • 53649106963
    • MicroRNA target detection and analysis for genes related to breast cancer using MDLcompress
    • Evans,S.C. et al. (2007) MicroRNA target detection and analysis for genes related to breast cancer using MDLcompress. EURASIP J. Bioinform. Syst. Biol., 2007, 1-16.
    • (2007) EURASIP J. Bioinform. Syst. Biol , vol.2007 , pp. 1-16
    • Evans, S.C.1
  • 50
    • 84994364597
    • On the entropy of DNA: Algorithms and measurements based on memory and rapid convergence
    • Farach,M. et al. (1995) On the entropy of DNA: Algorithms and measurements based on memory and rapid convergence. In SODA 95: Proceedings of the Symposium on Discrete Algorithms. ACM-SIAM, pp. 48-57.
    • (1995) SODA 95: Proceedings of the Symposium on Discrete Algorithms , pp. 48-57
    • Farach, M.1
  • 51
    • 34547753523
    • Compression-based classification of biological sequences and structures via the Universal Similarity Metric: Experimental assessment
    • Ferragina,P. et al. (2007) Compression-based classification of biological sequences and structures via the Universal Similarity Metric: experimental assessment. BMC Bioinformatics, 8, 252.
    • (2007) BMC Bioinformatics , vol.8 , pp. 252
    • Ferragina, P.1
  • 52
    • 84979917497
    • Compressed text indexes: From theory to practice
    • Ferragina,P. et al. (2008) Compressed text indexes: From theory to practice. ACM J. Exp. Alg., 13.
    • (2008) ACM J. Exp. Alg , vol.13
    • Ferragina, P.1
  • 54
    • 18444369013
    • The structure of haplotype blocks in the human genome
    • Gabriel,S. et al. (2002) The structure of haplotype blocks in the human genome. Science, 296, 2225-2229.
    • (2002) Science , vol.296 , pp. 2225-2229
    • Gabriel, S.1
  • 55
    • 67649180696
    • Set-based complexity and biological information
    • Galas,D.J. et al. (2008) Set-based complexity and biological information. CoRR, abs/0801.4024.
    • (2008) CoRR
    • Galas, D.J.1
  • 57
    • 0242405250
    • Dynamic programming: Special cases
    • Giancarlo,R. (1997). Dynamic programming: Special cases. In Apostolico,A. and Galil,Z. (eds), Pattern Matching Algorithms. Oxford University Press, pp. 201-236.
    • (1997) Pattern Matching Algorithms , pp. 201-236
    • Giancarlo, R.1
  • 58
    • 67649186179
    • Alignment-free comparison of TOPS strings
    • Gilbert,D. et al. (2007) Alignment-free comparison of TOPS strings. In Proceedings of London Algorithmics and Stringology. College Publications, pp. 177-197.
    • (2007) Proceedings of London Algorithmics and Stringology , pp. 177-197
    • Gilbert, D.1
  • 62
    • 0000100455
    • A new challenge for compression algorithms: Genetic sequences
    • Grumbach,S. and Tahi,F. (1994) A new challenge for compression algorithms: Genetic sequences. Inform. Process. Manage., 30, 875-886.
    • (1994) Inform. Process. Manage , vol.30 , pp. 875-886
    • Grumbach, S.1    Tahi, F.2
  • 65
    • 0026466830
    • Identifying constraints on the higher-order structure of RNA: Continued development and application of comparative sequence analysis methods
    • Gutell,R.R. et al. (1992) Identifying constraints on the higher-order structure of RNA: Continued development and application of comparative sequence analysis methods. Nucleic Acids Res., 20, 5785-5795.
    • (1992) Nucleic Acids Res , vol.20 , pp. 5785-5795
    • Gutell, R.R.1
  • 66
    • 34347390455
    • Comparing segmentations by applying randomization techniques
    • Haiminen,N. et al. (2007) Comparing segmentations by applying randomization techniques. BMC Bioinformatics, 8, 171.
    • (2007) BMC Bioinformatics , vol.8 , pp. 171
    • Haiminen, N.1
  • 67
    • 25144456056
    • Computational cluster validation in post-genomic data analysis
    • Handl,J. et al. (2005) Computational cluster validation in post-genomic data analysis. Bioinformatics, 21, 3201-3212.
    • (2005) Bioinformatics , vol.21 , pp. 3201-3212
    • Handl, J.1
  • 68
    • 22844441552
    • Reverse engineering gene regulatory networks
    • Hartemink,A. (2005) Reverse engineering gene regulatory networks. Nat. Biotechnol., 23, 554-556.
    • (2005) Nat. Biotechnol , vol.23 , pp. 554-556
    • Hartemink, A.1
  • 70
    • 0142028977
    • Annotating large genomes with exact word matches
    • Healy,J. et al. (2003) Annotating large genomes with exact word matches. Genome Res., 13, 2306-2315.
    • (2003) Genome Res , vol.13 , pp. 2306-2315
    • Healy, J.1
  • 71
    • 38049051093
    • Recurrent predictive models for sequence segmentation
    • Hyvonen,S. et al. (2007) Recurrent predictive models for sequence segmentation. In Advances in Intelligent Data Analysis VII (IDA 2007). Vol. 4723 of LNCS. Springer, Berlin, pp. 195-206.
    • (2007) LNCS , vol.4723 , pp. 195-206
    • Hyvonen, S.1
  • 72
    • 0030670589
    • Efficient discovery of conserved patterns using a pattern graph
    • Jonassen,I. (1997) Efficient discovery of conserved patterns using a pattern graph. Comput. Appl. Biosci., 13, 509-522.
    • (1997) Comput. Appl. Biosci , vol.13 , pp. 509-522
    • Jonassen, I.1
  • 74
    • 32544454688
    • Application of compression-based distance measures to protein sequence classification: A methodological study
    • Kocsor,A. et al. (2005) Application of compression-based distance measures to protein sequence classification: A methodological study. Bioinformatics, 22, 407-412.
    • (2005) Bioinformatics , vol.22 , pp. 407-412
    • Kocsor, A.1
  • 75
    • 0041989761
    • An MDL method for finding haplotype blocks and for estimating the strength of haplotype block boundaries
    • Koivisto,M. (2003) An MDL method for finding haplotype blocks and for estimating the strength of haplotype block boundaries. In Proceedings of the Pacific Symposium on Biocomputing (PSB). World Scientific, pp. 502-513.
    • (2003) Proceedings of the Pacific Symposium on Biocomputing (PSB) , pp. 502-513
    • Koivisto, M.1
  • 76
    • 37549034468
    • Information theories in molecular biology and genomics
    • Konopka,A.K. (2005) Information theories in molecular biology and genomics. Nat. Encyclopedia Hum. Genome, 3, 464-469.
    • (2005) Nat. Encyclopedia Hum. Genome , vol.3 , pp. 464-469
    • Konopka, A.K.1
  • 77
    • 13844281512
    • An efficient normalized maximum likelihood algorithm for DNA sequence compression
    • Korodi,G. and Tabus,I. (2005) An efficient normalized maximum likelihood algorithm for DNA sequence compression. ACM Trans. Inform. Syst., 23, 3-34.
    • (2005) ACM Trans. Inform. Syst , vol.23 , pp. 3-34
    • Korodi, G.1    Tabus, I.2
  • 79
    • 2442662802
    • Measuring the similarity of protein structures by means of the Universal Similarity Metric
    • Krasnogor,N. and Pelta,D.A. (2004) Measuring the similarity of protein structures by means of the Universal Similarity Metric. Bioinformatics, 20, 1015-1021.
    • (2004) Bioinformatics , vol.20 , pp. 1015-1021
    • Krasnogor, N.1    Pelta, D.A.2
  • 80
    • 1542268964
    • Study of DNA binding sites using the Rényi parametric entropy measure
    • Krishnamachari,A. et al. (2004) Study of DNA binding sites using the Rényi parametric entropy measure. J. Theor. Biol., 227, 429-436.
    • (2004) J. Theor. Biol , vol.227 , pp. 429-436
    • Krishnamachari, A.1
  • 83
    • 0016880887
    • On the complexity of finite sequences
    • Lempel,A. and Ziv,J. (1976) On the complexity of finite sequences. IEEE Trans. Inform. Theory, 22, 75-81.
    • (1976) IEEE Trans. Inform. Theory , vol.22 , pp. 75-81
    • Lempel, A.1    Ziv, J.2
  • 85
    • 0035102453
    • An information-based sequence distance and its application to whole mitochondrial genome phylogeny
    • Li,M. et al. (2001) An information-based sequence distance and its application to whole mitochondrial genome phylogeny. Bioinformatics, 17, 149-154.
    • (2001) Bioinformatics , vol.17 , pp. 149-154
    • Li, M.1
  • 86
    • 10644294829
    • The similarity metric
    • Li,M. et al. (2004) The similarity metric. IEEE Trans. Inform. Theory, 50, 3250-3264.
    • (2004) IEEE Trans. Inform. Theory , vol.50 , pp. 3250-3264
    • Li, M.1
  • 87
    • 67349186481
    • Speeding up HMM decoding and training by exploiting sequence repetitions
    • Lifshits,Y. et al. (2008) Speeding up HMM decoding and training by exploiting sequence repetitions. Algorithmica [doi 10.1007/s00453-007-9128-0].
  • 88
    • 0030596404
    • High statistics block entropy measures of DNA sequences
    • Lió,P. et al. (1996) High statistics block entropy measures of DNA sequences. J. Theor. Biol., 180, 151-160.
    • (1996) J. Theor. Biol , vol.180 , pp. 151-160
    • Lió, P.1
  • 89
    • 18844405663
    • Space-efficient whole genome comparisons with Burrows-Wheeler Transforms
    • Lippert,R.A. (2005) Space-efficient whole genome comparisons with Burrows-Wheeler Transforms. J. Comput. Biol., 12, 407-415.
    • (2005) J. Comput. Biol , vol.12 , pp. 407-415
    • Lippert, R.A.1
  • 90
    • 25644453578
    • A space-efficient construction of the Burrows-Wheeler transform for genomic data
    • Lippert,R.A. et al. (2005) A space-efficient construction of the Burrows-Wheeler transform for genomic data. J. Comput. Biol., 12, 943-951.
    • (2005) J. Comput. Biol , vol.12 , pp. 943-951
    • Lippert, R.A.1
  • 91
    • 39149105621
    • Comparison of TOPS strings based on LZ complexity
    • Liu,L. and Wang,T. (2008) Comparison of TOPS strings based on LZ complexity. J. Theor. Biol., 251, 159-166.
    • (2008) J. Theor. Biol , vol.251 , pp. 159-166
    • Liu, L.1    Wang, T.2
  • 92
    • 42549148061
    • RNACompress: Grammar-based compression and informational complexity measurement of RNA secondary structure
    • Liu,Q. et al. (2008) RNACompress: Grammar-based compression and informational complexity measurement of RNA secondary structure. BMC Bioinformatics, 9, 176+.
    • (2008) BMC Bioinformatics , vol.9
    • Liu, Q.1
  • 93
    • 0032919622
    • Significantly lower entropy estimates for natural DNA sequences
    • Loewenstern,D. and Yianilos,P.N. (1999) Significantly lower entropy estimates for natural DNA sequences. J. Comput. Biol., 6, 125-142.
    • (1999) J. Comput. Biol , vol.6 , pp. 125-142
    • Loewenstern, D.1    Yianilos, P.N.2
  • 94
    • 0012490449
    • DNA sequence classification using compression-based induction
    • Technical report, DIMACS
    • Loewenstern,D. et al. (1995) DNA sequence classification using compression-based induction. Technical report, DIMACS.
    • (1995)
    • Loewenstern, D.1
  • 95
    • 24344458137
    • Feature selection based on mutual information: Criteria of max-dependency, max-relevance, and min-redundancy
    • Long,F. and Ding,C. (2005) Feature selection based on mutual information: Criteria of max-dependency, max-relevance, and min-redundancy. IEEE Trans. Pattern Anal. Mach. Intell., 27, 1226-1238.
    • (2005) IEEE Trans. Pattern Anal. Mach. Intell , vol.27 , pp. 1226-1238
    • Long, F.1    Ding, C.2
  • 97
    • 52449094233
    • Short tandem repeats in human exons: A target for disease mutations
    • Madsen,B.E. et al. (2008) Short tandem repeats in human exons: A target for disease mutations. BMC Genomics, 9, 410+.
    • (2008) BMC Genomics , vol.9
    • Madsen, B.E.1
  • 98
    • 8344261403
    • A simple and fast DNA compressor
    • Manzini,G. and Rastero,M. (2005) A simple and fast DNA compressor. Softw. Pract. Exper., 35, 1397-1411.
    • (2005) Softw. Pract. Exper , vol.35 , pp. 1397-1411
    • Manzini, G.1    Rastero, M.2
  • 100
    • 33947305781
    • ARACNE: An algorithm for the reconstruction of gene regulatory networks in a mammalian cellular context
    • Margolin,A.A. et al. (2006a). ARACNE: An algorithm for the reconstruction of gene regulatory networks in a mammalian cellular context. BMC Bioinformatics, 7(Suppl. 1), s7.
    • (2006) BMC Bioinformatics , vol.7 , Issue.SUPPL. 1
    • Margolin, A.A.1
  • 101
    • 33750353888
    • Reverse engineering cellular networks
    • Margolin,A.A. et al. (2006b) Reverse engineering cellular networks. Nat. Protocols, 1, 663-672.
    • (2006) Nat. Protocols , vol.1 , pp. 663-672
    • Margolin, A.A.1
  • 102
    • 0034578442
    • Biological sequence compression algorithms
    • Matsumoto,T. et al. (2000) Biological sequence compression algorithms. Genome Inform., 11, 43-52.
    • (2000) Genome Inform , vol.11 , pp. 43-52
    • Matsumoto, T.1
  • 103
    • 19044363196
    • Sublinear growth of information in DNA sequences
    • Menconi,G. (2004) Sublinear growth of information in DNA sequences. Bull. Math. Biol., 67, 737-759.
    • (2004) Bull. Math. Biol , vol.67 , pp. 737-759
    • Menconi, G.1
  • 104
    • 39049189269
    • A compression-based approach for coding sequences identifications in Prokaryotic Genomes
    • Menconi,G. and Marangoni,R. (2006). A compression-based approach for coding sequences identifications in Prokaryotic Genomes. J. Comput. Biol., 13, 1477-1488.
    • (2006) J. Comput. Biol , vol.13 , pp. 1477-1488
    • Menconi, G.1    Marangoni, R.2
  • 105
    • 36248999573
    • Information-Theoretic inference of large transcriptional regulatory networks
    • Meyer,P.E. et al. (2007) Information-Theoretic inference of large transcriptional regulatory networks. EURASIP J. Bioinform. Syst. Biol., 2007, 8.
    • (2007) EURASIP J. Bioinform. Syst. Biol , vol.2007 , pp. 8
    • Meyer, P.E.1
  • 106
    • 0011908356
    • Discovering dependencies via algorithmic mutual information: A case study in DNA sequence comparisons
    • Milosavljevic,A. (1995) Discovering dependencies via algorithmic mutual information: A case study in DNA sequence comparisons. Mach. Learn., 21, 35-50.
    • (1995) Mach. Learn , vol.21 , pp. 35-50
    • Milosavljevic, A.1
  • 107
    • 0027194328
    • Discovering simple DNA sequences by the algorithmic significance method
    • Milosavljevic,A. and Jurka,J. (1993) Discovering simple DNA sequences by the algorithmic significance method. Comput. Appl. Biosci., 9, 407-411.
    • (1993) Comput. Appl. Biosci , vol.9 , pp. 407-411
    • Milosavljevic, A.1    Jurka, J.2
  • 109
    • 67649138818
    • Nature-Review (2008) Nature Reviews collection on microRNAs. Nat. Rev. [Epub ahead of print, doi:10.1038/nrg2202].
  • 110
  • 111
    • 0000523223
    • Compression and explanation using hierarchical grammars
    • Nevill-Manning,C.G. and Witten,I.H. (1997) Compression and explanation using hierarchical grammars. Comput. J., 40, 103-116.
    • (1997) Comput. J , vol.40 , pp. 103-116
    • Nevill-Manning, C.G.1    Witten, I.H.2
  • 115
    • 0037248694
    • A divide-and-conquer approach to fragment assembly
    • Otu,H.H. and Sayood,K. (2003a) A divide-and-conquer approach to fragment assembly. Bioinformatics, 19, 22-29.
    • (2003) Bioinformatics , vol.19 , pp. 22-29
    • Otu, H.H.1    Sayood, K.2
  • 116
    • 0242643741
    • A new sequence distance measure for phylogenetic tree construction
    • Otu,H.H. and Sayood,K. (2003b) A new sequence distance measure for phylogenetic tree construction. Bioinformatics, 19, 2122-2130.
    • (2003) Bioinformatics , vol.19 , pp. 2122-2130
    • Otu, H.H.1    Sayood, K.2
  • 118
    • 0035941029
    • Blocks of limited haplotype diversity revealed by high-resolution scanning of human chromosome 21
    • Patil,N. et al. (2001) Blocks of limited haplotype diversity revealed by high-resolution scanning of human chromosome 21. Science, 294, 1719-1723.
    • (2001) Science , vol.294 , pp. 1719-1723
    • Patil, N.1
  • 119
    • 34547811565
    • Protein structure comparison through fuzzy contact maps and the universal similarity metric
    • Pelta,D.A. et al. (2005) Protein structure comparison through fuzzy contact maps and the universal similarity metric. In Proceedings of the Joint 4th EUSFLAT 11th LFA Conference (EUSFLAT-LFA 05). Universitat Politécnica de Catalunya, pp. 1124-1129.
    • (2005) Proceedings of the Joint 4th EUSFLAT 11th LFA Conference (EUSFLAT-LFA 05) , pp. 1124-1129
    • Pelta, D.A.1
  • 122
    • 67649143444
    • Statistics on words with applications to biological sequences
    • Reinert,G. et al. (2005) Statistics on words with applications to biological sequences. In Lothaire,M. (ed), Applied Combinatorics on Words. Vol. 105 of Encyclopedia of Mathematics and its Applications, Cambridge University Press, pp. 252-323.
  • 124
    • 2242469493
    • Coding and compression: A happy union of theory and practice
    • Rissanen,J. and Yu,B. (2000) Coding and compression: A happy union of theory and practice. Am. Stat. Assoc., 95, 986-989.
    • (2000) Am. Stat. Assoc , vol.95 , pp. 986-989
    • Rissanen, J.1    Yu, B.2
  • 125
    • 67649166049
    • Editorial: Information theoretic methods in bioinformatics
    • Rissanen,J. et al. (2007) Editorial: Information theoretic methods in bioinformatics. EURASIP J. Bioinform. Syst. Biol., 7, 1-4.
    • (2007) EURASIP J. Bioinform. Syst. Biol , vol.7 , pp. 1-4
    • Rissanen, J.1
  • 126
    • 0029852415
    • Compression and genetic sequences analysis
    • Rivals,É. et al. (1996a) Compression and genetic sequences analysis. Biochimie, 78, 315-322.
    • (1996) Biochimie , vol.78 , pp. 315-322
    • Rivals, E.1
  • 127
    • 26444510631
    • A guaranteed compression scheme for repetitive DNA sequences
    • Rivals,É. et al. (1996b). A guaranteed compression scheme for repetitive DNA sequences. In Proceedings of the IEEE Data Compression Conference (DCC). IEEE Computer Society, p. 453.
    • (1996) Proceedings of the IEEE Data Compression Conference (DCC) , pp. 453
    • Rivals, E.1
  • 128
    • 0030935318
    • Detection of significant patterns by compression algorithms: The case of approximate tandem repeats in DNA sequences
    • Rivals,É. et al. (1997a) Detection of significant patterns by compression algorithms: The case of approximate tandem repeats in DNA sequences. Comput. Appl. Biosci., 13, 131-136.
    • (1997) Comput. Appl. Biosci , vol.13 , pp. 131-136
    • Rivals, E.1
  • 129
    • 0007475289
    • Fast discerning repeats in DNA sequences with a compression algorithm
    • Rivals,É. et al. (1997b) Fast discerning repeats in DNA sequences with a compression algorithm. In Proceedings of Genome Informatics Workshop. Universal Academy Press, Tokyo, pp. 215-226.
    • (1997) Proceedings of Genome Informatics Workshop , pp. 215-226
    • Rivals, E.1
  • 130
    • 62449177287
    • Compression ratios based on the Universal Similarity Metric still yield protein distances far from CATH distances
    • Rocha,J. et al. (2006) Compression ratios based on the Universal Similarity Metric still yield protein distances far from CATH distances. CoRR, abs/q-bio/0603007.
    • (2006) CoRR
    • Rocha, J.1
  • 131
    • 0030282113
    • The power of amnesia: Learning probabilistic automata with variable memory length
    • Ron,D. and Singer,Y. (1996) The power of amnesia: Learning probabilistic automata with variable memory length. In Machine Learning. Springer, Netherlands, pp. 117-149.
    • (1996) Machine Learning , pp. 117-149
    • Ron, D.1    Singer, Y.2
  • 132
    • 0035747893
    • Indexing huge genome sequences for solving various problems
    • Sadakane,K. and Shibuya,T. (2001) Indexing huge genome sequences for solving various problems. Genome Inform., 12, 175-183.
    • (2001) Genome Inform , vol.12 , pp. 175-183
    • Sadakane, K.1    Shibuya, T.2
  • 133
    • 0031558556
    • Estimating the entropy of DNA sequences
    • Schmidt,A.O. and Herzel,H. (1997) Estimating the entropy of DNA sequences. J. Theor. Biol., 188, 369-377.
    • (1997) J. Theor. Biol , vol.188 , pp. 369-377
    • Schmidt, A.O.1    Herzel, H.2
  • 134
    • 0023042012
    • Information content of binding sites on nucleotide sequences
    • Schneider,T.D. et al. (1986) Information content of binding sites on nucleotide sequences. J. Mol. Biol., 188, 415-431.
    • (1986) J. Mol. Biol , vol.188 , pp. 415-431
    • Schneider, T.D.1
  • 136
    • 65749098156
    • Compression and machine learning: A new perspective on feature space vectors
    • Sculley,D. and Brodley,C. (2006) Compression and machine learning: A new perspective on feature space vectors. In Proceedings of the IEEE Data Compression Conference (DCC). IEEE Computer Society, pp. 332-332.
    • (2006) Proceedings of the IEEE Data Compression Conference (DCC) , pp. 332-332
    • Sculley, D.1    Brodley, C.2
  • 137
    • 33645732240
    • Modeling cellular machinery through biological network comparison
    • Sharan,R. and Ideker,T. (2006) Modeling cellular machinery through biological network comparison. Nat. Biotechnol., 24, 427-433.
    • (2006) Nat. Biotechnol , vol.24 , pp. 427-433
    • Sharan, R.1    Ideker, T.2
  • 139
    • 0019887799
    • Identification of common molecular subsequences
    • Smith,T. and Waterman,M. (1981) Identification of common molecular subsequences. J. Mol. Biol., 147, 195-197.
    • (1981) J. Mol. Biol , vol.147 , pp. 195-197
    • Smith, T.1    Waterman, M.2
  • 140
    • 0035665965
    • Discovering patterns in Plasmodium falciparum genomic DNA
    • Stern,L. et al. (2001) Discovering patterns in Plasmodium falciparum genomic DNA. Mol. Biochem. Parasitol., 118, 175-186.
    • (2001) Mol. Biochem. Parasitol , vol.118 , pp. 175-186
    • Stern, L.1
  • 141
    • 0020190931
    • Data compression via textual substitution
    • Storer,J.A. and Szymanski,T.G. (1982) Data compression via textual substitution. J. ACM, 29, 928-951.
    • (1982) J. ACM , vol.29 , pp. 928-951
    • Storer, J.A.1    Szymanski, T.G.2
  • 143
    • 34547630306
    • DNA sequence compression using the normalized maximum likelihood model for discrete regression
    • Tabus,I. et al. (2003) DNA sequence compression using the normalized maximum likelihood model for discrete regression. In Proceedings of the IEEE Data Compression Conference (DCC). IEEE Computer Society, pp. 253-262.
    • (2003) Proceedings of the IEEE Data Compression Conference (DCC) , pp. 253-262
    • Tabus, I.1
  • 144
    • 33646005790
    • The average common substring approach to phylogenomic reconstruction
    • Ulitsky,I. et al. (2006) The average common substring approach to phylogenomic reconstruction. J. Comput. Biol., 13, 336-350.
    • (2006) J. Comput. Biol , vol.13 , pp. 336-350
    • Ulitsky, I.1
  • 145
    • 34047188666
    • Compressed suffix tree - a basis for genome-scale sequence analysis
    • Välimäki,N. et al. (2007) Compressed suffix tree - a basis for genome-scale sequence analysis. Bioinformatics, 23, 629-630.
    • (2007) Bioinformatics , vol.23 , pp. 629-630
    • Välimäki, N.1
  • 146
    • 0032891717
    • Transformation distances: A family of dissimilarity measures based on movements of segments
    • Varré,J.-S. et al. (1999) Transformation distances: A family of dissimilarity measures based on movements of segments. Bioinformatics, 15, 194-202.
    • (1999) Bioinformatics , vol.15 , pp. 194-202
    • Varré, J.-S.1
  • 147
    • 0037342499
    • Alignment-free sequence comparison: A review
    • Vinga,S. and Almeida,J.S. (2003) Alignment-free sequence comparison: A review. Bioinformatics, 19, 513-523.
    • (2003) Bioinformatics , vol.19 , pp. 513-523
    • Vinga, S.1    Almeida, J.S.2
  • 148
    • 6344221592
    • Rényi continuous entropy of DNA sequences
    • Vinga,S. and Almeida,J.S. (2004) Rényi continuous entropy of DNA sequences. J. Theor. Biol., 231, 377-388.
    • (2004) J. Theor. Biol , vol.231 , pp. 377-388
    • Vinga, S.1    Almeida, J.S.2
  • 149
    • 38949102609
    • Local Rényi entropic profiles of DNA sequences
    • Vinga,S. and Almeida,J.S. (2007) Local Rényi entropic profiles of DNA sequences. BMC Bioinform., 8, 393.
    • (2007) BMC Bioinform , vol.8 , pp. 393
    • Vinga, S.1    Almeida, J.S.2
  • 150
    • 84935113569
    • Viterbi,A.J. (1967) Error bounds for convolutional codes and an asymptotically optimum decoding algorithm. IEEE Trans. Inform. Theory, 13, 260-269.
  • 152
    • 35448929046
    • Compressing table data with column dependency
    • Vo,B.D. and Vo,K.-P. (2007) Compressing table data with column dependency. Theor. Comput. Sci., 387, 273-283.
    • (2007) Theor. Comput. Sci , vol.387 , pp. 273-283
    • Vo, B.D.1    Vo, K.-P.2
  • 154
    • 0027941109
    • Discovering active motifs in sets of related proteins and using them for classification
    • Wang,J.T.L. et al. (1994) Discovering active motifs in sets of related proteins and using them for classification. Nucl. Acids Res., 22, 2769-2775.
    • (1994) Nucl. Acids Res , vol.22 , pp. 2769-2775
    • Wang, J.T.L.1
  • 155
    • 67649186176
    • Distribution of recombination crossovers and the origin of haplotype blocks: The interplay of population history, recombination, and mutation
    • Wang,N. et al. (2002) Distribution of recombination crossovers and the origin of haplotype blocks: The interplay of population history, recombination, and mutation. Am. J. Hum. Genet., 29, 229-232.
    • (2002) Am. J. Hum. Genet , vol.29 , pp. 229-232
    • Wang, N.1
  • 157
    • 0032554318
    • Correlations in protein sequences and property codes
    • Weiss,O. and Herzel,H. (1998) Correlations in protein sequences and property codes. J. Theor. Biol., 190, 341-353.
    • (1998) J. Theor. Biol , vol.190 , pp. 341-353
    • Weiss, O.1    Herzel, H.2
  • 158
    • 0034619248
    • Information content of protein sequences
    • Weiss,O. et al. (2000) Information content of protein sequences. J. Theor. Biol., 206, 379-386.
    • (2000) J. Theor. Biol , vol.206 , pp. 379-386
    • Weiss, O.1
  • 160
    • 0037188541
    • A dynamic programming algorithm for haplotype block partitioning
    • Zhang,K. et al. (2002) A dynamic programming algorithm for haplotype block partitioning. In Proc. Natl Acad. Sci. USA, 7335-7339.
    • (2002) Proc. Natl Acad. Sci. USA , pp. 7335-7339
    • Zhang, K.1
  • 161
    • 39549096223
    • Zhang,S. et al. (2008) Biomolecular network querying: A promising approach in systems biology. BMC Syst. Biol., 2, 5.
  • 162
    • 33749131719
    • Feature selection for microarray data analysis using mutual information and rough set theory
    • Zhou,W. et al. (2007) Feature selection for microarray data analysis using mutual information and rough set theory. In IFIP International Federation for Information Processing, Vol. 204, Springer, Boston, pp. 916-927.
    • (2007) IFIP International Federation for Information Processing , vol.204 , pp. 916-927
    • Zhou, W.1
  • 163
    • 1842450622
    • Gene clustering based on clusterwide mutual information
    • Zhou,X. et al. (2004) Gene clustering based on clusterwide mutual information. J. Comput. Biol., 11, 147-161.
    • (2004) J. Comput. Biol , vol.11 , pp. 147-161
    • Zhou, X.1
  • 164
    • 0023979656
    • On classification with empirically observed statistics and universal data compression
    • Ziv,J. (1988) On classification with empirically observed statistics and universal data compression. IEEE Trans. Inform. Theory, 34, 278-286.
    • (1988) IEEE Trans. Inform. Theory , vol.34 , pp. 278-286
    • Ziv, J.1
  • 165
    • 41949122106
    • On finite memory universal data compression and classification of individual sequences
    • Ziv,J. (2008) On finite memory universal data compression and classification of individual sequences. IEEE Trans. Inform. Theory, 54, 1626-1636.
    • (2008) IEEE Trans. Inform. Theory , vol.54 , pp. 1626-1636
    • Ziv, J.1
  • 166
    • 0017493286
    • A universal algorithm for sequential data compression
    • Ziv,J. and Lempel,A. (1977) A universal algorithm for sequential data compression. IEEE Trans. Inform. Theory, 23, 337-343.
    • (1977) IEEE Trans. Inform. Theory , vol.23 , pp. 337-343
    • Ziv, J.1    Lempel, A.2
  • 167
    • 0018019231
    • Compression of individual sequences via variable-rate coding
    • Ziv,J. and Lempel,A. (1978) Compression of individual sequences via variable-rate coding. IEEE Trans. Inform. Theory, 24, 530-536.
    • (1978) IEEE Trans. Inform. Theory , vol.24 , pp. 530-536
    • Ziv, J.1    Lempel, A.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.