-
1
-
-
3042683172
-
Information theory in molecular biology
-
Adami,C. (2004) Information theory in molecular biology. Phys. Life Rev., 1, 3-22.
-
(2004)
Phys. Life Rev
, vol.1
, pp. 3-22
-
-
Adami, C.1
-
4
-
-
38749149558
-
Identifying statistical dependence in genomic sequences via mutual information estimates
-
Aktulga,H.M. et al. (2007) Identifying statistical dependence in genomic sequences via mutual information estimates. EURASIP J. Bioinform. Syst. Biol., 2007, 1-11.
-
(2007)
EURASIP J. Bioinform. Syst. Biol
, vol.2007
, pp. 1-11
-
-
Aktulga, H.M.1
-
5
-
-
0025116731
-
Minimum message length encoding and the comparison of macromolecules
-
Allison,L. and Yee,C.N. (1990) Minimum message length encoding and the comparison of macromolecules. Bull. Math. Biol., 52, 431-453.
-
(1990)
Bull. Math. Biol
, vol.52
, pp. 431-453
-
-
Allison, L.1
Yee, C.N.2
-
6
-
-
0033630766
-
Sequence complexity for biological sequence analysis
-
Allison,L. et al. (1992) Sequence complexity for biological sequence analysis. Comput. Chem., 24, 43-55.
-
(1992)
Comput. Chem
, vol.24
, pp. 43-55
-
-
Allison, L.1
-
8
-
-
0025183708
-
Basic local alignment search tool
-
Altschul,S.F. et al. (1990) Basic local alignment search tool. J. Mol. Biol., 215, 403-410.
-
(1990)
J. Mol. Biol
, vol.215
, pp. 403-410
-
-
Altschul, S.F.1
-
9
-
-
0041664867
-
Finding haplotype block boundaries by using the minimum-description-length principle
-
Anderson,E.C. and Novembre,J. (2003) Finding haplotype block boundaries by using the minimum-description-length principle. Am. J. Hum. Genet., 73, 336-354.
-
(2003)
Am. J. Hum. Genet
, vol.73
, pp. 336-354
-
-
Anderson, E.C.1
Novembre, J.2
-
13
-
-
34248385154
-
Mining, compressing and classifying with extensible motifs
-
Apostolico,A. et al. (2006) Mining, compressing and classifying with extensible motifs. Alg. Mol. Biol., 1, 4.
-
(2006)
Alg. Mol. Biol
, vol.1
, pp. 4
-
-
Apostolico, A.1
-
15
-
-
67649141138
-
-
A DNA sequence compression algorithm based on LUT and LZ77, abs/cs/0504100
-
Bao,S. et al. (2005) A DNA sequence compression algorithm based on LUT and LZ77. CoRR, abs/cs/0504100.
-
(2005)
CoRR
-
-
Bao, S.1
-
16
-
-
0032183995
-
The minimum description length principle in coding and modeling
-
Barron,A.R. et al. (1998) The minimum description length principle in coding and modeling. IEEE Trans. Inform. Theory, 44, 2743-2760.
-
(1998)
IEEE Trans. Inform. Theory
, vol.44
, pp. 2743-2760
-
-
Barron, A.R.1
-
17
-
-
16844376909
-
Reverse engineering of regulatory networks in human B cells
-
Basso,K. et al. (2003) Reverse engineering of regulatory networks in human B cells. Nat. Genet., 37, 382-390.
-
(2003)
Nat. Genet
, vol.37
, pp. 382-390
-
-
Basso, K.1
-
18
-
-
26444479436
-
-
Behzadi,B. and Fessant,F.L. (2005) DNA compression challenge revisited: a dynamic programming approach. In CPM, Springer, pp. 190-200.
-
-
-
-
19
-
-
0035109647
-
Variations on probabilistic suffix trees: Statistical modeling and prediction of protein families
-
Bejerano,G. and Yona,G. (2001) Variations on probabilistic suffix trees: statistical modeling and prediction of protein families. Bioinformatics, 17, 23-43.
-
(2001)
Bioinformatics
, vol.17
, pp. 23-43
-
-
Bejerano, G.1
Yona, G.2
-
20
-
-
4944246972
-
Dynamical systems and computable information
-
Benci,V. et al. (2004) Dynamical systems and computable information. Discrete Contin. Dyn. Syst. B, 4, 935-960.
-
(2004)
Discrete Contin. Dyn. Syst. B
, vol.4
, pp. 935-960
-
-
Benci, V.1
-
21
-
-
53649092768
-
Compressing proteomes: The relevance of medium range correlations
-
Benedetto,D. et al. (2007) Compressing proteomes: The relevance of medium range correlations. EURASIP J. Bioinform. Syst. Biol., 2007, 1-8.
-
(2007)
EURASIP J. Bioinform. Syst. Biol
, vol.2007
, pp. 1-8
-
-
Benedetto, D.1
-
22
-
-
4243764255
-
Compositional segmentation and long-range fractal correlations in DNA sequences
-
Bernaola-Galván,P. et al. (1996) Compositional segmentation and long-range fractal correlations in DNA sequences. Phys. Rev. E, 53, 5181-5189.
-
(1996)
Phys. Rev. E
, vol.53
, pp. 5181-5189
-
-
Bernaola-Galván, P.1
-
23
-
-
0000460109
-
Decomposition of DNA sequence complexity
-
Bernaola-Galván,P. et al. (1999) Decomposition of DNA sequence complexity. Phys. Rev. Lett., 83, 3336-3339.
-
(1999)
Phys. Rev. Lett
, vol.83
, pp. 3336-3339
-
-
Bernaola-Galván, P.1
-
24
-
-
0034238085
-
Finding borders between coding and noncoding DNA regions by an entropic segmentation method
-
Bernaola-Galván,P. et al. (2000) Finding borders between coding and noncoding DNA regions by an entropic segmentation method. Phys. Rev. Lett., 85, 1342-1345.
-
(2000)
Phys. Rev. Lett
, vol.85
, pp. 1342-1345
-
-
Bernaola-Galván, P.1
-
25
-
-
0023472826
-
CpG-rich islands as gene markers in the vertebrate nucleus
-
Bird,A.P. (1987) CpG-rich islands as gene markers in the vertebrate nucleus. Trends Genet., 3, 342-347.
-
(1987)
Trends Genet
, vol.3
, pp. 342-347
-
-
Bird, A.P.1
-
27
-
-
2642530436
-
DNA sequence analysis linguistic tools: Contrast vocabularies, compositional spectra and linguistic complexity
-
Bolshoy,A. (2003) DNA sequence analysis linguistic tools: Contrast vocabularies, compositional spectra and linguistic complexity. Appl. Bioinform., 2, 103-112.
-
(2003)
Appl. Bioinform
, vol.2
, pp. 103-112
-
-
Bolshoy, A.1
-
29
-
-
84988951426
-
Algorithmic aspects in speech recognition: An introduction
-
Buchsbaum,A.L. and Giancarlo,R. (1997) Algorithmic aspects in speech recognition: An introduction. ACM J. Exp. Alg., 2, 1.
-
(1997)
ACM J. Exp. Alg
, vol.2
, pp. 1
-
-
Buchsbaum, A.L.1
Giancarlo, R.2
-
30
-
-
0033906346
-
Engineering the compression of massive tables: An experimental approach
-
Buchsbaum,A.L. et al. (2000) Engineering the compression of massive tables: An experimental approach. In SODA 00: Proceedings of the Symposium on Discrete Algorithms. ACM-SIAM, pp. 175-184.
-
(2000)
SODA 00: Proceedings of the Symposium on Discrete Algorithms
, pp. 175-184
-
-
Buchsbaum, A.L.1
-
31
-
-
4243175869
-
Improving table compression with combinatorial optimization
-
Buchsbaum,A.L. et al. (2003) Improving table compression with combinatorial optimization. J. ACM, 50, 825-851.
-
(2003)
J. ACM
, vol.50
, pp. 825-851
-
-
Buchsbaum, A.L.1
-
32
-
-
0003573193
-
A block-sorting lossless data compression algorithm
-
Technical Report 124, Digital Equipment Corporation
-
Burrows,M. and Wheeler,D. (1994) A block-sorting lossless data compression algorithm. Technical Report 124, Digital Equipment Corporation.
-
(1994)
-
-
Burrows, M.1
Wheeler, D.2
-
33
-
-
0033258311
-
Unsupervised knowledge discovery in medical databases using relevance networks
-
Butte,A.J. and Kohane,I.S. (1999) Unsupervised knowledge discovery in medical databases using relevance networks. In Proceedings of the AMIA Symposium. Hanley and Belfus, pp. 711-715.
-
(1999)
Proceedings of the AMIA Symposium
, pp. 711-715
-
-
Butte, A.J.1
Kohane, I.S.2
-
34
-
-
0033655775
-
Mutual information relevance networks: Functional genomic clustering using pairwise entropy measurements
-
Butte,A.J. and Kohane,I.S. (2000) Mutual information relevance networks: functional genomic clustering using pairwise entropy measurements. In Proceedings of the Pacific Symposium on Biocomputing (PSB). World Scientific, pp. 415-426.
-
(2000)
Proceedings of the Pacific Symposium on Biocomputing (PSB)
, pp. 415-426
-
-
Butte, A.J.1
Kohane, I.S.2
-
35
-
-
0034710924
-
Discovering functional relationships between RNA expression and Chemotherapeutic susceptibility using relevance networks
-
Butte,A.J. et al. (2000) Discovering functional relationships between RNA expression and Chemotherapeutic susceptibility using relevance networks. In Proc. Natl Acad. Sci. USA, 12182-12186.
-
(2000)
Proc. Natl Acad. Sci. USA
, pp. 12182-12186
-
-
Butte, A.J.1
-
36
-
-
34547630480
-
A simple statistical algorithm for biological sequence compression
-
Cao,M.D. et al. (2007) A simple statistical algorithm for biological sequence compression. In Proceedings of the IEEE Data Compression Conference (DCC). IEEE Computer Society, pp. 43-52.
-
(2007)
Proceedings of the IEEE Data Compression Conference (DCC)
, pp. 43-52
-
-
Cao, M.D.1
-
37
-
-
1942489258
-
Informational complexity and functional activity of RNA structures
-
Carothers,J. et al. (2004) Informational complexity and functional activity of RNA structures. J. Am. Chem. Soc., 126, 5130-5137.
-
(2004)
J. Am. Chem. Soc
, vol.126
, pp. 5130-5137
-
-
Carothers, J.1
-
39
-
-
0036947893
-
DNACompress: Fast and effective DNA sequence compression
-
Chen,X. et al. (2002) DNACompress: Fast and effective DNA sequence compression. Bioinformatics, 18, 1696-1698.
-
(2002)
Bioinformatics
, vol.18
, pp. 1696-1698
-
-
Chen, X.1
-
41
-
-
34547666722
-
Biological networks: Comparison, conservation, and evolution via relative description length
-
Chor,B. and Tuller,T. (2007) Biological networks: Comparison, conservation, and evolution via relative description length. J. Comput. Biol., 14, 817-834.
-
(2007)
J. Comput. Biol
, vol.14
, pp. 817-834
-
-
Chor, B.1
Tuller, T.2
-
44
-
-
0033563426
-
Zones of low entropy in genomic sequence
-
Crochemore,M. and Vérin,R. (1999) Zones of low entropy in genomic sequence. Comput. Chem., 23, 275-282.
-
(1999)
Comput. Chem
, vol.23
, pp. 275-282
-
-
Crochemore, M.1
Vérin, R.2
-
45
-
-
0942266549
-
A sub-quadratic sequence alignment algorithm for unrestricted cost matrices
-
Crochemore,M. et al. (2003) A sub-quadratic sequence alignment algorithm for unrestricted cost matrices. SIAM J. Comput., 32, 1654-1673.
-
(2003)
SIAM J. Comput
, vol.32
, pp. 1654-1673
-
-
Crochemore, M.1
-
46
-
-
0034791035
-
High-resolution haplotype structure in the human genome
-
Daly,M.J. et al. (2001) High-resolution haplotype structure in the human genome. Nat. Genet., 29, 229-232.
-
(2001)
Nat. Genet
, vol.29
, pp. 229-232
-
-
Daly, M.J.1
-
47
-
-
45149113022
-
Comparative analysis of long DNA sequences by per element information content using different contexts
-
Dix,T.I. et al. (2007) Comparative analysis of long DNA sequences by per element information content using different contexts. BMC Bioinformatics, 8(Suppl. 2), s10.
-
(2007)
BMC Bioinformatics
, vol.8
, Issue.SUPPL. 2
-
-
Dix, T.I.1
-
49
-
-
53649106963
-
MicroRNA target detection and analysis for genes related to breast cancer using MDLcompress
-
Evans,S.C. et al. (2007) MicroRNA target detection and analysis for genes related to breast cancer using MDLcompress. EURASIP J. Bioinform. Syst. Biol., 2007, 1-16.
-
(2007)
EURASIP J. Bioinform. Syst. Biol
, vol.2007
, pp. 1-16
-
-
Evans, S.C.1
-
50
-
-
84994364597
-
On the entropy of DNA: Algorithms and measurements based on memory and rapid convergence
-
Farach,M. et al. (1995) On the entropy of DNA: Algorithms and measurements based on memory and rapid convergence. In SODA 95: Proceedings of the Symposium on Discrete Algorithms. ACM-SIAM, pp. 48-57.
-
(1995)
SODA 95: Proceedings of the Symposium on Discrete Algorithms
, pp. 48-57
-
-
Farach, M.1
-
51
-
-
34547753523
-
Compression-based classification of biological sequences and structures via the Universal Similarity Metric: Experimental assessment
-
Ferragina,P. et al. (2007) Compression-based classification of biological sequences and structures via the Universal Similarity Metric: experimental assessment. BMC Bioinformatics, 8, 252.
-
(2007)
BMC Bioinformatics
, vol.8
, pp. 252
-
-
Ferragina, P.1
-
52
-
-
84979917497
-
Compressed text indexes: From theory to practice
-
Ferragina,P. et al. (2008) Compressed text indexes: From theory to practice. ACM J. Exp. Alg., 13.
-
(2008)
ACM J. Exp. Alg
, vol.13
-
-
Ferragina, P.1
-
54
-
-
18444369013
-
The structure of haplotype blocks in the human genome
-
Gabriel,S. et al. (2002) The structure of haplotype blocks in the human genome. Science, 296, 2225-2229.
-
(2002)
Science
, vol.296
, pp. 2225-2229
-
-
Gabriel, S.1
-
55
-
-
67649180696
-
-
Set-based complexity and biological information, abs/0801.4024
-
Galas,D.J. et al. (2008) Set-based complexity and biological information. CoRR, abs/0801.4024.
-
(2008)
CoRR
-
-
Galas, D.J.1
-
57
-
-
0242405250
-
Dynamic programming: Special cases
-
Giancarlo,R. (1997) Dynamic programming: Special cases. In Apostolico,A. and Galil,Z. (eds), Pattern Matching Algorithms. Oxford University Press, pp. 201-236.
-
(1997)
Pattern Matching Algorithms
, pp. 201-236
-
-
Giancarlo, R.1
-
58
-
-
67649186179
-
Alignment-free comparison of TOPS strings
-
Gilbert,D. et al. (2007) Alignment-free comparison of TOPS strings. In Proceedings of London Algorithmics and Stringology. College Publications, pp. 177-197.
-
(2007)
Proceedings of London Algorithmics and Stringology
, pp. 177-197
-
-
Gilbert, D.1
-
62
-
-
0000100455
-
A new challenge for compression algorithms: Genetic sequences
-
Grumbach,S. and Tahi,F. (1994) A new challenge for compression algorithms: Genetic sequences. Inform. Process. Manage., 30, 875-886.
-
(1994)
Inform. Process. Manage
, vol.30
, pp. 875-886
-
-
Grumbach, S.1
Tahi, F.2
-
65
-
-
0026466830
-
Identifying constraints on the higher-order structure of RNA: Continued development and application of comparative sequence analysis methods
-
Gutell,R.R. et al. (1992) Identifying constraints on the higher-order structure of RNA: Continued development and application of comparative sequence analysis methods. Nucleic Acids Res., 20, 5785-5795.
-
(1992)
Nucleic Acids Res
, vol.20
, pp. 5785-5795
-
-
Gutell, R.R.1
-
66
-
-
34347390455
-
Comparing segmentations by applying randomization techniques
-
Haiminen,N. et al. (2007) Comparing segmentations by applying randomization techniques. BMC Bioinformatics, 7, 171.
-
(2007)
BMC Bioinformatics
, vol.7
, pp. 171
-
-
Haiminen, N.1
-
67
-
-
25144456056
-
Computational cluster validation in post-genomic data analysis
-
Handl,J. et al. (2005) Computational cluster validation in post-genomic data analysis. Bioinformatics, 21, 3201-3212.
-
(2005)
Bioinformatics
, vol.21
, pp. 3201-3212
-
-
Handl, J.1
-
68
-
-
22844441552
-
Reverse engineering gene regulatory networks
-
Hartemink,A. (2005) Reverse engineering gene regulatory networks. Nat. Biotechnol., 23, 554-556.
-
(2005)
Nat. Biotechnol
, vol.23
, pp. 554-556
-
-
Hartemink, A.1
-
70
-
-
0142028977
-
Annotating large genomes with exact word matches
-
Healy,J. et al. (2003) Annotating large genomes with exact word matches. Genome Res., 13, 2306-2315.
-
(2003)
Genome Res
, vol.13
, pp. 2306-2315
-
-
Healy, J.1
-
71
-
-
38049051093
-
Recurrent predictive models for sequence segmentation
-
Hyvonen,S. et al. (2007) Recurrent predictive models for sequence segmentation. In Advances in Intelligent Data Analysis VII (IDA 2007), Vol. 4723 of LNCS. Springer, Berlin, pp. 195-206.
-
(2007)
LNCS
, vol.4723
, pp. 195-206
-
-
Hyvonen, S.1
-
72
-
-
0030670589
-
Efficient discovery of conserved patterns using a pattern graph
-
Jonassen,I. (1997) Efficient discovery of conserved patterns using a pattern graph. Comput. Appl. Biosci., 13, 509-522.
-
(1997)
Comput. Appl. Biosci
, vol.13
, pp. 509-522
-
-
Jonassen, I.1
-
74
-
-
32544454688
-
Application of compression-based distance measures to protein sequence classification: A methodological study
-
Kocsor,A. et al. (2005) Application of compression-based distance measures to protein sequence classification: A methodological study. Bioinformatics, 22, 407-412.
-
(2005)
Bioinformatics
, vol.22
, pp. 407-412
-
-
Kocsor, A.1
-
75
-
-
0041989761
-
An MDL method for finding haplotype blocks and for estimating the strength of Haplotype block boundaries
-
Koivisto,M. (2003) An MDL method for finding haplotype blocks and for estimating the strength of Haplotype block boundaries. In Proceedings of the Pacific Symposium on Biocomputing (PSB). World Scientific, pp. 502-513.
-
(2003)
Proceedings of the Pacific Symposium on Biocomputing (PSB)
, pp. 502-513
-
-
Koivisto, M.1
-
76
-
-
37549034468
-
Information theories in molecular biology and genomics
-
Konopka,A.K. (2005) Information theories in molecular biology and genomics. Nat. Encyclopedia Hum. Genome, 3, 464-469.
-
(2005)
Nat. Encyclopedia Hum. Genome
, vol.3
, pp. 464-469
-
-
Konopka, A.K.1
-
77
-
-
13844281512
-
An efficient normalized maximum likelihood algorithm for DNA sequence compression
-
Korodi,G. and Tabus,I. (2005) An efficient normalized maximum likelihood algorithm for DNA sequence compression. ACM Trans. Inform. Syst., 23, 3-34.
-
(2005)
ACM Trans. Inform. Syst
, vol.23
, pp. 3-34
-
-
Korodi, G.1
Tabus, I.2
-
79
-
-
2442662802
-
Measuring the similarity of protein structures by means of the Universal Similarity Metric
-
Krasnogor,N. and Pelta,D.A. (2004) Measuring the similarity of protein structures by means of the Universal Similarity Metric. Bioinformatics, 20, 1015-1021.
-
(2004)
Bioinformatics
, vol.20
, pp. 1015-1021
-
-
Krasnogor, N.1
Pelta, D.A.2
-
80
-
-
1542268964
-
Study of DNA binding sites using the Rényi parametric entropy measure
-
Krishnamachari,A. et al. (2004) Study of DNA binding sites using the Rényi parametric entropy measure. J. Theor. Biol., 227, 429-436.
-
(2004)
J. Theor. Biol
, vol.227
, pp. 429-436
-
-
Krishnamachari, A.1
-
81
-
-
0003725141
-
-
Kruskal,J.B. and Sankoff,D, eds, Addison-Wesley, Reading, MA
-
Kruskal,J.B. and Sankoff,D. (eds) (1983) Time Warps, String Edits, and Macromolecules: The Theory and Practice of Sequence Comparison. Addison-Wesley, Reading, MA.
-
(1983)
Time Warps, String Edits, and Macromolecules: The Theory and Practice of Sequence Comparison
-
-
-
83
-
-
0016880887
-
On the complexity of finite sequences
-
Lempel,A. and Ziv,J. (1976) On the complexity of finite sequences. IEEE Trans. Inform. Theory, 22, 75-81.
-
(1976)
IEEE Trans. Inform. Theory
, vol.22
, pp. 75-81
-
-
Lempel, A.1
Ziv, J.2
-
85
-
-
0035102453
-
An Information-based sequence distance and its application to whole mitochondrial genome phylogeny
-
Li,M. et al. (2001) An Information-based sequence distance and its application to whole mitochondrial genome phylogeny. Bioinformatics, 17, 149-154.
-
(2001)
Bioinformatics
, vol.17
, pp. 149-154
-
-
Li, M.1
-
86
-
-
10644294829
-
The similarity metric
-
Li,M. et al. (2003) The similarity metric. IEEE Trans. Inform. Theory, 50, 3250-3264.
-
(2003)
IEEE Trans. Inform. Theory
, vol.50
, pp. 3250-3264
-
-
Li, M.1
-
87
-
-
67349186481
-
-
Lifshits,Y. et al. (2008) Speeding up HMM decoding and training by exploiting sequence repetitions. Algorithmica [doi 10.1007/s00453-007-9128-0].
-
-
-
-
88
-
-
0030596404
-
High statistics block entropy measures of DNA sequences
-
Lió,P. et al. (1996) High statistics block entropy measures of DNA sequences. J. Theor. Biol., 180, 151-160.
-
(1996)
J. Theor. Biol
, vol.180
, pp. 151-160
-
-
Lió, P.1
-
89
-
-
18844405663
-
Space-efficient whole genome comparisons with Burrows-Wheeler Transforms
-
Lippert,R.A. (2005) Space-efficient whole genome comparisons with Burrows-Wheeler Transforms. J. Comput. Biol., 12, 407-415.
-
(2005)
J. Comput. Biol
, vol.12
, pp. 407-415
-
-
Lippert, R.A.1
-
90
-
-
25644453578
-
A space-efficient construction of the Burrows-Wheeler transform for genomic data
-
Lippert,R.A. et al. (2005) A space-efficient construction of the Burrows-Wheeler transform for genomic data. J. Comput. Biol., 12, 943-951.
-
(2005)
J. Comput. Biol
, vol.12
, pp. 943-951
-
-
Lippert, R.A.1
-
91
-
-
39149105621
-
Comparison of TOPS strings based on LZ complexity
-
Liu,L. and Wang,T. (2008) Comparison of TOPS strings based on LZ complexity. J. Theor. Biol., 251, 159-166.
-
(2008)
J. Theor. Biol
, vol.251
, pp. 159-166
-
-
Liu, L.1
Wang, T.2
-
92
-
-
42549148061
-
RNACompress: Grammar-based compression and informational complexity measurement of RNA secondary structure
-
Liu,Q. et al. (2008) RNACompress: Grammar-based compression and informational complexity measurement of RNA secondary structure. BMC Bioinformatics, 9, 176+.
-
(2008)
BMC Bioinformatics
, vol.9
-
-
Liu, Q.1
-
93
-
-
0032919622
-
Significantly lower entropy estimates for natural DNA sequences
-
Loewenstern,D. and Yianilos,P.N. (1999) Significantly lower entropy estimates for natural DNA sequences. J. Comput. Biol., 6, 125-142.
-
(1999)
J. Comput. Biol
, vol.6
, pp. 125-142
-
-
Loewenstern, D.1
Yianilos, P.N.2
-
94
-
-
0012490449
-
DNA sequence classification using compression-based induction
-
Technical report, DIMACS
-
Loewenstern,D. et al. (1995) DNA sequence classification using compression-based induction. Technical report, DIMACS.
-
(1995)
-
-
Loewenstern, D.1
-
95
-
-
24344458137
-
Feature selection based on mutual information: Criteria of max-dependency, max-relevance, and min-redundancy
-
Long,F. and Ding,C. (2005) Feature selection based on mutual information: Criteria of max-dependency, max-relevance, and min-redundancy. IEEE Trans. Pattern Anal. Mach. Intell., 27, 1226-1238.
-
(2005)
IEEE Trans. Pattern Anal. Mach. Intell
, vol.27
, pp. 1226-1238
-
-
Long, F.1
Ding, C.2
-
97
-
-
52449094233
-
Short tandem repeats in human exons: A target for disease mutations
-
Madsen,B.E. et al. (2008) Short tandem repeats in human exons: A target for disease mutations. BMC Genomics, 9, 410+.
-
(2008)
BMC Genomics
, vol.9
-
-
Madsen, B.E.1
-
98
-
-
8344261403
-
A simple and fast DNA compressor
-
Manzini,G. and Rastero,M. (2005) A simple and fast DNA compressor. Softw. Pract. Exper., 35, 1397-1411.
-
(2005)
Softw. Pract. Exper
, vol.35
, pp. 1397-1411
-
-
Manzini, G.1
Rastero, M.2
-
100
-
-
33947305781
-
ARACNE: An algorithm for the reconstruction of gene regulatory networks in a mammalian cellular context
-
Margolin,A.A. et al. (2006a) ARACNE: An algorithm for the reconstruction of gene regulatory networks in a mammalian cellular context. BMC Bioinformatics, 7(Suppl. 1), s7.
-
(2006)
BMC Bioinformatics
, vol.7
, Issue.SUPPL. 1
-
-
Margolin, A.A.1
-
101
-
-
33750353888
-
Reverse engineering cellular networks
-
Margolin,A.A. et al. (2006b) Reverse engineering cellular networks. Nat. Protocols, 1, 663-672.
-
(2006)
Nat. Protocols
, vol.1
, pp. 663-672
-
-
Margolin, A.A.1
-
102
-
-
0034578442
-
Biological sequence compression algorithms
-
Matsumoto,T. et al. (2000) Biological sequence compression algorithms. Genome Inform., 11, 43-52.
-
(2000)
Genome Inform
, vol.11
, pp. 43-52
-
-
Matsumoto, T.1
-
103
-
-
19044363196
-
Sublinear growth of information in DNA sequences
-
Menconi,G. (2004) Sublinear growth of information in DNA sequences. Bull. Math. Biol., 67, 737-759.
-
(2004)
Bull. Math. Biol
, vol.67
, pp. 737-759
-
-
Menconi, G.1
-
104
-
-
39049189269
-
A compression-based approach for coding sequences identifications in Prokaryotic Genomes
-
Menconi,G. and Marangoni,R. (2006) A compression-based approach for coding sequences identifications in Prokaryotic Genomes. J. Comput. Biol., 13, 1477-1488.
-
(2006)
J. Comput. Biol
, vol.13
, pp. 1477-1488
-
-
Menconi, G.1
Marangoni, R.2
-
105
-
-
36248999573
-
Information-Theoretic inference of large transcriptional regulatory networks
-
Meyer,P.E. et al. (2007) Information-Theoretic inference of large transcriptional regulatory networks. EURASIP J. Bioinform. Syst. Biol., 2007, 8.
-
(2007)
EURASIP J. Bioinform. Syst. Biol
, vol.2007
, pp. 8
-
-
Meyer, P.E.1
-
106
-
-
0011908356
-
Discovering dependencies via algorithmic mutual information: A case study in DNA sequence comparisons
-
Milosavljevic,A. (1995) Discovering dependencies via algorithmic mutual information: A case study in DNA sequence comparisons. Mach. Learn., 21, 35-50.
-
(1995)
Mach. Learn
, vol.21
, pp. 35-50
-
-
Milosavljevic, A.1
-
107
-
-
0027194328
-
Discovering simple DNA sequences by the algorithmic significance method
-
Milosavljevic,A. and Jurka,J. (1993) Discovering simple DNA sequences by the algorithmic significance method. Comput. Appl. Biosci., 9, 407-411.
-
(1993)
Comput. Appl. Biosci
, vol.9
, pp. 407-411
-
-
Milosavljevic, A.1
Jurka, J.2
-
109
-
-
67649138818
-
-
Nature-Review (2008) Nature Reviews collection on microRNAs. Nat. Rev. [Epub ahead of print, doi:10.1038/nrg2202].
-
-
-
-
111
-
-
0000523223
-
Compression and explanation using hierarchical grammars
-
Nevill-Manning,C.G. and Witten,I.H. (1997) Compression and explanation using hierarchical grammars. Comput. J., 40, 103-116.
-
(1997)
Comput. J
, vol.40
, pp. 103-116
-
-
Nevill-Manning, C.G.1
Witten, I.H.2
-
115
-
-
0037248694
-
A divide-and-conquer approach to fragment assembly
-
Otu,H.H. and Sayood,K. (2003a) A divide-and-conquer approach to fragment assembly. Bioinformatics, 19, 22-29.
-
(2003)
Bioinformatics
, vol.19
, pp. 22-29
-
-
Otu, H.H.1
Sayood, K.2
-
116
-
-
0242643741
-
A new sequence distance measure for phylogenetic tree construction
-
Otu,H.H. and Sayood,K. (2003b) A new sequence distance measure for phylogenetic tree construction. Bioinformatics, 19, 2122-2130.
-
(2003)
Bioinformatics
, vol.19
, pp. 2122-2130
-
-
Otu, H.H.1
Sayood, K.2
-
118
-
-
0035941029
-
Blocks of limited haplotype diversity revealed by high-resolution scanning of human chromosome 21
-
Patil,N. et al. (2001) Blocks of limited haplotype diversity revealed by high-resolution scanning of human chromosome 21. Science, 294, 1719-1723.
-
(2001)
Science
, vol.294
, pp. 1719-1723
-
-
Patil, N.1
-
119
-
-
34547811565
-
Protein structure comparison through fuzzy contact maps and the universal similarity metric
-
Pelta,D.A. et al. (2005) Protein structure comparison through fuzzy contact maps and the universal similarity metric. In Proceedings of the Joint 4th EUSFLAT 11th LFA Conference (EUSFLAT-LFA 05). Universitat Politécnica de Catalunya, pp. 1124-1129.
-
(2005)
Proceedings of the Joint 4th EUSFLAT 11th LFA Conference (EUSFLAT-LFA 05)
, pp. 1124-1129
-
-
Pelta, D.A.1
-
122
-
-
67649143444
-
-
Reinert,G. et al. (2005) Statistics on words with applications to biological sequences. In Lothaire,M. (ed), Applied Combinatorics on Words. Vol. 105 of Encyclopedia of Mathematics and its Applications, Cambridge University Press, pp. 252-323.
-
-
-
-
123
-
-
0002408684
-
On measures of entropy and information
-
Rényi,A. (1961) On measures of entropy and information. In Proceedings of the 4th Berkeley Symposium on Mathematics, Statistics and Probability, Vol. 1, University of California Press, Berkeley, CA, pp. 547-561.
-
(1961)
Proceedings of the 4th Berkeley Symposium on Mathematics, Statistics and Probability
, vol.1
, pp. 547-561
-
-
Rényi, A.1
-
124
-
-
2242469493
-
Coding and compression: A happy union of theory and practice
-
Rissanen,J. and Yu,B. (2000) Coding and compression: A happy union of theory and practice. Am. Stat. Assoc., 95, 986-989.
-
(2000)
Am. Stat. Assoc
, vol.95
, pp. 986-989
-
-
Rissanen, J.1
Yu, B.2
-
125
-
-
67649166049
-
Editorial: Information theoretic methods in bioinformatics
-
Rissanen,J. et al. (2007) Editorial: Information theoretic methods in bioinformatics. EURASIP J. Bioinform. Syst. Biol., 7, 1-4.
-
(2007)
EURASIP J. Bioinform. Syst. Biol
, vol.7
, pp. 1-4
-
-
Rissanen, J.1
-
126
-
-
0029852415
-
Compression and genetic sequences analysis
-
Rivals,É. et al. (1996a) Compression and genetic sequences analysis. Biochimie, 78, 315-322.
-
(1996)
Biochimie
, vol.78
, pp. 315-322
-
-
Rivals, E.1
-
127
-
-
26444510631
-
A guaranteed compression scheme for repetitive DNA sequences
-
Rivals,É. et al. (1996b) A guaranteed compression scheme for repetitive DNA sequences. In Proceedings of the IEEE Data Compression Conference (DCC). IEEE Computer Society, p. 453.
-
(1996)
Proceedings of the IEEE Data Compression Conference (DCC)
, pp. 453
-
-
Rivals, E.1
-
128
-
-
0030935318
-
Detection of significant patterns by compression algorithms: The case of approximate tandem repeats in DNA sequences
-
Rivals,É. et al. (1997a) Detection of significant patterns by compression algorithms: The case of approximate tandem repeats in DNA sequences. Comput. Appl. Biosci., 13, 131-136.
-
(1997)
Comput. Appl. Biosci
, vol.13
, pp. 131-136
-
-
Rivals, E.1
-
129
-
-
0007475289
-
Fast discerning repeats in DNA sequences with a compression algorithm
-
Rivals,É. et al. (1997b) Fast discerning repeats in DNA sequences with a compression algorithm. In Proceedings of Genome Informatics Workshop. Universal Academy Press, Tokyo, pp. 215-226.
-
(1997)
Proceedings of Genome Informatics Workshop
, pp. 215-226
-
-
Rivals, E.1
-
130
-
-
62449177287
-
Compression ratios based on the Universal Similarity Metric still yield protein distances far from CATH distances
-
abs/q-bio/0603007
-
Rocha,J. et al. (2006) Compression ratios based on the Universal Similarity Metric still yield protein distances far from CATH distances. CoRR, abs/q-bio/0603007.
-
(2006)
CoRR
-
-
Rocha, J.1
-
131
-
-
0030282113
-
The power of amnesia: Learning probabilistic automata with variable memory length
-
Ron,D. and Singer,Y. (1996) The power of amnesia: Learning probabilistic automata with variable memory length. In Machine Learning. Springer, Netherlands, pp. 117-149.
-
(1996)
Machine Learning
, pp. 117-149
-
-
Ron, D.1
Singer, Y.2
-
132
-
-
0035747893
-
Indexing huge genome sequences for solving various problems
-
Sadakane,K. and Shibuya,T. (2001) Indexing huge genome sequences for solving various problems. Genome Inform., 12, 175-183.
-
(2001)
Genome Inform
, vol.12
, pp. 175-183
-
-
Sadakane, K.1
Shibuya, T.2
-
133
-
-
0031558556
-
Estimating the entropy of DNA sequences
-
Schmidt,A.O. and Herzel,H. (1997) Estimating the entropy of DNA sequences. J. Theor. Biol., 188, 369-377.
-
(1997)
J. Theor. Biol
, vol.188
, pp. 369-377
-
-
Schmidt, A.O.1
Herzel, H.2
-
134
-
-
0023042012
-
Information content of binding sites on nucleotide sequences
-
Schneider,T.D. et al. (1986) Information content of binding sites on nucleotide sequences. J. Mol. Biol., 188, 415-431.
-
(1986)
J. Mol. Biol
, vol.188
, pp. 415-431
-
-
Schneider, T.D.1
-
136
-
-
65749098156
-
Compression and machine learning: A new perspective on feature space vectors
-
Sculley,D. and Brodley,C. (2006) Compression and machine learning: A new perspective on feature space vectors. In Proceedings of the IEEE Data Compression Conference (DCC). IEEE Computer Society, pp. 332-332.
-
(2006)
Proceedings of the IEEE Data Compression Conference (DCC)
, pp. 332-332
-
-
Sculley, D.1
Brodley, C.2
-
137
-
-
33645732240
-
Modeling cellular machinery through biological network comparison
-
Sharan,R. and Ideker,T. (2006) Modeling cellular machinery through biological network comparison. Nat. Biotechnol., 24, 427-433.
-
(2006)
Nat. Biotechnol
, vol.24
, pp. 427-433
-
-
Sharan, R.1
Ideker, T.2
-
139
-
-
0019887799
-
Identification of common molecular subsequences
-
Smith,T. and Waterman,M. (1981) Identification of common molecular subsequences. J. Mol. Biol., 147, 195-197.
-
(1981)
J. Mol. Biol
, vol.147
, pp. 195-197
-
-
Smith, T.1
Waterman, M.2
-
140
-
-
0035665965
-
Discovering patterns in Plasmodium falciparum genomic DNA
-
Stern,L. et al. (2001) Discovering patterns in Plasmodium falciparum genomic DNA. Mol. Biochem. Parasitol., 118, 175-186.
-
(2001)
Mol. Biochem. Parasitol
, vol.118
, pp. 175-186
-
-
Stern, L.1
-
141
-
-
0020190931
-
Data compression via textual substitution
-
Storer,J.A. and Szymanski,T.G. (1982) Data compression via textual substitution. J. ACM, 29, 928-951.
-
(1982)
J. ACM
, vol.29
, pp. 928-951
-
-
Storer, J.A.1
Szymanski, T.G.2
-
143
-
-
34547630306
-
DNA sequence compression using the normalized maximum likelihood model for discrete regression
-
Tabus,I. et al. (2003) DNA sequence compression using the normalized maximum likelihood model for discrete regression. In Proceedings of the IEEE Data Compression Conference (DCC). IEEE Computer Society, pp. 253-262.
-
(2003)
Proceedings of the IEEE Data Compression Conference (DCC)
, pp. 253-262
-
-
Tabus, I.1
-
144
-
-
33646005790
-
The average common substring approach to phylogenomic reconstruction
-
Ulitsky,I. et al. (2006) The average common substring approach to phylogenomic reconstruction. J. Comput. Biol., 13, 336-350.
-
(2006)
J. Comput. Biol
, vol.13
, pp. 336-350
-
-
Ulitsky, I.1
-
145
-
-
34047188666
-
Compressed suffix tree - a basis for genome-scale sequence analysis
-
Välimäki,N. et al. (2007) Compressed suffix tree - a basis for genome-scale sequence analysis. Bioinformatics, 23, 629-630.
-
(2007)
Bioinformatics
, vol.23
, pp. 629-630
-
-
Välimäki, N.1
-
146
-
-
0032891717
-
Transformation distances: A family of dissimilarity measures based on movements of segments
-
Varré,J.-S. et al. (1999) Transformation distances: A family of dissimilarity measures based on movements of segments. Bioinformatics, 15, 194-202.
-
(1999)
Bioinformatics
, vol.15
, pp. 194-202
-
-
Varré, J.-S.1
-
147
-
-
0037342499
-
Alignment-free sequence comparison: A review
-
Vinga,S. and Almeida,J.S. (2003) Alignment-free sequence comparison: A review. Bioinformatics, 19, 513-523.
-
(2003)
Bioinformatics
, vol.19
, pp. 513-523
-
-
Vinga, S.1
Almeida, J.S.2
-
148
-
-
6344221592
-
Rényi continuous entropy of DNA sequences
-
Vinga,S. and Almeida,J.S. (2004) Rényi continuous entropy of DNA sequences. J. Theor. Biol., 231, 377-388.
-
(2004)
J. Theor. Biol
, vol.231
, pp. 377-388
-
-
Vinga, S.1
Almeida, J.S.2
-
149
-
-
38949102609
-
Local Rényi entropic profiles of DNA sequences
-
Vinga,S. and Almeida,J.S. (2007) Local Rényi entropic profiles of DNA sequences. BMC Bioinform., 8, 393.
-
(2007)
BMC Bioinform
, vol.8
, pp. 393
-
-
Vinga, S.1
Almeida, J.S.2
-
150
-
-
84935113569
-
-
Viterbi,A.J. (1967) Error bounds for convolutional codes and an asymptotically optimum decoding algorithm. IEEE Trans. Inform. Theory, 13, 260-269.
-
-
-
-
152
-
-
35448929046
-
Compressing table data with column dependency
-
Vo,B.D. and Vo,K.-P. (2007) Compressing table data with column dependency. Theor. Comput. Sci., 387, 273-283.
-
(2007)
Theor. Comput. Sci
, vol.387
, pp. 273-283
-
-
Vo, B.D.1
Vo, K.-P.2
-
154
-
-
0027941109
-
Discovering active motifs in sets of related proteins and using them for classification
-
Wang,J.T.L. et al. (1994) Discovering active motifs in sets of related proteins and using them for classification. Nucl. Acids Res., 22, 2769-2775.
-
(1994)
Nucl. Acids Res
, vol.22
, pp. 2769-2775
-
-
Wang, J.T.L.1
-
155
-
-
67649186176
-
Distribution of recombination crossovers and the origin of haplotype blocks: The interplay of population history, recombination, and mutation
-
Wang,N. et al. (2002) Distribution of recombination crossovers and the origin of haplotype blocks: The interplay of population history, recombination, and mutation. Am. J. Hum. Genet., 29, 229-232.
-
(2002)
Am. J. Hum. Genet
, vol.29
, pp. 229-232
-
-
Wang, N.1
-
157
-
-
0032554318
-
Correlations in protein sequences and property codes
-
Weiss,O. and Herzel,H. (1998) Correlations in protein sequences and property codes. J. Theor. Biol., 190, 341-353.
-
(1998)
J. Theor. Biol
, vol.190
, pp. 341-353
-
-
Weiss, O.1
Herzel, H.2
-
158
-
-
0034619248
-
Information content of protein sequences
-
Weiss,O. et al. (2000) Information content of protein sequences. J. Theor. Biol., 206, 379-386.
-
(2000)
J. Theor. Biol
, vol.206
, pp. 379-386
-
-
Weiss, O.1
-
160
-
-
0037188541
-
A dynamic programming algorithm for haplotype block partitioning
-
Zhang,K. et al. (2002) A dynamic programming algorithm for haplotype block partitioning. In Proc. Natl Acad. Sci. USA, 7335-7339.
-
(2002)
Proc. Natl Acad. Sci. USA
, pp. 7335-7339
-
-
Zhang, K.1
-
161
-
-
39549096223
-
-
Zhang,S. et al. (2008) Biomolecular network querying: A promising approach in systems biology. BMC Syst. Biol., 2, 5.
-
-
-
-
162
-
-
33749131719
-
Feature selection for microarray data analysis using mutual information and rough set theory
-
Zhou,W. et al. (2007) Feature selection for microarray data analysis using mutual information and rough set theory. In IFIP International Federation for Information Processing, Vol. 204, Springer, Boston, pp. 916-927.
-
(2007)
IFIP International Federation for Information Processing
, vol.204
, pp. 916-927
-
-
Zhou, W.1
-
163
-
-
1842450622
-
Gene clustering based on clusterwide mutual information
-
Zhou,X. et al. (2004) Gene clustering based on clusterwide mutual information. J. Comput. Biol., 11, 147-161.
-
(2004)
J. Comput. Biol
, vol.11
, pp. 147-161
-
-
Zhou, X.1
-
164
-
-
0023979656
-
On classification with empirically observed statistics and universal data compression
-
Ziv,J. (1988) On classification with empirically observed statistics and universal data compression. IEEE Trans. Inform. Theory, 34, 278-286.
-
(1988)
IEEE Trans. Inform. Theory
, vol.34
, pp. 278-286
-
-
Ziv, J.1
-
165
-
-
41949122106
-
On finite memory universal data compression and classification of individual sequences
-
Ziv,J. (2008) On finite memory universal data compression and classification of individual sequences. IEEE Trans. Inform. Theory, 54, 1626-1636.
-
(2008)
IEEE Trans. Inform. Theory
, vol.54
, pp. 1626-1636
-
-
Ziv, J.1
-
166
-
-
0017493286
-
A universal algorithm for sequential data compression
-
Ziv,J. and Lempel,A. (1977) A universal algorithm for sequential data compression. IEEE Trans. Inform. Theory, 23, 337-343.
-
(1977)
IEEE Trans. Inform. Theory
, vol.23
, pp. 337-343
-
-
Ziv, J.1
Lempel, A.2
-
167
-
-
0018019231
-
Compression of individual sequences via variable-rate coding
-
Ziv,J. and Lempel,A. (1978) Compression of individual sequences via variable-rate coding. IEEE Trans. Inform. Theory, 24, 530-536.
-
(1978)
IEEE Trans. Inform. Theory
, vol.24
, pp. 530-536
-
-
Ziv, J.1
Lempel, A.2