Volume 47, Issue 6, 2007, Pages 2098-2109

Lossless compression of chemical fingerprints using integer entropy codes improves storage and retrieval

Author keywords

[No Author keywords available]

Indexed keywords

BIOMETRICS; DATA STORAGE EQUIPMENT; ENTROPY; INFORMATION RETRIEVAL; MOLECULAR DYNAMICS; STATISTICAL METHODS;

EID: 37249011239     PISSN: 15499596     EISSN: 1549960X     Source Type: Journal    
DOI: 10.1021/ci700200n     Document Type: Article
Times cited: 42

References (28)
  • 1
    • Fligner, M. A.; Verducci, J. S.; Blower, P. E. A Modification of the Jaccard/Tanimoto Similarity Index for Diverse Selection of Chemical Compounds Using Binary Strings. Technometrics 2002, 44, 110-119.
  • 2
    • Flower, D. R. On the properties of bit string-based measures of chemical similarity. J. Chem. Inf. Comput. Sci. 1998, 38, 378-386.
  • 3
    • James, C. A.; Weininger, D.; Delany, J. Daylight Theory Manual, 2004. Current 2007 version available at http://www.daylight.com/dayhtml/doc/theory/index.html (accessed Oct 2007).
  • 4
    • Xue, L.; Godden, J. F.; Stahura, F. L.; Bajorath, J. Profile scaling increases the similarity search performance of molecular fingerprints containing numerical descriptors and structural keys. J. Chem. Inf. Comput. Sci. 2003, 43, 1218-1225.
  • 5
    • Xue, L.; Stahura, F. L.; Bajorath, J. Similarity search profiling reveals effects of fingerprint scaling in virtual screening. J. Chem. Inf. Comput. Sci. 2004, 44, 2032-2039.
  • 7
    • Tversky, A. Features of similarity. Psychol. Rev. 1977, 84, 327-352.
  • 8
    • Rouvray, D. Definition and role of similarity concepts in the chemical and physical sciences. J. Chem. Inf. Comput. Sci. 1992, 32, 580-586.
  • 9
    • Holliday, J. D.; Hu, C. Y.; Willett, P. Grouping of coefficients for the calculation of inter-molecular similarity and dissimilarity using 2D fragment bit-strings. Comb. Chem. High Throughput Screening 2002, 5, 155-166.
  • 10
    • Swamidass, S. J.; Chen, J.; Bruand, J.; Phung, P.; Ralaivola, L.; Baldi, P. Kernels for small molecules and the prediction of mutagenicity, toxicity, and anti-cancer activity. Bioinformatics 2005, 21, i359-i368; Proceedings of the 2005 ISMB Conference.
  • 11
    • Irwin, J. J.; Shoichet, B. K. ZINC-A free database of commercially available compounds for virtual screening. J. Chem. Inf. Comput. Sci. 2005, 45, 177-182.
  • 12
    • Chen, J.; Swamidass, S. J.; Dou, Y.; Bruand, J.; Baldi, P. ChemDB: a public database of small molecules and related chemoinformatics resources. Bioinformatics 2005, 21, 4133-4139.
  • 13
    • Azencott, C.; Ksikes, A.; Swamidass, S. J.; Chen, J.; Ralaivola, L.; Baldi, P. One- to Four-Dimensional Kernels for Small Molecules and Predictive Regression of Physical, Chemical, and Biological Properties. J. Chem. Inf. Model. 2007, 47, 965-974.
  • 14
    • Bohacek, R. S.; McMartin, C.; Guida, W. C. The art and practice of structure-based drug design: a molecular modelling perspective. Med. Res. Rev. 1996, 16, 3-50.
  • 15
    • Swamidass, S.; Baldi, P. A mathematical correction for fingerprint similarity measures to improve chemical retrieval. J. Chem. Inf. Model. 2007, 47, 952-964.
  • 16
    • Stahl, M.; Rarey, M. Detailed analysis of scoring functions for virtual screening. J. Med. Chem. 2001, 44, 1035-1042.
  • 17
    • Hert, J.; Willett, P.; Wilton, D. J.; Acklin, P.; Azzaoui, K.; Jacoby, E.; Schuffenhauer, A. Comparison of topological descriptors for similarity-based virtual screening using multiple bioactive reference structures. Org. Biomol. Chem. 2004, 2, 3256-3266.
  • 18
    • Bender, A.; Mussa, H.; Glen, R.; Reiling, S. Similarity Searching of Chemical Databases Using Atom Environment Descriptors (MOL-PRINT 2D): Evaluation of Performance. J. Chem. Inf. Model. 2004, 44, 1708-1718.
  • 19
    • Hassan, M.; Brown, R. D.; Varma-O'Brien, S.; Rogers, D. Chemoinformatics analysis and learning in a data pipelining environment. Mol. Diversity 2006, 10, 283-299.
  • 20
    • Karthikeyan, M.; Bender, A. Encoding and Decoding Graphical Chemical Structures as Two-Dimensional (PDF417) Barcodes. J. Chem. Inf. Model. 2005, 45, 572-580.
  • 22
    • Elias, P. Universal codeword sets and representations of the integers. IEEE Trans. Inf. Theory 1975, 21, 194-203.
  • 26
    • Swamidass, S.; Baldi, P. Bounds and algorithms for exact searches of chemical fingerprints in linear and sub-linear time. J. Chem. Inf. Model. 2007, 47, 302-317.
  • 27
    • Li, W. T. Random texts exhibit Zipf's-law-like word frequency distribution. IEEE Trans. Inf. Theory 1992, 38, 1842-1845.
  • 28
    • Newman, M. E. J. Power laws, Pareto distributions and Zipf's law. Contemp. Phys. 2005, 46, 323-351.


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.