2013, Pages 1-31

What are our models really telling us? A practical tutorial on avoiding common mistakes when building predictive models

Author keywords

Applicability domain; Experimental error; Kendall tau; Model comparison; Pearson r; Predictive models

Indexed keywords

APPLICABILITY DOMAIN; EXPERIMENTAL ERRORS; KENDALL TAUS; MODEL COMPARISON; PEARSON R; PREDICTIVE MODELS;

EID: 85016064263     PISSN: None     EISSN: None     Source Type: Book    
DOI: 10.1002/9781118742785.ch1     Document Type: Chapter
Times cited: 8

References (52)
  • 1
    • 79952171625
    • Probing the links between in vitro potency, ADMET and physicochemical parameters
    • Gleeson MP, Hersey A, Montanari D, et al. Probing the links between in vitro potency, ADMET and physicochemical parameters. Nat Rev Drug Discov 2011;10:197-208.
    • (2011) Nat Rev Drug Discov , vol.10 , pp. 197-208
    • Gleeson, M.P.1    Hersey, A.2    Montanari, D.3
  • 2
    • 13544270908
    • Predicting aqueous solubility from structure
    • Delaney JS. Predicting aqueous solubility from structure. Drug Discov Today 2005;10:289-295.
    • (2005) Drug Discov Today , vol.10 , pp. 289-295
    • Delaney, J.S.1
  • 3
    • 79955741760
    • Recent advances on aqueous solubility prediction
    • Wang J, Hou T. Recent advances on aqueous solubility prediction. Comb Chem High Throughput Screen 2011;14:328-338.
    • (2011) Comb Chem High Throughput Screen , vol.14 , pp. 328-338
    • Wang, J.1    Hou, T.2
  • 4
    • 39449138204
    • Why are some properties more difficult to predict than others? A study of QSPR models of solubility, melting point, and log P
    • Hughes LD, Palmer DS, Nigsch F, et al. Why are some properties more difficult to predict than others? A study of QSPR models of solubility, melting point, and log P. J Chem Info Model 2008;48:220-232.
    • (2008) J Chem Info Model , vol.48 , pp. 220-232
    • Hughes, L.D.1    Palmer, D.S.2    Nigsch, F.3
  • 5
    • 39149140400
    • High confidence predictions of drug-drug interactions: Predicting affinities for cytochrome P450 2C9 with multiple computational methods
    • Hudelson MG, Ketkar NS, Holder LB, et al. High confidence predictions of drug-drug interactions: Predicting affinities for cytochrome P450 2C9 with multiple computational methods. J Med Chem 2008;51:648-654.
    • (2008) J Med Chem , vol.51 , pp. 648-654
    • Hudelson, M.G.1    Ketkar, N.S.2    Holder, L.B.3
  • 6
    • 13844254976
    • Predictive in silico modeling for hERG channel blockers
    • Aronov AM. Predictive in silico modeling for hERG channel blockers. Drug Discov Today 2005;10:149-155.
    • (2005) Drug Discov Today , vol.10 , pp. 149-155
    • Aronov, A.M.1
  • 7
    • 33244474244
    • Development and evaluation of an in silico model for hERG binding
    • Song M, Clark M. Development and evaluation of an in silico model for hERG binding. J Chem Info Model 2006;46:392-400.
    • (2006) J Chem Info Model , vol.46 , pp. 392-400
    • Song, M.1    Clark, M.2
  • 8
    • 80053330055
    • CSAR benchmark exercise of 2010: Combined evaluation across all submitted scoring functions
    • Smith RD, Dunbar JB, Jr, Ung PM, et al. CSAR benchmark exercise of 2010: Combined evaluation across all submitted scoring functions. J Chem Info Model 2011;51:2115-2131.
    • (2011) J Chem Info Model , vol.51 , pp. 2115-2131
    • Smith, R.D.1    Dunbar, J.B.2    Ung, P.M.3
  • 9
    • 80053386667
    • 2nd ed. Greenwich: Manning Publications
    • Ceder V. The Quick Python Book. 2nd ed. Greenwich: Manning Publications; 2010. p 400.
    • (2010) The Quick Python Book , pp. 400
    • Ceder, V.1
  • 13
    • 0034461768
    • Drug-like properties and the causes of poor solubility and poor permeability
    • Lipinski C. Drug-like properties and the causes of poor solubility and poor permeability. J Pharmacol Toxicol Methods 2000;44:235-249.
    • (2000) J Pharmacol Toxicol Methods , vol.44 , pp. 235-249
    • Lipinski, C.1
  • 14
    • 1542741028
    • ADME evaluation in drug discovery. 4. Prediction of aqueous solubility based on atom contribution approach
    • Hou TJ, Xia K, Zhang W, et al. ADME evaluation in drug discovery. 4. Prediction of aqueous solubility based on atom contribution approach. J Chem Info Model 2004;44:266-275.
    • (2004) J Chem Info Model , vol.44 , pp. 266-275
    • Hou, T.J.1    Xia, K.2    Zhang, W.3
  • 15
    • 0001645890
    • Estimation of aqueous solubility for a diverse set of organic compounds based on molecular topology
    • Huuskonen J. Estimation of aqueous solubility for a diverse set of organic compounds based on molecular topology. J Chem Info Model 2000;40:773-777.
    • (2000) J Chem Info Model , vol.40 , pp. 773-777
    • Huuskonen, J.1
  • 16
    • 0032061266
    • Aqueous solubility prediction of drugs based on molecular topology and neural network modeling
    • Huuskonen J, Salo M, Taskinen J. Aqueous solubility prediction of drugs based on molecular topology and neural network modeling. J Chem Info Model 1998;38:450-456.
    • (1998) J Chem Info Model , vol.38 , pp. 450-456
    • Huuskonen, J.1    Salo, M.2    Taskinen, J.3
  • 17
    • 0035526162
    • Estimation of aqueous solubility of chemical compounds using E-state indices
    • Tetko IV, Tanchuk VY, Kasheva TN, et al. Estimation of aqueous solubility of chemical compounds using E-state indices. J Chem Info Model 2001;41:1488-1493.
    • (2001) J Chem Info Model , vol.41 , pp. 1488-1493
    • Tetko, I.V.1    Tanchuk, V.Y.2    Kasheva, T.N.3
  • 18
    • 0026566715
    • AQUAFAC 1: Aqueous functional group activity coefficients; application to hydrocarbons
    • Myrdal P, Ward GH, Dannenfelser RM, et al. AQUAFAC 1: Aqueous functional group activity coefficients; application to hydrocarbons. Chemosphere 1992;24:1047-1061.
    • (1992) Chemosphere , vol.24 , pp. 1047-1061
    • Myrdal, P.1    Ward, G.H.2    Dannenfelser, R.M.3
  • 19
    • 85016002253
    • May 14
    • http://www.pharmacy.arizona.edu/outreach/aquasol/index.html. Accessed 2013 May 14.
    • (2013)
  • 20
    • 85016008542
    • May 14
    • http://www.srcinc.com/what-we-do/product.aspx?id=133. Accessed 2013 May 14.
    • (2013)
  • 21
    • 49449113247
    • Solubility challenge: Can you predict solubilities of 32 molecules using a database of 100 reliable measurements?
    • Llinàs A, Glen RC, Goodman JM. Solubility challenge: Can you predict solubilities of 32 molecules using a database of 100 reliable measurements? J Chem Info Model 2008;48:1289-1303.
    • (2008) J Chem Info Model , vol.48 , pp. 1289-1303
    • Llinàs, A.1    Glen, R.C.2    Goodman, J.M.3
  • 22
  • 23
    • 79959501616
    • Exploratory analysis of kinetic solubility measurements of a small molecule library
    • Guha R, Dexheimer TS, Kestranek AN, et al. Exploratory analysis of kinetic solubility measurements of a small molecule library. Bioorg Med Chem 2011;19:4127-4134.
    • (2011) Bioorg Med Chem , vol.19 , pp. 4127-4134
    • Guha, R.1    Dexheimer, T.S.2    Kestranek, A.N.3
  • 24
    • 85016031114
    • May 14
    • http://flowingdata.com/2008/02/15/how-to-read-and-use-a-box-and-whisker-plot/. Accessed 2013 May 14.
    • (2013)
  • 25
    • 85016064405
    • May 14
    • http://flowingdata.com/2012/05/15/how-to-visualize-and-compare-distributions/. Accessed 2013 May 14.
    • (2013)
  • 26
    • 85016011612
    • May 14
    • http://en.wikipedia.org/wiki/Box_plot. Accessed 2013 May 14.
    • (2013)
  • 29
    • 77956734967
    • Machine learning in computational chemistry
    • Goldman BB, Walters WP. Machine learning in computational chemistry. Annu Rep Comput Chem 2006;2:127-140.
    • (2006) Annu Rep Comput Chem , vol.2 , pp. 127-140
    • Goldman, B.B.1    Walters, W.P.2
  • 30
    • 85016017748
    • May 14
    • http://en.wikipedia.org/wiki/Pearson_product-moment_correlation_coefficient. Accessed 2013 May 14.
    • (2013)
  • 31
    • 0004285343
    • New York: McGraw-Hill Medical
    • Glantz S. Primer of Biostatistics. New York: McGraw-Hill Medical; 2011. p 320.
    • (2011) Primer of Biostatistics , pp. 320
    • Glantz, S.1
  • 32
    • 85016023906
    • May 14
    • http://en.wikipedia.org/wiki/Kendall_tau_rank_correlation_coefficient. Accessed 2013 May 14.
    • (2013)
  • 33
    • 85016018928
    • May 14
    • http://en.wikipedia.org/wiki/Root-mean-square_deviation. Accessed 2013 May 14.
    • (2013)
  • 34
    • 79955016724
    • Making sure there's a "give" associated with the "take": Producing and using open-source software in big pharma
    • Landrum G, Lewis R, Palmer A, et al. Making sure there's a "give" associated with the "take": Producing and using open-source software in big pharma. J Cheminform 2011;3:1-1.
    • (2011) J Cheminform , vol.3 , pp. 1-1
    • Landrum, G.1    Lewis, R.2    Palmer, A.3
  • 35
    • 85016051706
    • May 14
    • http://www.rdkit.org/. Accessed 2013 May 14.
    • (2013)
  • 36
    • 0035478854
    • Random forests
    • Breiman L. Random forests. Mach Learn 2001;45:5-32.
    • (2001) Mach Learn , vol.45 , pp. 5-32
    • Breiman, L.1
  • 37
    • 0345040873
    • Classification and regression by random forest
    • Liaw A, Wiener M. Classification and regression by random forest. R News 2002;2:18-22.
    • (2002) R News , vol.2 , pp. 18-22
    • Liaw, A.1    Wiener, M.2
  • 38
    • 0345548657
    • Random forest: A classification and regression tool for compound classification and QSAR modeling
    • Svetnik V, Liaw A, Tong C, et al. Random forest: A classification and regression tool for compound classification and QSAR modeling. J Chem Info Comput Sci 2003;43:1947-1958.
    • (2003) J Chem Info Comput Sci , vol.43 , pp. 1947-1958
    • Svetnik, V.1    Liaw, A.2    Tong, C.3
  • 39
    • 54349105915
    • Prediction of human intestinal absorption by GA feature selection and support vector machine regression
    • Yan A, Wang Z, Cai Z. Prediction of human intestinal absorption by GA feature selection and support vector machine regression. Int J Mol Sci 2008;9:1961-1976.
    • (2008) Int J Mol Sci , vol.9 , pp. 1961-1976
    • Yan, A.1    Wang, Z.2    Cai, Z.3
  • 40
    • 0037361981
    • Prediction of aqueous solubility of organic compounds based on a 3D structure representation
    • Yan A, Gasteiger J. Prediction of aqueous solubility of organic compounds based on a 3D structure representation. J Chem Info Model 2003;43:429-434.
    • (2003) J Chem Info Model , vol.43 , pp. 429-434
    • Yan, A.1    Gasteiger, J.2
  • 41
    • 62849109096
    • Healthy skepticism: Assessing realistic model performance
    • Brown SP, Muchmore SW, Hajduk PJ. Healthy skepticism: Assessing realistic model performance. Drug Discov Today 2009;14:420-427.
    • (2009) Drug Discov Today , vol.14 , pp. 420-427
    • Brown, S.P.1    Muchmore, S.W.2    Hajduk, P.J.3
  • 42
    • 10044263240
    • Similarity to molecules in the training set is a good discriminator for prediction accuracy in QSAR
    • Sheridan RP, Feuston BP, Maiorov VN, et al. Similarity to molecules in the training set is a good discriminator for prediction accuracy in QSAR. J Chem Info Comput Sci 2004;44:1912-1928.
    • (2004) J Chem Info Comput Sci , vol.44 , pp. 1912-1928
    • Sheridan, R.P.1    Feuston, B.P.2    Maiorov, V.N.3
  • 43
    • 33750336399
    • Lingos, finite state machines, and fast similarity searching
    • Grant J, Haigh J, Pickup B, et al. Lingos, finite state machines, and fast similarity searching. J Chem Inf Model 2006;46:1912-1918.
    • (2006) J Chem Inf Model , vol.46 , pp. 1912-1918
    • Grant, J.1    Haigh, J.2    Pickup, B.3
  • 44
    • 33646249968
    • New methods for ligand-based virtual screening: Use of data fusion and machine learning to enhance the effectiveness of similarity searching
    • Hert J, Willett P, Wilton DJ, et al. New methods for ligand-based virtual screening: Use of data fusion and machine learning to enhance the effectiveness of similarity searching. J Chem Info Model 2006;46:462-470.
    • (2006) J Chem Info Model , vol.46 , pp. 462-470
    • Hert, J.1    Willett, P.2    Wilton, D.J.3
  • 45
    • 44449162145
    • A comparison of field-based similarity searching methods: CatShape, FBSS, and ROCS
    • Moffat K, Gillet VJ, Whittle M, et al. A comparison of field-based similarity searching methods: CatShape, FBSS, and ROCS. J Chem Info Model 2008;48:719-729.
    • (2008) J Chem Info Model , vol.48 , pp. 719-729
    • Moffat, K.1    Gillet, V.J.2    Whittle, M.3
  • 46
    • 45749116266
    • Application of belief theory to similarity data fusion for use in analog searching and lead hopping
    • Muchmore SW, Debe DA, Metz JT, et al. Application of belief theory to similarity data fusion for use in analog searching and lead hopping. J Chem Info Model 2008;48:941-948.
    • (2008) J Chem Info Model , vol.48 , pp. 941-948
    • Muchmore, S.W.1    Debe, D.A.2    Metz, J.T.3
  • 47
    • 61949166066
    • How similar are similarity searching methods? A principal component analysis of molecular descriptor space
    • Bender A, Jenkins JL, Scheiber J, et al. How similar are similarity searching methods? A principal component analysis of molecular descriptor space. J Chem Info Model 2009;49:108-119.
    • (2009) J Chem Info Model , vol.49 , pp. 108-119
    • Bender, A.1    Jenkins, J.L.2    Scheiber, J.3
  • 48
    • 77952716960
    • Molecular shape and medicinal chemistry: A perspective
    • Nicholls A, McGaughey GB, Sheridan RP, et al. Molecular shape and medicinal chemistry: A perspective. J Med Chem 2010;53:3862.
    • (2010) J Med Chem , vol.53 , pp. 3862
    • Nicholls, A.1    McGaughey, G.B.2    Sheridan, R.P.3
  • 50
    • 41349093326
    • What do we know and when do we know it?
    • Nicholls A. What do we know and when do we know it? J Comput Aided Mol Des 2008;22:239-255.
    • (2008) J Comput Aided Mol Des , vol.22 , pp. 239-255
    • Nicholls, A.1
  • 51
    • 41349106542
    • Recommendations for evaluation of computational methods
    • Jain AN, Nicholls A. Recommendations for evaluation of computational methods. J Comput Aided Mol Des 2008;22:133-139.
    • (2008) J Comput Aided Mol Des , vol.22 , pp. 133-139
    • Jain, A.N.1    Nicholls, A.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.