



Volume 17, Issue 1, 2016, Pages

Finding an appropriate equation to measure similarity between binary vectors: Case studies on Indonesian and Japanese herbal medicines

Author keywords

Binary data; Distance metric; Hierarchical clustering; Jamu; Kampo; ROC curve; Similarity measures

Indexed keywords

DATA HANDLING; MEDICINE; MULTIVARIANT ANALYSIS

EID: 85002554660     PISSN: None     EISSN: 14712105     Source Type: Journal    
DOI: 10.1186/s12859-016-1392-z     Document Type: Article
Times cited: (26)

References (73)
  • 3
    • 21844514245
    • Comparing resemblance measures
    • Batagelj V, Bren M. Comparing resemblance measures. J Classif. 1995;12:73-90.
    • (1995) J Classif , vol.12 , pp. 73-90
    • Batagelj, V.1    Bren, M.2
  • 4
    • 79951759653
    • System biology approach for elucidating the relationship between Indonesian herbal plants and the efficacy of Jamu
    • In Proceedings - IEEE International Conference on Data Mining, ICDM
    • Afendi FM, Darusman LK, Hirai A, Altaf-Ul-Amin M, Takahashi H, Nakamura K, Kanaya S: System biology approach for elucidating the relationship between Indonesian herbal plants and the efficacy of Jamu. In Proceedings - IEEE International Conference on Data Mining, ICDM. IEEE; 2010:661-668.
    • (2010) IEEE , pp. 661-668
    • Afendi, F.M.1    Darusman, L.K.2    Hirai, A.3    Altaf-Ul-Amin, M.4    Takahashi, H.5    Nakamura, K.6    Kanaya, S.7
  • 6
    • 84934439821
    • Molecular similarity concepts and search calculations
    • In: Keith JM, editor. Bioinformatics volume II: Structure, function and applications (Methods in molecular biology), vol. 453
    • Auer J, Bajorath J. Molecular similarity concepts and search calculations. In: Keith JM, editor. Bioinformatics volume II: Structure, function and applications (Methods in molecular biology), vol. 453. Totowa: Humana Press; 2008. p. 327-47.
    • (2008) Totowa: Humana Press , vol.453 , pp. 327-347
    • Auer, J.1    Bajorath, J.2
  • 10
    • 84964654330
    • Computational algorithms to predict Gene Ontology annotations
    • Pinoli P, Chicco D, Masseroli M. Computational algorithms to predict Gene Ontology annotations. BMC Bioinformatics. 2015;16 Suppl 6:1-15.
    • (2015) BMC Bioinformatics , vol.16 , pp. 1-15
    • Pinoli, P.1    Chicco, D.2    Masseroli, M.3
  • 11
    • 84901302727
    • Efficient discovery of responses of proteins to compounds using active learning
    • Kangas JD, Naik AW, Murphy RF. Efficient discovery of responses of proteins to compounds using active learning. BMC Bioinformatics. 2014;15:1-11.
    • (2014) BMC Bioinformatics , vol.15 , pp. 1-11
    • Kangas, J.D.1    Naik, A.W.2    Murphy, R.F.3
  • 15
    • 0001232509
    • On the properties of bit string-based measures of chemical similarity
    • Flower DR. On the properties of bit string-based measures of chemical similarity. J Chem Inf Model. 1998;38:379-86.
    • (1998) J Chem Inf Model , vol.38 , pp. 379-386
    • Flower, D.R.1
  • 16
    • 0001535816
    • Combinatorial preferences affect molecular similarity/diversity calculations using binary fingerprints and Tanimoto coefficients
    • Godden JW, Xue L, Bajorath J. Combinatorial preferences affect molecular similarity/diversity calculations using binary fingerprints and Tanimoto coefficients. J Chem Inf Model. 2000;40:163-6.
    • (2000) J Chem Inf Model , vol.40 , pp. 163-166
    • Godden, J.W.1    Xue, L.2    Bajorath, J.3
  • 17
    • 0035871891
    • Multidimensional scaling and visualization of large molecular similarity tables
    • Agrafiotis DK, Rassokhin DN, Lobanov VS. Multidimensional scaling and visualization of large molecular similarity tables. J Comput Chem. 2001;22:488-500.
    • (2001) J Comput Chem , vol.22 , pp. 488-500
    • Agrafiotis, D.K.1    Rassokhin, D.N.2    Lobanov, V.S.3
  • 19
    • 0036567220
    • A modification of the Jaccard-Tanimoto similarity index for diverse selection of chemical compounds using binary strings
    • Fligner MA, Verducci JS, Blower PE. A modification of the Jaccard-Tanimoto similarity index for diverse selection of chemical compounds using binary strings. Technometrics. 2002;44:110-9.
    • (2002) Technometrics , vol.44 , pp. 110-119
    • Fligner, M.A.1    Verducci, J.S.2    Blower, P.E.3
  • 20
    • 0038056137
    • Binary vector dissimilarity measures for handwriting identification
    • In: Proceedings of SPIE-IS&T Electronic Imaging
    • Zhang B, Srihari SN. Binary vector dissimilarity measures for handwriting identification. In: Proceedings of SPIE-IS&T Electronic Imaging, vol. 5010. 2003. p. 28-38.
    • (2003) Proceedings of SPIE-IS&T Electronic Imaging , vol.5010 , pp. 28-38
    • Zhang, B.1    Srihari, S.N.2
  • 22
    • 13444282600
    • Similarity coefficients for molecular markers in studies of genetic relationships between individuals for haploid, diploid, and polyploid species
    • Kosman E, Leonard KJ. Similarity coefficients for molecular markers in studies of genetic relationships between individuals for haploid, diploid, and polyploid species. Mol Ecol. 2005;14(2):415-24.
    • (2005) Mol Ecol , vol.14 , Issue.2 , pp. 415-424
    • Kosman, E.1    Leonard, K.J.2
  • 24
    • 84870022937
    • Similarity coefficients for binary chemoinformatics data: Overview and extended comparison using simulated and real data sets
    • Todeschini R, Consonni V, Xiang H, Holliday J, Buscema M, Willett P. Similarity coefficients for binary chemoinformatics data: Overview and extended comparison using simulated and real data sets. J Chem Inf Model. 2012;52:2884-901.
    • (2012) J Chem Inf Model , vol.52 , pp. 2884-2901
    • Todeschini, R.1    Consonni, V.2    Xiang, H.3    Holliday, J.4    Buscema, M.5    Willett, P.6
  • 26
    • 85002750024
    • Seminar nasional dan pameran industri Jamu
    • Seminar nasional dan pameran industri Jamu [ http://seminar.ift.or.id/seminar-jamu-brand-indonesia/ ]. Accessed 19 Aug 2014.
  • 29
    • 2342577533
    • Comparison of similarity coefficients used for cluster analysis with dominant markers in maize (Zea mays L)
    • da Silva MA, Garcia AAF, Pereira de Souza A, Lopes de Souza C. Comparison of similarity coefficients used for cluster analysis with dominant markers in maize (Zea mays L). Genet Mol Biol. 2004;27:83-91.
    • (2004) Genet Mol Biol , vol.27 , pp. 83-91
    • Silva, M.A.1    Garcia, A.A.F.2    Pereira de Souza, A.3    Lopes de Souza, C.4
  • 30
    • 29644438050
    • Statistical comparisons of classifiers over multiple data sets
    • Demšar J. Statistical comparisons of classifiers over multiple data sets. J Mach Learn Res. 2006;7:1-30.
    • (2006) J Mach Learn Res , vol.7 , pp. 1-30
    • Demšar, J.1
  • 31
    • 0034274591
    • A comparison of prediction accuracy, complexity, and training time of thirty three old and new classification algorithms
    • Lim T, Loh W, Shih Y. A comparison of prediction accuracy, complexity, and training time of thirty three old and new classification algorithms. Mach Learn. 2000;40:203-29.
    • (2000) Mach Learn , vol.40 , pp. 203-229
    • Lim, T.1    Loh, W.2    Shih, Y.3
  • 32
    • 0018079655
    • Basic principles of ROC analysis
    • Metz CE. Basic principles of ROC analysis. Semin Nucl Med. 1978;8:283-98.
    • (1978) Semin Nucl Med , vol.8 , pp. 283-298
    • Metz, C.E.1
  • 34
    • 36348988873
    • Foundations of statistical natural language processing
    • Manning CD, Schütze H. Foundations of statistical natural language processing. Cambridge: MIT Press; 1999.
    • (1999) Cambridge: MIT Press
    • Manning, C.D.1    Schütze, H.2
  • 35
    • 84973587732
    • A coefficient of agreement for nominal scales
    • Cohen J. A coefficient of agreement for nominal scales. Educ Psychol Meas. 1960;20:37-46.
    • (1960) Educ Psychol Meas , vol.20 , pp. 37-46
    • Cohen, J.1
  • 36
    • 34548677479
    • A lot of randomness is hiding in accuracy
    • Ben-David A. A lot of randomness is hiding in accuracy. Eng Appl Artif Intell. 2007;20:875-85.
    • (2007) Eng Appl Artif Intell , vol.20 , pp. 875-885
    • Ben-David, A.1
  • 37
    • 46149086179
    • About the relationship between ROC curves and Cohen's kappa
    • Ben-David A. About the relationship between ROC curves and Cohen's kappa. Eng Appl Artif Intell. 2008;21:874-82.
    • (2008) Eng Appl Artif Intell , vol.21 , pp. 874-882
    • Ben-David, A.1
  • 38
    • 85002850973
    • Genes and diseases
    • Genes and diseases [ http://www.ncbi.nlm.nih.gov/books/NBK22185/ ]. Accessed 20 May 2016.
    • (2016)
  • 41
    • 0024880562
    • Similarity coefficients: Measures of co-occurrence and association or simply measures of occurrence?
    • Jackson DA, Somers KM, Harvey HH. Similarity coefficients: Measures of co-occurrence and association or simply measures of occurrence? Am Nat. 1989;133:436-53.
    • (1989) Am Nat , vol.133 , pp. 436-453
    • Jackson, D.A.1    Somers, K.M.2    Harvey, H.H.3
  • 42
    • 75649088900
    • Comparison of similarity coefficients used for cluster analysis with amplified fragment length polymorphism markers in the silkworm, Bombyx mori
    • Dalirsefat SB, da Silva MA, Mirhoseini SZ. Comparison of similarity coefficients used for cluster analysis with amplified fragment length polymorphism markers in the silkworm, Bombyx mori. J Insect Sci. 2009;9:1-8.
    • (2009) J Insect Sci , vol.9 , pp. 1-8
    • Dalirsefat, S.B.1    Silva, M.A.2    Mirhoseini, S.Z.3
  • 43
    • 84980090975
    • The distribution of the flora in the alpine zone
    • Jaccard P. The distribution of the flora in the alpine zone. New Phytol. 1912;11:37-50.
    • (1912) New Phytol , vol.11 , pp. 37-50
    • Jaccard, P.1
  • 44
    • 0000250265
    • Measures of the amount of ecologic association between species
    • Dice LR. Measures of the amount of ecologic association between species. Ecology. 1945;26:297-302.
    • (1945) Ecology , vol.26 , pp. 297-302
    • Dice, L.R.1
  • 45
    • 0020451042
    • Coefficients of association and similarity, based on binary (presence-absence) data: An evaluation
    • Hubalek Z. Coefficients of association and similarity, based on binary (presence-absence) data: An evaluation. Biol Rev. 1982;57:669-89.
    • (1982) Biol Rev , vol.57 , pp. 669-689
    • Hubalek, Z.1
  • 46
  • 47
    • 85002620272
    • Anomaly between Jaccard and Tanimoto coefficients
    • In: Proceedings of Student-Faculty Research Day, CSIS, Pace University
    • Cha S, Choi S, Tappert C. Anomaly between Jaccard and Tanimoto coefficients. In: Proceedings of Student-Faculty Research Day, CSIS, Pace University. 2009. p. 1-8.
    • (2009) , pp. 1-8
    • Cha, S.1    Choi, S.2    Tappert, C.3
  • 48
    • 84867357630
    • Enhancing Binary Feature Vector Similarity Measures
    • Cha S-H, Tappert CC, Yoon S. Enhancing Binary Feature Vector Similarity Measures. 2005.
    • (2005)
    • Cha, S-H.1    Tappert, C.C.2    Yoon, S.3
  • 49
    • 56149125428
    • Binary-Based Similarity Measures for Categorical Data and Their Application in Self-Organizing Maps
    • Lourenco F, Lobo V, Bacao F. Binary-Based Similarity Measures for Categorical Data and Their Application in Self-Organizing Maps. 2004.
    • (2004)
    • Lourenco, F.1    Lobo, V.2    Bacao, F.3
  • 50
    • 85052042694
    • Comparison of different proximity measures and classification methods for binary data
    • Ojurongbe TA. Comparison of different proximity measures and classification methods for binary data. Faculty of Agricultural Sciences, Nutritional Sciences and Environmental Management, Justus Liebig University Gießen; 2012.
    • (2012) Justus Liebig University Gießen
    • Ojurongbe, T.A.1
  • 51
    • 0014129195
    • Hierarchical clustering schemes
    • Johnson SC. Hierarchical clustering schemes. Psychometrika. 1967;32:241-54.
    • (1967) Psychometrika , vol.32 , pp. 241-254
    • Johnson, S.C.1
  • 52
    • 0006041829
    • Marine ecology and the coefficient of association: A plea in behalf of quantitative biology
    • Michael EL. Marine ecology and the coefficient of association: A plea in behalf of quantitative biology. J Ecol. 1920;8:54-9.
    • (1920) J Ecol , vol.8 , pp. 54-59
    • Michael, E.L.1
  • 53
    • 0343178427
    • The association factor in information retrieval
    • Stiles HE. The association factor in information retrieval. J ACM. 1961;8(2):271-9.
    • (1961) J ACM , vol.8 , Issue.2 , pp. 271-279
    • Stiles, H.E.1
  • 54
    • 0005814023
    • Mathematical model for studying genetic variation in terms of restriction endonucleases
    • Nei M, Li W-H. Mathematical model for studying genetic variation in terms of restriction endonucleases. Proc Natl Acad Sci U S A. 1979;76:5269-73.
    • (1979) Proc Natl Acad Sci U S A , vol.76 , pp. 5269-5273
    • Nei, M.1    Li, W.-H.2
  • 55
    • 0036249270
    • Grouping of coefficients for the calculation of inter-molecular similarity and dissimilarity using 2D fragment bit-strings
    • Holliday JD, Hu C-Y, Willett P. Grouping of coefficients for the calculation of inter-molecular similarity and dissimilarity using 2D fragment bit-strings. Comb Chem High Throughput Screen. 2002;5:155-66.
    • (2002) Comb Chem High Throughput Screen , vol.5 , pp. 155-166
    • Holliday, J.D.1    Hu, C.-Y.2    Willett, P.3
  • 56
    • 0035655185
    • Choosing the best similarity index when performing fuzzy set ordination on binary data
    • Boyce RL, Ellison PC. Choosing the best similarity index when performing fuzzy set ordination on binary data. J Veg Sci. 2001;12:711-20.
    • (2001) J Veg Sci , vol.12 , pp. 711-720
    • Boyce, R.L.1    Ellison, P.C.2
  • 57
    • 0020895473
    • Asymmetric binary similarity measures
    • Faith DP. Asymmetric binary similarity measures. Oecologia. 1983;57:287-90.
    • (1983) Oecologia , vol.57 , pp. 287-290
    • Faith, D.P.1
  • 58
    • 33747044600
    • Metric and Euclidean properties of dissimilarity coefficients
    • Gower JC, Legendre P. Metric and Euclidean properties of dissimilarity coefficients. J Classif. 1986;3:5-48.
    • (1986) J Classif , vol.3 , pp. 5-48
    • Gower, J.C.1    Legendre, P.2
  • 59
    • 0037396953
    • Distance-preserving mappings from binary vectors to permutations
    • Chang J, Chen R, Tsai S. Distance-preserving mappings from binary vectors to permutations. IEEE Trans Inf Theory. 2003;49:1054-9.
    • (2003) IEEE Trans Inf Theory , vol.49 , pp. 1054-1059
    • Chang, J.1    Chen, R.2    Tsai, S.3
  • 60
    • 0001939623
    • Computer Programs for Hierarchical Polythetic Classification ("Similarity Analyses")
    • Lance GN, Williams WT. Computer Programs for Hierarchical Polythetic Classification ("Similarity Analyses"). Comput J. 1966;9:60-4.
    • (1966) Comput J , vol.9 , pp. 60-64
    • Lance, G.N.1    Williams, W.T.2
  • 63
    • 0031192257
    • Clustering by competitive agglomeration
    • Frigui H, Krishnapuram R. Clustering by competitive agglomeration. Pattern Recognit. 1997;30:1109-19.
    • (1997) Pattern Recognit , vol.30 , pp. 1109-1119
    • Frigui, H.1    Krishnapuram, R.2
  • 65
    • 0037399775
    • Cluster validation techniques for genome expression data
    • Bolshakova N, Azuaje F. Cluster validation techniques for genome expression data. Signal Process. 2003;83:825-33.
    • (2003) Signal Process , vol.83 , pp. 825-833
    • Bolshakova, N.1    Azuaje, F.2
  • 66
    • 80054707993
    • Hierarchical clustering with prototypes via minimax linkage
    • Bien J, Tibshirani R. Hierarchical clustering with prototypes via minimax linkage. J Am Stat Assoc. 2011;106(495):1075-84.
    • (2011) J Am Stat Assoc , vol.106 , Issue.495 , pp. 1075-1084
    • Bien, J.1    Tibshirani, R.2
  • 67
    • 43149103140
    • ROC analysis: Applications to the classification of biological sequences and 3D structures
    • Sonego P, Kocsor A, Pongor S. ROC analysis: Applications to the classification of biological sequences and 3D structures. Brief Bioinform. 2008;9:198-209.
    • (2008) Brief Bioinform , vol.9 , pp. 198-209
    • Sonego, P.1    Kocsor, A.2    Pongor, S.3
  • 68
    • 33646023117
    • An introduction to ROC analysis
    • Fawcett T. An introduction to ROC analysis. Pattern Recognit Lett. 2006;27:861-74.
    • (2006) Pattern Recognit Lett , vol.27 , pp. 861-874
    • Fawcett, T.1
  • 69
    • 54349096842
    • Modifying the DPClus algorithm for identifying protein complexes based on new topological structures
    • Li M, Chen J, Wang J, Hu B, Chen G. Modifying the DPClus algorithm for identifying protein complexes based on new topological structures. BMC Bioinformatics. 2008;9:1-16.
    • (2008) BMC Bioinformatics , vol.9 , pp. 1-16
    • Li, M.1    Chen, J.2    Wang, J.3    Hu, B.4    Chen, G.5
  • 73
    • 34250173088
    • Investigating diversity of clustering methods: an empirical comparison
    • Gelbard R, Goldman O, Spiegler I. Investigating diversity of clustering methods: an empirical comparison. Data Knowl Eng. 2007;63:155-66.
    • (2007) Data Knowl Eng , vol.63 , pp. 155-166
    • Gelbard, R.1    Goldman, O.2    Spiegler, I.3


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.