Volume 40, Issue 1, 2013, Pages 200-210

A comparative study of efficient initialization methods for the k-means clustering algorithm

Author keywords

Cluster center initialization; k-means; Partitional clustering; Sum of squared error criterion

Indexed keywords

CLUSTER CENTERS; COMPARATIVE STUDIES; DATA SETS; GRADIENT DESCENT; INITIALIZATION METHODS; K-MEANS; K-MEANS CLUSTERING ALGORITHM; LINEAR TIME COMPLEXITY; NON-PARAMETRIC STATISTICAL TESTS; PARTITIONAL CLUSTERING; PARTITIONAL CLUSTERING ALGORITHM; PERFORMANCE CRITERION; SUM OF SQUARED ERRORS;

EID: 84866127615     PISSN: 09574174     EISSN: None     Source Type: Journal    
DOI: 10.1016/j.eswa.2012.07.021     Document Type: Article
Times cited: 1002

References (76)
  • 2
    • 0030303747
    • New methods for the initialisation of clusters
    • M. Al-Daoud, and S. Roberts New methods for the initialisation of clusters Pattern Recognition Letters 17 5 1996 451 455
    • (1996) Pattern Recognition Letters , vol.17 , Issue.5 , pp. 451-455
    • Al-Daoud, M.1    Roberts, S.2
  • 3
    • 67649088034
    • Robust partitional clustering by outlier and density insensitive seeding
    • M. Al Hasan, V. Chaoji, S. Salem, and M. Zaki Robust partitional clustering by outlier and density insensitive seeding Pattern Recognition Letters 30 11 2009 994 1002
    • (2009) Pattern Recognition Letters , vol.30 , Issue.11 , pp. 994-1002
    • Al Hasan, M.1    Chaoji, V.2    Salem, S.3    Zaki, M.4
  • 4
    • 62249143532
    • NP-hardness of euclidean sum-of-squares clustering
    • D. Aloise, A. Deshpande, P. Hansen, and P. Popat NP-hardness of euclidean sum-of-squares clustering Machine Learning 75 2 2009 245 248
    • (2009) Machine Learning , vol.75 , Issue.2 , pp. 245-248
    • Aloise, D.1    Deshpande, A.2    Hansen, P.3    Popat, P.4
  • 5
    • 84857630722
    • An improved column generation algorithm for minimum sum-of-squares clustering
    • D. Aloise, P. Hansen, and L. Liberti An improved column generation algorithm for minimum sum-of-squares clustering Mathematical Programming 2010 1 26
    • (2010) Mathematical Programming , pp. 1-26
    • Aloise, D.1    Hansen, P.2    Liberti, L.3
  • 9
    • 38049103082
    • A near-optimal initial seed value selection in k-means algorithm using a genetic algorithm
    • G.P. Babu, and M.N. Murty A near-optimal initial seed value selection in k-means algorithm using a genetic algorithm Pattern Recognition Letters 14 10 1993 763 769
    • (1993) Pattern Recognition Letters , vol.14 , Issue.10 , pp. 763-769
    • Babu, G.P.1    Murty, M.N.2
  • 10
    • 33747785233
    • Simulated annealing for selecting optimal initial seeds in the k-means algorithm
    • G. Babu, and M. Murty Simulated annealing for selecting optimal initial seeds in the k-means algorithm Indian Journal of Pure and Applied Mathematics 25 1-2 1994 85 94
    • (1994) Indian Journal of Pure and Applied Mathematics , vol.25 , Issue.1-2 , pp. 85-94
    • Babu, G.1    Murty, M.2
  • 11
    • 0014060964
    • A clustering technique for summarizing multivariate data
    • G.H. Ball, and D.J. Hall A clustering technique for summarizing multivariate data Behavioral Science 12 2 1967 153 155
    • (1967) Behavioral Science , vol.12 , Issue.2 , pp. 153-155
    • Ball, G.H.1    Hall, D.J.2
  • 12
    • 0022136084
    • An improvement of the minimum distortion encoding algorithm for vector quantization
    • C.D. Bei, and R.M. Gray An improvement of the minimum distortion encoding algorithm for vector quantization IEEE Transactions on Communications 33 10 1985 1132 1133
    • (1985) IEEE Transactions on Communications , vol.33 , Issue.10 , pp. 1132-1133
    • Bei, C.D.1    Gray, R.M.2
  • 17
    • 67349174255
    • An initialization method for the k-means algorithm using neighborhood model
    • F. Cao, J. Liang, and G. Jiang An initialization method for the k-means algorithm using neighborhood model Computers and Mathematics with Applications 58 3 2009 474 483
    • (2009) Computers and Mathematics with Applications , vol.58 , Issue.3 , pp. 474-483
    • Cao, F.1    Liang, J.2    Jiang, G.3
  • 18
    • 78650331556
    • Improving the performance of k-means for color quantization
    • M.E. Celebi Improving the performance of k-means for color quantization Image and Vision Computing 29 4 2011 260 271
    • (2011) Image and Vision Computing , vol.29 , Issue.4 , pp. 260-271
    • Celebi, M.E.1
  • 21
    • 84866132611
    • Georgia Institute of Technology
    • Cook, W. (2011). World TSP, Georgia Institute of Technology.
    • (2011) World TSP
    • Cook, W.1
  • 24
    • 0000014486
    • Cluster analysis of multivariate data: Efficiency vs. interpretability of classification
    • E. Forgy Cluster analysis of multivariate data: Efficiency vs. interpretability of classification Biometrics 21 1965 768
    • (1965) Biometrics , vol.21 , pp. 768
    • Forgy, E.1
  • 25
    • 78649934709
    • University of California, Irvine, School of Information and Computer Sciences
    • Frank, A., & Asuncion, A. (2011). UCI machine learning repository, University of California, Irvine, School of Information and Computer Sciences.
    • (2011) UCI Machine Learning Repository
    • Frank, A.1    Asuncion, A.2
  • 26
    • 84944811700
    • The use of ranks to avoid the assumption of normality implicit in the analysis of variance
    • M. Friedman The use of ranks to avoid the assumption of normality implicit in the analysis of variance Journal of the American Statistical Association 32 200 1937 675 701
    • (1937) Journal of the American Statistical Association , vol.32 , Issue.200 , pp. 675-701
    • Friedman, M.1
  • 27
    • 64549120231
    • A study of statistical techniques and performance measures for genetics-based machine learning: Accuracy and interpretability
    • S. Garcia, A. Fernandez, J. Luengo, and F. Herrera A study of statistical techniques and performance measures for genetics-based machine learning: Accuracy and interpretability Soft Computing 13 10 2009 959 977
    • (2009) Soft Computing , vol.13 , Issue.10 , pp. 959-977
    • Garcia, S.1    Fernandez, A.2    Luengo, J.3    Herrera, F.4
  • 28
    • 58149287952
    • An extension on "statistical comparisons of classifiers over multiple data sets" for all pairwise comparisons
    • S. Garcia, and F. Herrera An extension on "statistical comparisons of classifiers over multiple data sets" for all pairwise comparisons Journal of Machine Learning Research 9 2008 2677 2694
    • (2008) Journal of Machine Learning Research , vol.9 , pp. 2677-2694
    • Garcia, S.1    Herrera, F.2
  • 29
    • 0021938963
    • Clustering to minimize the maximum intercluster distance
    • T. Gonzalez Clustering to minimize the maximum intercluster distance Theoretical Computer Science 38 2-3 1985 293 306
    • (1985) Theoretical Computer Science , vol.38 , Issue.2-3 , pp. 293-306
    • Gonzalez, T.1
  • 33
    • 0002467254
    • Simplified calculation of principal components
    • H. Hotelling Simplified calculation of principal components Psychometrika 1 1 1936 27 35
    • (1936) Psychometrika , vol.1 , Issue.1 , pp. 27-35
    • Hotelling, H.1
  • 34
    • 0025488430
    • Fast encoding algorithm for VQ-based image coding
    • S.H. Huang, and S.H. Chen Fast encoding algorithm for VQ-based image coding Electronics Letters 26 19 1990 1618 1619
    • (1990) Electronics Letters , vol.26 , Issue.19 , pp. 1618-1619
    • Huang, S.H.1    Chen, S.H.2
  • 35
    • 0027271777
    • A comparison of several vector quantization codebook generation approaches
    • C.M. Huang, and R.W. Harris A comparison of several vector quantization codebook generation approaches IEEE Transactions on Image Processing 2 1 1993 108 112
    • (1993) IEEE Transactions on Image Processing , vol.2 , Issue.1 , pp. 108-112
    • Huang, C.M.1    Harris, R.W.2
  • 37
    • 0032629347
    • Fast and robust fixed-point algorithms for independent component analysis
    • A. Hyvärinen Fast and robust fixed-point algorithms for independent component analysis IEEE Transactions on Neural Networks 10 3 1999 626 634
    • (1999) IEEE Transactions on Neural Networks , vol.10 , Issue.3 , pp. 626-634
    • Hyvärinen, A.1
  • 39
    • 77950369345
    • Data clustering: 50 years beyond k-means
    • A.K. Jain Data clustering: 50 years beyond k-means Pattern Recognition Letters 31 8 2010 651 666
    • (2010) Pattern Recognition Letters , vol.31 , Issue.8 , pp. 651-666
    • Jain, A.K.1
  • 41
    • 84970541720
    • Multidimensional group analysis
    • R.C. Jancey Multidimensional group analysis Australian Journal of Botany 14 1 1966 127 130
    • (1966) Australian Journal of Botany , vol.14 , Issue.1 , pp. 127-130
    • Jancey, R.C.1
  • 45
    • 0003126321
    • A general theory of classificatory sorting strategies - II. Clustering systems
    • G.N. Lance, and W.T. Williams A general theory of classificatory sorting strategies - II. Clustering systems The Computer Journal 10 3 1967 271 277
    • (1967) The Computer Journal , vol.10 , Issue.3 , pp. 271-277
    • Lance, G.N.1    Williams, W.T.2
  • 46
    • 0036487280
    • The global k-means clustering algorithm
    • A. Likas, N. Vlassis, and J. Verbeek The global k-means clustering algorithm Pattern Recognition 36 2 2003 451 461
    • (2003) Pattern Recognition , vol.36 , Issue.2 , pp. 451-461
    • Likas, A.1    Vlassis, N.2    Verbeek, J.3
  • 49
    • 60249094201
    • A study on the use of statistical tests for experimentation with neural networks: Analysis of parametric test conditions and non-parametric tests
    • J. Luengo, S. Garcia, and F. Herrera A study on the use of statistical tests for experimentation with neural networks: Analysis of parametric test conditions and non-parametric tests Expert Systems with Applications 36 4 2009 7798 7808
    • (2009) Expert Systems with Applications , vol.36 , Issue.4 , pp. 7798-7808
    • Luengo, J.1    Garcia, S.2    Herrera, F.3
  • 50
    • 39949083377
    • Hierarchical initialization approach for k-means clustering
    • J.F. Lu, J.B. Tang, Z.M. Tang, and J.Y. Yang Hierarchical initialization approach for k-means clustering Pattern Recognition Letters 29 6 2008 787 795
    • (2008) Pattern Recognition Letters , vol.29 , Issue.6 , pp. 787-795
    • Lu, J.F.1    Tang, J.B.2    Tang, Z.M.3    Yang, J.Y.4
  • 53
    • 77956705120
    • Simulating data to study performance of finite mixture modeling and clustering algorithms
    • R. Maitra, and V. Melnykov Simulating data to study performance of finite mixture modeling and clustering algorithms Journal of Computational and Graphical Statistics 19 2 2010 354 376
    • (2010) Journal of Computational and Graphical Statistics , vol.19 , Issue.2 , pp. 354-376
    • Maitra, R.1    Melnykov, V.2
  • 54
    • 0029752880
    • A self-organizing network for hyperellipsoidal clustering (HEC)
    • J. Mao, and A.K. Jain A self-organizing network for hyperellipsoidal clustering (HEC) IEEE Transactions on Neural Networks 7 1 1996 16 29
    • (1996) IEEE Transactions on Neural Networks , vol.7 , Issue.1 , pp. 16-29
    • Mao, J.1    Jain, A.K.2
  • 55
    • 0031599142
    • Mersenne twister: A 623-dimensionally equidistributed uniform pseudo-random number generator
    • M. Matsumoto, and T. Nishimura Mersenne twister: A 623-dimensionally equidistributed uniform pseudo-random number generator ACM Transactions on Modeling and Computer Simulation 8 1 1998 3 30
    • (1998) ACM Transactions on Modeling and Computer Simulation , vol.8 , Issue.1 , pp. 3-30
    • Matsumoto, M.1    Nishimura, T.2
  • 56
    • 33947156744
    • Comparing clusterings - An information based distance
    • M. Meila Comparing clusterings - An information based distance Journal of Multivariate Analysis 98 5 2007 873 895
    • (2007) Journal of Multivariate Analysis , vol.98 , Issue.5 , pp. 873-895
    • Meila, M.1
  • 57
    • 33847457966
    • An examination of the effect of six types of error perturbation on fifteen clustering algorithms
    • G.W. Milligan An examination of the effect of six types of error perturbation on fifteen clustering algorithms Psychometrika 45 3 1980 325 342
    • (1980) Psychometrika , vol.45 , Issue.3 , pp. 325-342
    • Milligan, G.W.1
  • 58
    • 0000235019
    • A study of standardization of variables in cluster analysis
    • G. Milligan, and M.C. Cooper A study of standardization of variables in cluster analysis Journal of Classification 5 2 1988 181 204
    • (1988) Journal of Classification , vol.5 , Issue.2 , pp. 181-204
    • Milligan, G.1    Cooper, M.C.2
  • 60
    • 84858689830
    • Careful seeding method based on independent components analysis for k-means clustering
    • T. Onoda, M. Sakai, and S. Yamada Careful seeding method based on independent components analysis for k-means clustering Journal of Emerging Technologies in Web Intelligence 4 1 2012 51 59
    • (2012) Journal of Emerging Technologies in Web Intelligence , vol.4 , Issue.1 , pp. 51-59
    • Onoda, T.1    Sakai, M.2    Yamada, S.3
  • 62
  • 63
    • 0033204902
    • An empirical comparison of four initialization methods for the k-means algorithm
    • J.M. Pena, J.A. Lozano, and P. Larranaga An empirical comparison of four initialization methods for the k-means algorithm Pattern Recognition Letters 20 10 1999 1027 1040
    • (1999) Pattern Recognition Letters , vol.20 , Issue.10 , pp. 1027-1040
    • Pena, J.M.1    Lozano, J.A.2    Larranaga, P.3
  • 65
    • 33947106382
    • A method for initialising the k-means clustering algorithm using kd-trees
    • S.J. Redmond, and C. Heneghan A method for initialising the k-means clustering algorithm using kd-trees Pattern Recognition Letters 28 8 2007 965 973
    • (2007) Pattern Recognition Letters , vol.28 , Issue.8 , pp. 965-973
    • Redmond, S.J.1    Heneghan, C.2
  • 66
    • 63549149056
    • SAS Institute Inc. SAS Publishing
    • SAS Institute Inc., SAS/STAT 9.2 User's Guide, SAS Publishing, 2009.
    • (2009) SAS/STAT 9.2 User's Guide
  • 67
    • 0021202650
    • K-means-type algorithms: A generalized convergence theorem and characterization of local optimality
    • S.Z. Selim, and M.A. Ismail K-means-type algorithms: A generalized convergence theorem and characterization of local optimality IEEE Transactions on Pattern Analysis and Machine Intelligence 6 1 1984 81 87
    • (1984) IEEE Transactions on Pattern Analysis and Machine Intelligence , vol.6 , Issue.1 , pp. 81-87
    • Selim, S.Z.1    Ismail, M.A.2
  • 68
    • 85007280785
    • Computational experiences with the exchange method: Applied to four commonly used partitioning cluster analysis criteria
    • H. Späth Computational experiences with the exchange method: Applied to four commonly used partitioning cluster analysis criteria European Journal of Operational Research 1 1 1977 23 31
    • (1977) European Journal of Operational Research , vol.1 , Issue.1 , pp. 23-31
    • Späth, H.1
  • 69
    • 41649118940
    • In search of deterministic methods for initializing k-means and Gaussian mixture clustering
    • T. Su, and J.G. Dy In search of deterministic methods for initializing k-means and Gaussian mixture clustering Intelligent Data Analysis 11 4 2007 319 338
    • (2007) Intelligent Data Analysis , vol.11 , Issue.4 , pp. 319-338
    • Su, T.1    Dy, J.G.2
  • 70
    • 0142025118
    • A computational study of several relocation methods for k-means algorithms
    • A. Tarsitano A computational study of several relocation methods for k-means algorithms Pattern Recognition 36 12 2003 2955 2966
    • (2003) Pattern Recognition , vol.36 , Issue.12 , pp. 2955-2966
    • Tarsitano, A.1
  • 74
    • 58349088069
    • External validation measures for k-means clustering: A data distribution perspective
    • J. Wu, J. Chen, H. Xiong, and M. Xie External validation measures for k-means clustering: A data distribution perspective Expert Systems with Applications 36 3 2009 6050 6061
    • (2009) Expert Systems with Applications , vol.36 , Issue.3 , pp. 6050-6061
    • Wu, J.1    Chen, J.2    Xiong, H.3    Xie, M.4


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.