Volume 63, Issue 2, 2007, Pages 503-527

A k-mean clustering algorithm for mixed numeric and categorical data

Author keywords

Clustering; Co-occurrences; Cost function; Distance measure; k-Mean clustering; Significance of attributes

Indexed keywords

CLUSTERING PROCESSES; DISTANCE MEASURE; MEAN CLUSTERING;

EID: 34447330447     PISSN: 0169023X     EISSN: None     Source Type: Journal    
DOI: 10.1016/j.datak.2007.03.016     Document Type: Article
Times cited: 589

References (51)
  • 3
    • 80053419954
    • F. Can, E. Ozkarahan, A dynamic cluster maintenance system for information retrieval, in: Proceedings of the Tenth Annual International ACM SIGIR Conference, 1987, pp. 123-131.
  • 4
    • 0032441150
    • M. Eisen, P. Spellman, P. Brown, D. Botstein, Cluster analysis and display of genome-wide expression patterns, in: Proceedings of the National Academy of Sciences of the USA, vol. 95, 1998, pp. 14863-14868.
  • 7
    • 34447336642
    • J.B. MacQueen, Some methods for classification and analysis of multivariate observations, in: Proceedings of the 5th Berkeley Symposium on Mathematical Statistics and Probability, 1967, pp. 281-297.
  • 10
    • 34447328035
    • R. Ng, J. Han, Efficient and effective clustering method for spatial data mining, in: Proceedings of the 20th International Conference on Very Large Data Bases, Santiago, Chile, 1994, pp. 144-155.
  • 11
    • 27144536001
    • Extensions to the k-means algorithm for clustering large data sets with categorical values
    • Huang Z. Extensions to the k-means algorithm for clustering large data sets with categorical values. Data Mining and Knowledge Discovery 2 3 (1998)
    • (1998) Data Mining and Knowledge Discovery , vol.2 , Issue.3
    • Huang, Z.1
  • 12
    • 34447330954
    • M. Ester, H.-P. Kriegel, J. Sander, X. Xu, A density-based algorithm for discovering clusters in large spatial databases with noise, in: Proceedings of KDD'96, 1996.
  • 13
    • 22044455069
    • Density-based clustering in spatial databases: The algorithm GDBSCAN and its applications
    • Sander J., Ester M., Kriegel H.-P., and Xu X. Density-based clustering in spatial databases: The algorithm GDBSCAN and its applications. Data Mining and Knowledge Discovery 2 2 (1998) 169-194
    • (1998) Data Mining and Knowledge Discovery , vol.2 , Issue.2 , pp. 169-194
    • Sander, J.1    Ester, M.2    Kriegel, H.-P.3    Xu, X.4
  • 14
    • 0016046280
    • Some recent investigations of a new fuzzy partitional algorithm and its application to pattern classification problems
    • Dunn J.C. Some recent investigations of a new fuzzy partitional algorithm and its application to pattern classification problems. Journal of Cybernetics 4 (1974) 1-15
    • (1974) Journal of Cybernetics , vol.4 , pp. 1-15
    • Dunn, J.C.1
  • 16
    • 0032595161
    • A fuzzy k-modes algorithm for clustering categorical data
    • Huang Z., and Ng M.K. A fuzzy k-modes algorithm for clustering categorical data. IEEE Transactions on Fuzzy Systems 7 4 (1999) 446-452
    • (1999) IEEE Transactions on Fuzzy Systems , vol.7 , Issue.4 , pp. 446-452
    • Huang, Z.1    Ng, M.K.2
  • 17
    • 4544378530
    • C. Döring, C. Borgelt, R. Kruse, Fuzzy clustering of quantitative and qualitative data, in: Proceedings of NAFIPS, Banff, Alberta, 2004.
  • 18
    • 0343442766
    • Knowledge acquisition via incremental conceptual clustering
    • Fisher D.H. Knowledge acquisition via incremental conceptual clustering. Machine Learning 2 2 (1987) 139-172
    • (1987) Machine Learning , vol.2 , Issue.2 , pp. 139-172
    • Fisher, D.H.1
  • 19
    • 0000166613
    • Experiments with incremental concept formation
    • Lebowitz M. Experiments with incremental concept formation. Machine Learning 2 2 (1987) 103-138
    • (1987) Machine Learning , vol.2 , Issue.2 , pp. 103-138
    • Lebowitz, M.1
  • 20
    • 34447304312
    • M. Gluck, J. Corter, Information, uncertainty, and the utility of categories, in: Proceedings of the Seventh Annual Conference of the Cognitive Science Society, 1985, pp. 283-287.
  • 21
    • 34447339604
    • K. McKusick, K. Thompson, COBWEB/3: A portable implementation, Technical Report FIA-90-6-18-2, NASA Ames Research Center, 1990.
  • 22
    • 0002908586
    • The formation and use of abstract concepts in design
    • Fisher D.H., Pazzani M.J., and Langley P. (Eds), Morgan Kaufmann, Los Altos, Calif
    • Reich Y., and Fenves S.J. The formation and use of abstract concepts in design. In: Fisher D.H., Pazzani M.J., and Langley P. (Eds). Concept Formation: Knowledge and Experience in Unsupervised Learning (1991), Morgan Kaufmann, Los Altos, Calif 323-352
    • (1991) Concept Formation: Knowledge and Experience in Unsupervised Learning , pp. 323-352
    • Reich, Y.1    Fenves, S.J.2
  • 25
    • 85175741601
    • S. Guha, R. Rastogi, K. Shim, ROCK: A robust clustering algorithm for categorical attributes, in: Proceedings of 15th International Conference on Data Engineering, Sydney, Australia, 23-26 March 1999, pp. 512-521.
  • 26
    • 34447338286
    • V. Ganti, J.E. Gehrke, R. Ramakrishnan, CACTUS-clustering categorical data using summaries, in: Proceedings of the Fifth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 1999, pp. 73-83.
  • 27
    • 0042312608
    • Feature weighting in k-mean clustering
    • Modha D.S., and Spangler W.S. Feature weighting in k-mean clustering. Machine Learning 52 3 (2003) 217-237
    • (2003) Machine Learning , vol.52 , Issue.3 , pp. 217-237
    • Modha, D.S.1    Spangler, W.S.2
  • 28
    • 0030157145
    • T. Zhang, R. Ramakrishnan, M. Livny, BIRCH: An efficient data clustering method for very large databases, in: SIGMOD Conference, 1996, pp. 103-114.
  • 30
    • 0032091595
    • S. Guha, R. Rastogi, K. Shim, CURE: An efficient clustering algorithm for clustering large databases, in: Proceedings of the Symposium on Management of Data (SIGMOD), 1998.
  • 31
    • 0032686723
    • CHAMELEON: A hierarchical clustering algorithm using dynamic modeling
    • Karypis G., Han E.H., and Kumar V. CHAMELEON: A hierarchical clustering algorithm using dynamic modeling. IEEE Computer 32 8 (1999) 68-75
    • (1999) IEEE Computer , vol.32 , Issue.8 , pp. 68-75
    • Karypis, G.1    Han, E.H.2    Kumar, V.3
  • 32
    • 0001337675
    • A new similarity index based on probability
    • Goodall D.W. A new similarity index based on probability. Biometrics 22 (1966) 882-907
    • (1966) Biometrics , vol.22 , pp. 882-907
    • Goodall, D.W.1
  • 35
    • 34447333064
    • H. Luo, F. Kong, Y. Li, Clustering mixed data based on evidence accumulation, in: X. Li, O.R. Zaiane, Z. Li (Eds.), ADMA 2006, Lecture Notes in Artificial Intelligence 4093.
  • 36
    • 27844433509
    • Scalable algorithms for clustering large datasets with mixed type attributes
    • He Z., Xu X., and Deng S. Scalable algorithms for clustering large datasets with mixed type attributes. International Journal of Intelligent Systems 20 (2005) 1077-1089
    • (2005) International Journal of Intelligent Systems , vol.20 , pp. 1077-1089
    • He, Z.1    Xu, X.2    Deng, S.3
  • 37
    • 0036740348
    • Squeezer: An efficient algorithm for clustering categorical data
    • He Z., Xu X., and Deng S. Squeezer: An efficient algorithm for clustering categorical data. Journal of Computer Science and Technology 17 5 (2002) 611-624
    • (2002) Journal of Computer Science and Technology , vol.17 , Issue.5 , pp. 611-624
    • He, Z.1    Xu, X.2    Deng, S.3
  • 38
    • 0022909661
    • Toward memory-based reasoning
    • Stanfill C., and Waltz D. Toward memory-based reasoning. Communications of the ACM 29 12 (1986) 1213-1228
    • (1986) Communications of the ACM , vol.29 , Issue.12 , pp. 1213-1228
    • Stanfill, C.1    Waltz, D.2
  • 40
    • 35048857464
    • P. Andritsos, P. Tsaparas, R.J. Miller, K.C. Sevcik, LIMBO: Scalable clustering of categorical data, in: 9th International Conference on Extending DataBase Technology (EDBT), March 2004.
  • 41
    • 33750473714
    • A method to compute distance between two categorical values of same attribute in unsupervised learning for categorical data set
    • Ahmad A., and Dey L. A method to compute distance between two categorical values of same attribute in unsupervised learning for categorical data set. Pattern Recognition Letters 28 1 (2007) 110-118
    • (2007) Pattern Recognition Letters , vol.28 , Issue.1 , pp. 110-118
    • Ahmad, A.1    Dey, L.2
  • 42
    • 9644265275
    • A feature selection technique for classificatory analysis
    • Ahmad A., and Dey L. A feature selection technique for classificatory analysis. Pattern Recognition Letters 26 1 (2005) 43-56
    • (2005) Pattern Recognition Letters , vol.26 , Issue.1 , pp. 43-56
    • Ahmad, A.1    Dey, L.2
  • 43
    • 0032155316
    • Unsupervised feature selection using a neuro-fuzzy approach
    • Basak J., De R.K., and Pal S.K. Unsupervised feature selection using a neuro-fuzzy approach. Pattern Recognition Letters 19 (1998) 997-1006
    • (1998) Pattern Recognition Letters , vol.19 , pp. 997-1006
    • Basak, J.1    De, R.K.2    Pal, S.K.3
  • 46
    • 34447339350
    • A. Ahmad, L. Dey, A K-mean clustering algorithm for mixed numeric and categorical data set using dynamic distance measure, in: Proceedings of Fifth International Conference on Advances in Pattern Recognition, ICAPR2003, 2003.
  • 47
    • 17444410356
    • A k-populations algorithm for clustering categorical data
    • Won K.D., Lee K., Lee K.D., and Lee K.H. A k-populations algorithm for clustering categorical data. Pattern Recognition 38 7 (2005) 1131-1134
    • (2005) Pattern Recognition , vol.38 , Issue.7 , pp. 1131-1134
    • Won, K.D.1    Lee, K.2    Lee, K.D.3    Lee, K.H.4
  • 48
    • 0033204902
    • An empirical comparison of four initialization methods for the K-mean algorithm
    • Peña J.M., Lozano J.A., and Larrañaga P. An empirical comparison of four initialization methods for the K-mean algorithm. Pattern Recognition Letters 20 (1999) 1027-1040
    • (1999) Pattern Recognition Letters , vol.20 , pp. 1027-1040
    • Peña, J.M.1    Lozano, J.A.2    Larrañaga, P.3
  • 50
    • 23844528211
    • Cluster center initialization algorithm for K-mean clustering
    • Khan S.S., and Ahmad A. Cluster center initialization algorithm for K-mean clustering. Pattern Recognition Letters 25 (2004) 1293-1302
    • (2004) Pattern Recognition Letters , vol.25 , pp. 1293-1302
    • Khan, S.S.1    Ahmad, A.2
  • 51
    • 27144441097
    • An evaluation of statistical approaches to text categorization
    • Yang Y. An evaluation of statistical approaches to text categorization. Journal of Information Retrieval 1 1-2 (1999) 67-88
    • (1999) Journal of Information Retrieval , vol.1 , Issue.1-2 , pp. 67-88
    • Yang, Y.1


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.