



Volume 4, Issue 1, 2011, Pages 11-29

A general framework for efficient clustering of large datasets based on activity detection

Author keywords

Activity detection; Clustering; Hierarchical clustering; K Means; Kd tree

Indexed keywords

APPROXIMATION ALGORITHMS; CLUSTERING ALGORITHMS; DATA MINING;

EID: 79551705053     PISSN: 19321872     EISSN: 19321864     Source Type: Journal    
DOI: 10.1002/sam.10097     Document Type: Article
Times cited : (8)

References (56)
  • 1
    • 85196054804
    • GAD: General activity detection for fast clustering on large data, In 2009 SIAM International Conference on Data Mining (SDM'09), Sparks, NV, USA
    • X.Jin, S.Kim, J.Han, L.Kao, and Z.Yin, GAD: General activity detection for fast clustering on large data, In 2009 SIAM International Conference on Data Mining (SDM'09), Sparks, NV, USA, 2009, 2-13.
    • (2009) , pp. 2-13
    • Jin, X.1    Kim, S.2    Han, J.3    Kao, L.4    Yin, Z.5
  • 2
    • 0003585297
    • Data Mining: Concepts and Techniques
    • (2nd ed.), Morgan Kaufmann Publishers, Burlington, MA,
    • J.Han and M.Kamber, Data Mining: Concepts and Techniques, (2nd ed.), Morgan Kaufmann Publishers, Burlington, MA, 2006.
    • (2006)
    • Han, J.1    Kamber, M.2
  • 4
    • 0001457509
    • Proceedings of the fifth Berkeley Symposium on Mathematical Statistics and Probability
    • Some methods for classification and analysis of multivariate observations, L. M.Le Cam and J.Neyman, eds., University of California Press, Berkeley, CA
    • J. B.MacQueen, Some methods for classification and analysis of multivariate observations, In Proceedings of the fifth Berkeley Symposium on Mathematical Statistics and Probability, vol. 1, L. M.Le Cam, and J.Neyman, eds. University of California Press, Berkeley, CA, 1967, 281-297.
    • (1967) , vol.1 , pp. 281-297
    • MacQueen, J.B.1    Le Cam, L.M.2    Neyman, J.3
  • 5
    • 0003430544
    • Finding Groups in Data: An Introduction to Cluster Analysis
    • John Wiley & Sons, Hoboken, NJ,
    • L.Kaufman and P. J.Rousseeuw, Finding Groups in Data: An Introduction to Cluster Analysis, John Wiley & Sons, Hoboken, NJ, 1990.
    • (1990)
    • Kaufman, L.1    Rousseeuw, P.J.2
  • 6
    • 85196084103
    • VLDB'94: Proceedings of the 20th International Conference on Very Large Data Bases
    • Efficient and effective clustering methods for spatial data mining, San Francisco, CA, Morgan Kaufmann Publishers Inc.
    • R. T.Ng and J.Han, Efficient and effective clustering methods for spatial data mining, VLDB'94: Proceedings of the 20th International Conference on Very Large Data Bases, San Francisco, CA, Morgan Kaufmann Publishers Inc., 1994, 144-155.
    • (1994) , pp. 144-155
    • Ng, R.T.1    Han, J.2
  • 9
    • 34548583274
    • A tutorial on spectral clustering
    • U.von Luxburg, A tutorial on spectral clustering, Statistics and Computing 17(4) (2007), 395-416.
    • (2007) Statistics and Computing , vol.17 , pp. 395-416
    • von Luxburg, U.1
  • 10
    • 0030157145
    • BIRCH: An efficient data clustering method for very large databases, Proceedings of the 1996 ACM SIGMOD International Conference on Management of Data, ACM, New York, NY
    • T.Zhang, R.Ramakrishnan, and M.Livny, BIRCH: An efficient data clustering method for very large databases, Proceedings of the 1996 ACM SIGMOD International Conference on Management of Data, ACM, New York, NY, 1996, 103-114.
    • (1996) , pp. 103-114
    • Zhang, T.1    Ramakrishnan, R.2    Livny, M.3
  • 11
    • 85196057555
    • ROCK: a robust clustering algorithm for categorical attributes, ICDE'99: Proceedings of the 15th International Conference on Data Engineering, Washington, DC, IEEE Computer Society,
    • S.Guha, R.Rastogi, and K.Shim, ROCK: a robust clustering algorithm for categorical attributes, ICDE'99: Proceedings of the 15th International Conference on Data Engineering, Washington, DC, IEEE Computer Society, 1999.
    • (1999)
    • Guha, S.1    Rastogi, R.2    Shim, K.3
  • 12
    • 0032686723
    • Chameleon: hierarchical clustering using dynamic modeling
    • G.Karypis, E.-H.(Sam) Han, and V.Kumar, Chameleon: hierarchical clustering using dynamic modeling, Computer 32(8) (1999), 68-75.
    • (1999) Computer , vol.32 , pp. 68-75
    • Karypis, G.1    Han, E.-H.2    Kumar, V.3
  • 13
    • 85170282443
    • A density-based algorithm for discovering clusters in large spatial databases with noise, In Proceedings of the Second International Conference on Knowledge Discovery and Data Mining (KDD-96)
    • M.Ester, H.-P.Kriegel, J.Sander, and X.Xu, A density-based algorithm for discovering clusters in large spatial databases with noise, In Proceedings of the Second International Conference on Knowledge Discovery and Data Mining (KDD-96), 1996, 226-231.
    • (1996) , pp. 226-231
    • Ester, M.1    Kriegel, H.-P.2    Sander, J.3    Xu, X.4
  • 14
    • 0347172110
    • OPTICS: ordering points to identify the clustering structure
    • M.Ankerst, M. M.Breunig, H.-P.Kriegel, and J.Sander, OPTICS: ordering points to identify the clustering structure, SIGMOD Record 28(2) (1999), 49-60.
    • (1999) SIGMOD Record , vol.28 , pp. 49-60
    • Ankerst, M.1    Breunig, M.M.2    Kriegel, H.-P.3    Sander, J.4
  • 15
    • 85140527321
    • An efficient approach to clustering in large multimedia databases with noise, In Proceedings of the Fourth International Conference on Knowledge Discovery and Data Mining (KDD-98)
    • A.Hinneburg and D. A.Keim, An efficient approach to clustering in large multimedia databases with noise, In Proceedings of the Fourth International Conference on Knowledge Discovery and Data Mining (KDD-98), 1998, 58-65.
    • (1998) , pp. 58-65
    • Hinneburg, A.1    Keim, D.A.2
  • 16
    • 84994158589
    • STING: A statistical information grid approach to spatial data mining, VLDB'97: Proceedings of the 23rd International Conference on Very Large Data Bases, San Francisco, CA, Morgan Kaufmann Publishers Inc.
    • W.Wang, J.Yang, and R. R.Muntz, STING: A statistical information grid approach to spatial data mining, VLDB'97: Proceedings of the 23rd International Conference on Very Large Data Bases, San Francisco, CA, Morgan Kaufmann Publishers Inc., 1997, 186-195.
    • (1997) , pp. 186-195
    • Wang, W.1    Yang, J.2    Muntz, R.R.3
  • 17
    • 0032090765
    • Automatic subspace clustering of high dimensional data for data mining applications, In Proceedings of the 1998 ACM SIGMOD International Conference on Management of Data
    • R.Agrawal, J.Gehrke, D.Gunopulos, and P.Raghavan, Automatic subspace clustering of high dimensional data for data mining applications, In Proceedings of the 1998 ACM SIGMOD International Conference on Management of Data, 1998, 94-105.
    • (1998) , pp. 94-105
    • Agrawal, R.1    Gehrke, J.2    Gunopulos, D.3    Raghavan, P.4
  • 18
    • 0034133653
    • WaveCluster: a wavelet-based clustering approach for spatial data in very large databases
    • G.Sheikholeslami, S.Chatterjee, and A.Zhang, WaveCluster: a wavelet-based clustering approach for spatial data in very large databases, The VLDB Journal 8(3-4) (2000), 289-304.
    • (2000) The VLDB Journal , vol.8 , pp. 289-304
    • Sheikholeslami, G.1    Chatterjee, S.2    Zhang, A.3
  • 19
    • 85196099240
    • RankClus: integrating clustering with ranking for heterogeneous information network analysis, EDBT'09: Proceedings of the 12th International Conference on Extending Database Technology, New York, NY, ACM
    • Y.Sun, J.Han, P.Zhao, Z.Yin, H.Cheng, and T.Wu, RankClus: integrating clustering with ranking for heterogeneous information network analysis, EDBT'09: Proceedings of the 12th International Conference on Extending Database Technology, New York, NY, ACM, 2009, 565-576.
    • (2009) , pp. 565-576
    • Sun, Y.1    Han, J.2    Zhao, P.3    Yin, Z.4    Cheng, H.5    Wu, T.6
  • 20
    • 70350625449
    • Ranking-based clustering of heterogeneous information networks with star network schema, KDD'09: Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, New York, NY, ACM
    • Y.Sun, Y.Yu, and J.Han, Ranking-based clustering of heterogeneous information networks with star network schema, KDD'09: Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, New York, NY, ACM, 2009, 797-806.
    • (2009) , pp. 797-806
    • Sun, Y.1    Yu, Y.2    Han, J.3
  • 22
    • 36949010345
    • SCAN: a structural clustering algorithm for networks, KDD'07: Proceedings of the 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, New York, NY, ACM
    • X.Xu, N.Yuruk, Z.Feng, and T. A. J.Schweiger, SCAN: a structural clustering algorithm for networks, KDD'07: Proceedings of the 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, New York, NY, ACM, 2007, 824-833.
    • (2007) , pp. 824-833
    • Xu, X.1    Yuruk, N.2    Feng, Z.3    Schweiger, T.A.J.4
  • 25
    • 0022136084
    • An improvement of the minimum distortion encoding algorithm for vector quantization
    • C.-D.Bei and R. M.Gray, An improvement of the minimum distortion encoding algorithm for vector quantization, IEEE Transactions on Communications 33 (1985), 1132-1133.
    • (1985) IEEE Transactions on Communications , vol.33 , pp. 1132-1133
    • Bei, C.-D.1    Gray, R.M.2
  • 26
    • 0026237352
    • Fast algorithm for vq codebook design, In Proceedings Inst. Elect. Eng.
    • S.-H.Chen and W. M.Hsieh, Fast algorithm for vq codebook design, In Proceedings Inst. Elect. Eng., vol. 138, 1991, 357-362.
    • (1991) , vol.138 , pp. 357-362
    • Chen, S.-H.1    Hsieh, W.M.2
  • 27
    • 85196083524
    • Using the triangle inequality to accelerate K-Means, In Twentieth International Conference on Machine Learning (ICML'03)
    • F.Beil, M.Ester, and X.Xu, Using the triangle inequality to accelerate K-Means, In Twentieth International Conference on Machine Learning (ICML'03), 2003, 147-153.
    • (2003) , pp. 147-153
    • Beil, F.1    Ester, M.2    Xu, X.3
  • 28
    • 0027658829
    • A fast mean-distance-ordered partial codebook search algorithm for image vector quantization
    • S.-W.Ra and J.-K.Kim, A fast mean-distance-ordered partial codebook search algorithm for image vector quantization, IEEE Transactions on Circuits and Systems 40 (1993), 576-579.
    • (1993) IEEE Transactions on Circuits and Systems , vol.40 , pp. 576-579
    • Ra, S.-W.1    Kim, J.-K.2
  • 29
    • 0038192328
    • An efficient encoding algorithm for vector quantization based on subvector technique
    • J.-S.Pan, Z.-M.Lu, and S.-H.Sun, An efficient encoding algorithm for vector quantization based on subvector technique, IEEE Transactions on Image Processing 12(3) (2003), 265-270.
    • (2003) IEEE Transactions on Image Processing , vol.12 , pp. 265-270
    • Pan, J.-S.1    Lu, Z.-M.2    Sun, S.-H.3
  • 30
    • 0030413515
    • Two fast nearest neighbor searching algorithms for image vector quantization
    • S. C.Tai, C. C.Lai, and Y. C.Lin, Two fast nearest neighbor searching algorithms for image vector quantization, IEEE Transactions on Communications 44(12) (1996), 1623-1628.
    • (1996) IEEE Transactions on Communications , vol.44 , pp. 1623-1628
    • Tai, S.C.1    Lai, C.C.2    Lin, Y.C.3
  • 31
    • 9744228805
    • Fast-searching algorithm for vector quantization using projection and triangular inequality
    • J. Z. C.Lai and Y.-C.Liaw, Fast-searching algorithm for vector quantization using projection and triangular inequality, IEEE Transactions on Image Processing 13(12) (2004), 1554-1558.
    • (2004) IEEE Transactions on Image Processing , vol.13 , pp. 1554-1558
    • Lai, J.Z.C.1    Liaw, Y.-C.2
  • 32
    • 85196071420
    • Accelerating exact K-Means algorithms with geometric reasoning, Proceedings of KDD'99, New York, NY, ACM
    • D.Pelleg and A.Moore, Accelerating exact K-Means algorithms with geometric reasoning, Proceedings of KDD'99, New York, NY, ACM, 1999, 277-281.
    • (1999) , pp. 277-281
    • Pelleg, D.1    Moore, A.2
  • 33
    • 85196067691
    • Object retrieval with large vocabularies and fast spatial matching, In IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2007)
    • J.Philbin, O.Chum, M.Isard, J.Sivic, and A.Zisserman, Object retrieval with large vocabularies and fast spatial matching, In IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2007), 2007, 1-8.
    • (2007) , pp. 1-8
    • Philbin, J.1    Chum, O.2    Isard, M.3    Sivic, J.4    Zisserman, A.5
  • 34
    • 33845592987
    • Scalable recognition with a vocabulary tree, In IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2006)
    • D.Nister and H.Stewenius, Scalable recognition with a vocabulary tree, In IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2006), vol. 2, 2006, 2161-2168.
    • (2006) , vol.2 , pp. 2161-2168
    • Nister, D.1    Stewenius, H.2
  • 35
    • 34548049041
    • A fast vq codebook generation algorithm using codeword displacement
    • J. Z. C.Lai, Y.-C.Liaw, and J.Liu, A fast vq codebook generation algorithm using codeword displacement, Pattern Recognition 41(1) (2008), 315-319.
    • (2008) Pattern Recognition , vol.41 , pp. 315-319
    • Lai, J.Z.C.1    Liaw, Y.-C.2    Liu, J.3
  • 36
    • 0018999905
    • Multidimensional divide-and-conquer
    • J. L.Bentley, Multidimensional divide-and-conquer, Communications of the ACM 23(4) (1980), 214-229.
    • (1980) Communications of the ACM , vol.23 , pp. 214-229
    • Bentley, J.L.1
  • 37
    • 85196110801
    • Locality sensitive hashing scheme based on p-stable distribution, SCG'04: Proceedings of the Twentieth Annual Symposium on Computational Geometry, ACM Press, New York, NY
    • M.Datar, N.Immorlica, P.Indyk, and V. S.Mirrokni, Locality sensitive hashing scheme based on p-stable distribution, SCG'04: Proceedings of the Twentieth Annual Symposium on Computational Geometry, ACM Press, New York, NY, 2004, 253-262.
    • (2004) , pp. 253-262
    • Datar, M.1    Immorlica, N.2    Indyk, P.3    Mirrokni, V.S.4
  • 39
    • 0038391443
    • Bagging to improve the accuracy of a clustering procedure
    • S.Dudoit and J.Fridlyand, Bagging to improve the accuracy of a clustering procedure, Bioinformatics 19(9) (2003), 1090-1099.
    • (2003) Bioinformatics , vol.19 , pp. 1090-1099
    • Dudoit, S.1    Fridlyand, J.2
  • 41
    • 0041965980
    • Cluster ensembles - a knowledge reuse framework for combining multiple partitions
    • A.Strehl and J.Ghosh, Cluster ensembles - a knowledge reuse framework for combining multiple partitions, Journal of Machine Learning Research 3 (2002), 583-617.
    • (2002) Journal of Machine Learning Research , vol.3 , pp. 583-617
    • Strehl, A.1    Ghosh, J.2
  • 44
    • 85196059734
    • 80 million tiny images: a large dataset for non-parametric object and scene recognition, Technical Report MIT-CSAIL-TR-2007-024, MIT,
    • A.Torralba, R.Fergus, and W. T.Freeman, 80 million tiny images: a large dataset for non-parametric object and scene recognition, Technical Report MIT-CSAIL-TR-2007-024, MIT, 2007.
    • (2007)
    • Torralba, A.1    Fergus, R.2    Freeman, W.T.3
  • 45
    • 85196065942
    • 80 million tiny images. (accessed 2008).
    • A.Torralba, 80 million tiny images. 2008. (accessed 2008).
    • (2008)
    • Torralba, A.1
  • 46
    • 85196072605
    • KDDCUP04, Kddcup 04 biology dataset, (accessed 2008).
    • KDDCUP04, Kddcup 04 biology dataset, 2008. (accessed 2008).
    • (2008)
  • 47
    • 0003946510
    • Principal Component Analysis
    • Springer-Verlag, New York, NY,
    • I. T.Jolliffe, Principal Component Analysis, Springer-Verlag, New York, NY, 2002.
    • (2002)
    • Jolliffe, I.T.1
  • 49
    • 34250871625
    • Initializing K-Means batch clustering: a critical evaluation of several techniques
    • D.Steinley and M. J.Brusco, Initializing K-Means batch clustering: a critical evaluation of several techniques, Journal of Classification 24(1) (2007), 99-121.
    • (2007) Journal of Classification , vol.24 , pp. 99-121
    • Steinley, D.1    Brusco, M.J.2
  • 50
    • 85196099515
    • Stanford Artificial Intelligence Project Memorandum AIM-124
    • Speech analysis by clustering, or the hyperphoneme method, Stanford, CA, Stanford University,
    • M. M.Astrahan, Speech analysis by clustering, or the hyperphoneme method, Stanford Artificial Intelligence Project Memorandum AIM-124, Stanford, CA, Stanford University, 1970.
    • (1970)
    • Astrahan, M.M.1
  • 51
    • 33846586640
    • Clustering for Data Mining: A Data Recovery Approach
    • London, Chapman and Hall,
    • B.Mirkin, Clustering for Data Mining: A Data Recovery Approach, London, Chapman and Hall, 2005.
    • (2005)
    • Mirkin, B.1
  • 52
    • 16244421048
    • Another look at non-random methods for initializing K-Means clustering, In Proceedings of the 16th IEEE International Conference on Tools with Artificial Intelligence
    • T.Su and J. G.Dy, Another look at non-random methods for initializing K-Means clustering, In Proceedings of the 16th IEEE International Conference on Tools with Artificial Intelligence, 2004, 784-786.
    • (2004) , pp. 784-786
    • Su, T.1    Dy, J.G.2
  • 54
    • 0001820920
    • X-means: extending K-Means with efficient estimation of the number of clusters, ICML'00: Proceedings of the Seventeenth International Conference on Machine Learning, San Francisco, CA, Morgan Kaufmann Publishers Inc.
    • D.Pelleg and A. W.Moore, X-means: extending K-Means with efficient estimation of the number of clusters, ICML'00: Proceedings of the Seventeenth International Conference on Machine Learning, San Francisco, CA, Morgan Kaufmann Publishers Inc., 2000, 727-734.
    • (2000) , pp. 727-734
    • Pelleg, D.1    Moore, A.W.2
  • 55
    • 0000014486
    • Cluster analysis of multivariate data: efficiency vs interpretability of classifications
    • E. W.Forgy, Cluster analysis of multivariate data: efficiency vs interpretability of classifications, Biometrics 21 (1965), 768-769.
    • (1965) Biometrics , vol.21 , pp. 768-769
    • Forgy, E.W.1


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.