2010, Pages 763-771

Document clustering via Dirichlet process mixture model with feature selection

Author keywords

Dirichlet process mixture model; Document clustering; Feature selection

Indexed keywords

DATA SETS; DIRICHLET PROCESS MIXTURE MODEL; DOCUMENT CLUSTERING; DOCUMENT COLLECTION; DOCUMENT DATASETS; FEATURE SELECTION; NUMBER OF CLUSTERS; STOCHASTIC SEARCH; VARIABLE SELECTION;

EID: 77956209448     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1145/1835804.1835901     Document Type: Conference Paper
Times cited: 50

References (30)
  • 1
    • 0000708831
    • Mixtures of Dirichlet processes with applications to Bayesian nonparametric problems
    • C. Antoniak. (1974). Mixtures of Dirichlet processes with applications to Bayesian nonparametric problems. The Annals of Statistics, 2(6):1152-1174.
    • (1974) The Annals of Statistics, vol. 2, issue 6, pp. 1152-1174
    • Antoniak, C.
  • 2
    • 0002617436
    • Ferguson distribution via Polya urn schemes
    • D. Blackwell and J. MacQueen. (1973). Ferguson distribution via Polya urn schemes. The Annals of Statistics, 1(2):353-355.
    • (1973) The Annals of Statistics, vol. 1, issue 2, pp. 353-355
    • Blackwell, D.; MacQueen, J.
  • 3
    • 84867186048
    • Variational inference for Dirichlet process mixtures
    • D. Blei and M. Jordan. (2006). Variational inference for Dirichlet process mixtures. Bayesian Analysis, 1(1):121-144.
    • (2006) Bayesian Analysis, vol. 1, issue 1, pp. 121-144
    • Blei, D.; Jordan, M.
  • 4
    • 0040979817
    • Determining the number of component clusters in the standard multivariate normal mixture model using model-selection criteria
    • University of Illinois, Chicago, IL
    • H. Bozdogan. (1983). Determining the number of component clusters in the standard multivariate normal mixture model using model-selection criteria. TR UIC/DQM/A83-1, Quantitative Methods Department, University of Illinois, Chicago, IL.
    • (1983) TR UIC/DQM/A83-1, Quantitative Methods Department
    • Bozdogan, H.
  • 7
    • 0034824884
    • Concept decompositions for large sparse text data using clustering
    • I. S. Dhillon and D. S. Modha. (2001). Concept decompositions for large sparse text data using clustering. Machine Learning, 42(1):143-175.
    • (2001) Machine Learning, vol. 42, issue 1, pp. 143-175
    • Dhillon, I.S.; Modha, D.S.
  • 8
    • 0038791853
    • An information-theoretic external cluster-validity measure
    • B. E. Dom. (2001). An information-theoretic external cluster-validity measure. Research Report RJ 10219, IBM.
    • (2001) Research Report RJ 10219, IBM
    • Dom, B.E.
  • 9
    • 33749257142
    • Clustering documents with an exponential-family approximation of the Dirichlet compound multinomial distribution
    • C. Elkan. (2006). Clustering Documents with an Exponential-Family Approximation of the Dirichlet Compound Multinomial Distribution. In Proceedings of the 23rd International Conference on Machine Learning, 289-296.
    • (2006) Proceedings of the 23rd International Conference on Machine Learning, pp. 289-296
    • Elkan, C.
  • 10
    • 0001120413
    • A Bayesian analysis of some nonparametric problems
    • T. Ferguson. (1973). A Bayesian analysis of some nonparametric problems. The Annals of Statistics, 1:209-230.
    • (1973) The Annals of Statistics, vol. 1, pp. 209-230
    • Ferguson, T.
  • 11
    • 0032269108
    • How many clusters? Which clustering method? Answers via model-based cluster analysis
    • C. Fraley and A. E. Raftery. (1998). How many clusters? Which clustering method? Answers via model-based cluster analysis. The Computer Journal, 41(8):578-588.
    • (1998) The Computer Journal, vol. 41, issue 8, pp. 578-588
    • Fraley, C.; Raftery, A.E.
  • 13
    • 0035531242
    • Modelling heterogeneity with and without the Dirichlet process
    • P. J. Green and S. M. Richardson. (2001). Modelling Heterogeneity with and without the Dirichlet Process. Scandinavian Journal of Statistics, 28:355-377.
    • (2001) Scandinavian Journal of Statistics, vol. 28, pp. 355-377
    • Green, P.J.; Richardson, S.M.
  • 15
    • 0036623091
    • Exact and approximate sum-representations for the Dirichlet process
    • H. Ishwaran and M. Zarepour. (2002). Exact and Approximate Sum-Representations for the Dirichlet Process. Canadian Journal of Statistics, 30:269-283.
    • (2002) Canadian Journal of Statistics, vol. 30, pp. 269-283
    • Ishwaran, H.; Zarepour, M.
  • 16
    • 33845734547
    • Variable selection in clustering via Dirichlet process mixture models
    • S. Kim. (2006). Variable selection in clustering via Dirichlet process mixture models. Biometrika, 93(4):877-893.
    • (2006) Biometrika, vol. 93, issue 4, pp. 877-893
    • Kim, S.
  • 21
    • 77950032550
    • Markov chain sampling methods for Dirichlet process mixture models
    • R. Neal. (2000). Markov chain sampling methods for Dirichlet process mixture models. Journal of Computational and Graphical Statistics, 9(2):249-265.
    • (2000) Journal of Computational and Graphical Statistics, vol. 9, issue 2, pp. 249-265
    • Neal, R.
  • 22
    • 0033886806
    • Text classification from labeled and unlabeled documents using EM
    • K. Nigam, A. K. McCallum, S. Thrun, and T. M. Mitchell. (2000). Text classification from labeled and unlabeled documents using EM. Machine Learning, 39(2/3):103-134.
    • (2000) Machine Learning, vol. 39, issue 2-3, pp. 103-134
    • Nigam, K.; McCallum, A.K.; Thrun, S.; Mitchell, T.M.
  • 23
    • 0018015137
    • Modeling by shortest data description
    • J. Rissanen. (1978). Modeling by shortest data description. Automatica, 14:465-471.
    • (1978) Automatica, vol. 14, pp. 465-471
    • Rissanen, J.
  • 24
    • 0032202775
    • Deterministic annealing for clustering, compression, classification, regression, and related optimization problems
    • K. Rose. (1998). Deterministic annealing for clustering, compression, classification, regression, and related optimization problems. In Proceedings of the IEEE, 86(11):2210-2239.
    • (1998) Proceedings of the IEEE, vol. 86, issue 11, pp. 2210-2239
    • Rose, K.
  • 25
    • 0000120766
    • Estimating the dimension of a model
    • G. Schwarz. (1978). Estimating the dimension of a model. The Annals of Statistics, 6:461-464.
    • (1978) The Annals of Statistics, vol. 6, pp. 461-464
    • Schwarz, G.
  • 26
    • 33749013037
    • Semi-supervised model-based document clustering: A comparative study
    • Z. Shi. (2006). Semi-supervised model-based document clustering: A comparative study. Machine Learning, 65(1):3-29.
    • (2006) Machine Learning, vol. 65, issue 1, pp. 3-29
    • Shi, Z.
  • 27
    • 2642528997
    • Model selection for probabilistic clustering using cross-validated likelihood
    • P. Smyth. (1998). Model selection for probabilistic clustering using cross-validated likelihood. ICS Tech Report 98-09, Statistics and Computing.
    • (1998) ICS Tech Report 98-09, Statistics and Computing
    • Smyth, P.


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.