2014, Pages 233-242

A Dirichlet multinomial mixture model-based approach for short text clustering

Author keywords

Dirichlet multinomial mixture; Gibbs sampling; short text clustering

EID: 84907033074     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1145/2623330.2623715     Document Type: Conference Paper
Times cited: 546

References (30)
  • 1
    • 84949179803
    • A survey of text clustering algorithms
    • C. C. Aggarwal and C. Zhai. A survey of text clustering algorithms. In Mining Text Data, pages 77-128. 2012.
    • (2012) Mining Text Data , pp. 77-128
    • Aggarwal, C.C.1    Zhai, C.2
  • 2
    • 36448954984
    • Clustering short texts using Wikipedia
    • S. Banerjee, K. Ramanathan, and A. Gupta. Clustering short texts using Wikipedia. In SIGIR, pages 787-788, 2007.
    • (2007) SIGIR , pp. 787-788
    • Banerjee, S.1    Ramanathan, K.2    Gupta, A.3
  • 7
    • 33749257142
    • Clustering documents with an exponential-family approximation of the Dirichlet compound multinomial distribution
    • C. Elkan. Clustering documents with an exponential-family approximation of the Dirichlet compound multinomial distribution. In ICML, 2006.
    • (2006) ICML
    • Elkan, C.1
  • 8
    • 33847172327
    • Clustering by passing messages between data points
    • B. J. Frey and D. Dueck. Clustering by passing messages between data points. Science, 315(5814):972-976, 2007.
    • (2007) Science , vol.315 , Issue.5814 , pp. 972-976
    • Frey, B.J.1    Dueck, D.2
  • 10
    • 85026972772
    • Probabilistic latent semantic indexing
    • T. Hofmann. Probabilistic latent semantic indexing. In SIGIR, pages 50-57, 1999.
    • (1999) SIGIR , pp. 50-57
    • Hofmann, T.1
  • 11
    • 84897584095
    • Dirichlet process mixture model for document clustering with feature partition
    • R. Huang, G. Yu, Z. Wang, J. Zhang, and L. Shi. Dirichlet process mixture model for document clustering with feature partition. IEEE Trans. Knowl. Data Eng., 25(8):1748-1759, 2013.
    • (2013) IEEE Trans. Knowl. Data Eng. , vol.25 , Issue.8 , pp. 1748-1759
    • Huang, R.1    Yu, G.2    Wang, Z.3    Zhang, J.4    Shi, L.5
  • 13
    • 77950369345
    • Data clustering: 50 years beyond k-means
    • A. K. Jain. Data clustering: 50 years beyond k-means. Pattern Recognition Letters, 31(8):651-666, 2010.
    • (2010) Pattern Recognition Letters , vol.31 , Issue.8 , pp. 651-666
    • Jain, A.K.1
  • 14
    • 31844437086
    • Modeling word burstiness using the Dirichlet distribution
    • R. E. Madsen, D. Kauchak, and C. Elkan. Modeling word burstiness using the Dirichlet distribution. In ICML, pages 545-552, 2005.
    • (2005) ICML , pp. 545-552
    • Madsen, R.E.1    Kauchak, D.2    Elkan, C.3
  • 16
    • 0001673996
    • A comparison of event models for naive Bayes text classification
    • A. McCallum, K. Nigam, et al. A comparison of event models for naive Bayes text classification. In AAAI-98 workshop on learning for text categorization, volume 752, pages 41-48. Citeseer, 1998.
    • (1998) AAAI-98 Workshop on Learning for Text Categorization , vol.752 , pp. 41-48
    • McCallum, A.1    Nigam, K.2
  • 19
    • 78649420560
    • Information theoretic measures for clusterings comparison: Variants, properties, normalization and correction for chance
    • X. V. Nguyen, J. Epps, and J. Bailey. Information theoretic measures for clusterings comparison: Variants, properties, normalization and correction for chance. Journal of Machine Learning Research, 11:2837-2854, 2010.
    • (2010) Journal of Machine Learning Research , vol.11 , pp. 2837-2854
    • Nguyen, X.V.1    Epps, J.2    Bailey, J.3
  • 20
    • 0033886806
    • Text classification from labeled and unlabeled documents using EM
    • K. Nigam, A. McCallum, S. Thrun, and T. M. Mitchell. Text classification from labeled and unlabeled documents using EM. Machine Learning, 39(2/3):103-134, 2000.
    • (2000) Machine Learning , vol.39 , Issue.2-3 , pp. 103-134
    • Nigam, K.1    McCallum, A.2    Thrun, S.3    Mitchell, T.M.4
  • 21
    • 79955132812
    • Comparative study of clustering techniques for short text documents
    • A. Rangrej, S. Kulkarni, and A. V. Tendulkar. Comparative study of clustering techniques for short text documents. In WWW (Companion Volume), pages 111-112, 2011.
    • (2011) WWW (Companion Volume) , pp. 111-112
    • Rangrej, A.1    Kulkarni, S.2    Tendulkar, A.V.3
  • 23
    • 1942484786
    • Tackling the poor assumptions of naive Bayes text classifiers
    • J. D. Rennie, L. Shih, J. Teevan, and D. R. Karger. Tackling the poor assumptions of naive Bayes text classifiers. In ICML, pages 616-623, 2003.
    • (2003) ICML , pp. 616-623
    • Rennie, J.D.1    Shih, L.2    Teevan, J.3    Karger, D.R.4
  • 24
    • 80053369934
    • V-measure: A conditional entropy-based external cluster evaluation measure
    • A. Rosenberg and J. Hirschberg. V-measure: A conditional entropy-based external cluster evaluation measure. In EMNLP-CoNLL, pages 410-420, 2007.
    • (2007) EMNLP-CoNLL , pp. 410-420
    • Rosenberg, A.1    Hirschberg, J.2
  • 25
    • 0016572913
    • A vector space model for automatic indexing
    • G. Salton, A. Wong, and C. S. Yang. A vector space model for automatic indexing. Commun. ACM, 18(11):613-620, 1975.
    • (1975) Commun. ACM , vol.18 , Issue.11 , pp. 613-620
    • Salton, G.1    Wong, A.2    Yang, C.S.3
  • 26
    • 77954583359
    • Web-scale k-means clustering
    • D. Sculley. Web-scale k-means clustering. In WWW, pages 1177-1178, 2010.
    • (2010) WWW , pp. 1177-1178
    • Sculley, D.1
  • 27
    • 84883083690
    • Sumblr: Continuous summarization of evolving tweet streams
    • L. Shou, Z. Wang, K. Chen, and G. Chen. Sumblr: continuous summarization of evolving tweet streams. In SIGIR, pages 533-542, 2013.
    • (2013) SIGIR , pp. 533-542
    • Shou, L.1    Wang, Z.2    Chen, K.3    Chen, G.4
  • 28
    • 84900417810
    • Efficient clustering of short messages into general domains
    • O. Tsur, A. Littman, and A. Rappoport. Efficient clustering of short messages into general domains. In ICWSM, 2013.
    • (2013) ICWSM
    • Tsur, O.1    Littman, A.2    Rappoport, A.3
  • 29
    • 84907028990
    • Clustering microtext streams for event identification
    • J. Yin. Clustering microtext streams for event identification. In IJCNLP, pages 719-725, 2013.
    • (2013) IJCNLP , pp. 719-725
    • Yin, J.1
  • 30
    • 77956209448
    • Document clustering via Dirichlet process mixture model with feature selection
    • G. Yu, R. Huang, and Z. Wang. Document clustering via Dirichlet process mixture model with feature selection. In KDD, pages 763-772, 2010.
    • (2010) KDD , pp. 763-772
    • Yu, G.1    Huang, R.2    Wang, Z.3


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.