-
1
-
-
84949179803
-
A survey of text clustering algorithms
-
C. C. Aggarwal and C. Zhai. A survey of text clustering algorithms. In Mining Text Data, pages 77-128. 2012.
-
(2012)
Mining Text Data
, pp. 77-128
-
-
Aggarwal, C.C.1
Zhai, C.2
-
2
-
-
36448954984
-
Clustering short texts using Wikipedia
-
S. Banerjee, K. Ramanathan, and A. Gupta. Clustering short texts using Wikipedia. In SIGIR, pages 787-788, 2007.
-
(2007)
SIGIR
, pp. 787-788
-
-
Banerjee, S.1
Ramanathan, K.2
Gupta, A.3
-
7
-
-
33749257142
-
Clustering documents with an exponential-family approximation of the Dirichlet compound multinomial distribution
-
C. Elkan. Clustering documents with an exponential-family approximation of the Dirichlet compound multinomial distribution. In ICML, 2006.
-
(2006)
ICML
-
-
Elkan, C.1
-
8
-
-
33847172327
-
Clustering by passing messages between data points
-
B. J. Frey and D. Dueck. Clustering by passing messages between data points. Science, 315(5814):972-976, 2007.
-
(2007)
Science
, vol.315
, Issue.5814
, pp. 972-976
-
-
Frey, B.J.1
Dueck, D.2
-
10
-
-
85026972772
-
Probabilistic latent semantic indexing
-
T. Hofmann. Probabilistic latent semantic indexing. In SIGIR, pages 50-57, 1999.
-
(1999)
SIGIR
, pp. 50-57
-
-
Hofmann, T.1
-
11
-
-
84897584095
-
Dirichlet process mixture model for document clustering with feature partition
-
R. Huang, G. Yu, Z. Wang, J. Zhang, and L. Shi. Dirichlet process mixture model for document clustering with feature partition. IEEE Trans. Knowl. Data Eng., 25(8):1748-1759, 2013.
-
(2013)
IEEE Trans. Knowl. Data Eng.
, vol.25
, Issue.8
, pp. 1748-1759
-
-
Huang, R.1
Yu, G.2
Wang, Z.3
Zhang, J.4
Shi, L.5
-
13
-
-
77950369345
-
Data clustering: 50 years beyond k-means
-
A. K. Jain. Data clustering: 50 years beyond k-means. Pattern Recognition Letters, 31(8):651-666, 2010.
-
(2010)
Pattern Recognition Letters
, vol.31
, Issue.8
, pp. 651-666
-
-
Jain, A.K.1
-
14
-
-
31844437086
-
Modeling word burstiness using the Dirichlet distribution
-
R. E. Madsen, D. Kauchak, and C. Elkan. Modeling word burstiness using the Dirichlet distribution. In ICML, pages 545-552, 2005.
-
(2005)
ICML
, pp. 545-552
-
-
Madsen, R.E.1
Kauchak, D.2
Elkan, C.3
-
16
-
-
0001673996
-
A comparison of event models for naive Bayes text classification
-
A. McCallum, K. Nigam, et al. A comparison of event models for naive Bayes text classification. In AAAI-98 workshop on learning for text categorization, volume 752, pages 41-48. Citeseer, 1998.
-
(1998)
AAAI-98 Workshop on Learning for Text Categorization
, vol.752
, pp. 41-48
-
-
McCallum, A.1
Nigam, K.2
-
19
-
-
78649420560
-
Information theoretic measures for clusterings comparison: Variants, properties, normalization and correction for chance
-
X. V. Nguyen, J. Epps, and J. Bailey. Information theoretic measures for clusterings comparison: Variants, properties, normalization and correction for chance. Journal of Machine Learning Research, 11:2837-2854, 2010.
-
(2010)
Journal of Machine Learning Research
, vol.11
, pp. 2837-2854
-
-
Nguyen, X.V.1
Epps, J.2
Bailey, J.3
-
20
-
-
0033886806
-
Text classification from labeled and unlabeled documents using EM
-
K. Nigam, A. McCallum, S. Thrun, and T. M. Mitchell. Text classification from labeled and unlabeled documents using EM. Machine Learning, 39(2/3):103-134, 2000.
-
(2000)
Machine Learning
, vol.39
, Issue.2-3
, pp. 103-134
-
-
Nigam, K.1
McCallum, A.2
Thrun, S.3
Mitchell, T.M.4
-
21
-
-
79955132812
-
Comparative study of clustering techniques for short text documents
-
A. Rangrej, S. Kulkarni, and A. V. Tendulkar. Comparative study of clustering techniques for short text documents. In WWW (Companion Volume), pages 111-112, 2011.
-
(2011)
WWW (Companion Volume)
, pp. 111-112
-
-
Rangrej, A.1
Kulkarni, S.2
Tendulkar, A.V.3
-
23
-
-
1942484786
-
Tackling the poor assumptions of naive Bayes text classifiers
-
J. D. Rennie, L. Shih, J. Teevan, and D. R. Karger. Tackling the poor assumptions of naive Bayes text classifiers. In ICML, pages 616-623, 2003.
-
(2003)
ICML
, pp. 616-623
-
-
Rennie, J.D.1
Shih, L.2
Teevan, J.3
Karger, D.R.4
-
24
-
-
80053369934
-
V-measure: A conditional entropy-based external cluster evaluation measure
-
A. Rosenberg and J. Hirschberg. V-measure: A conditional entropy-based external cluster evaluation measure. In EMNLP-CoNLL, pages 410-420, 2007.
-
(2007)
EMNLP-CoNLL
, pp. 410-420
-
-
Rosenberg, A.1
Hirschberg, J.2
-
25
-
-
0016572913
-
A vector space model for automatic indexing
-
G. Salton, A. Wong, and C. S. Yang. A vector space model for automatic indexing. Commun. ACM, 18(11):613-620, 1975.
-
(1975)
Commun. ACM
, vol.18
, Issue.11
, pp. 613-620
-
-
Salton, G.1
Wong, A.2
Yang, C.S.3
-
26
-
-
77954583359
-
Web-scale k-means clustering
-
D. Sculley. Web-scale k-means clustering. In WWW, pages 1177-1178, 2010.
-
(2010)
WWW
, pp. 1177-1178
-
-
Sculley, D.1
-
27
-
-
84883083690
-
Sumblr: Continuous summarization of evolving tweet streams
-
L. Shou, Z. Wang, K. Chen, and G. Chen. Sumblr: continuous summarization of evolving tweet streams. In SIGIR, pages 533-542, 2013.
-
(2013)
SIGIR
, pp. 533-542
-
-
Shou, L.1
Wang, Z.2
Chen, K.3
Chen, G.4
-
28
-
-
84900417810
-
Efficient clustering of short messages into general domains
-
O. Tsur, A. Littman, and A. Rappoport. Efficient clustering of short messages into general domains. In ICWSM, 2013.
-
(2013)
ICWSM
-
-
Tsur, O.1
Littman, A.2
Rappoport, A.3
-
29
-
-
84907028990
-
Clustering microtext streams for event identification
-
J. Yin. Clustering microtext streams for event identification. In IJCNLP, pages 719-725, 2013.
-
(2013)
IJCNLP
, pp. 719-725
-
-
Yin, J.1
-
30
-
-
77956209448
-
Document clustering via Dirichlet process mixture model with feature selection
-
G. Yu, R. Huang, and Z. Wang. Document clustering via Dirichlet process mixture model with feature selection. In KDD, pages 763-772, 2010.
-
(2010)
KDD
, pp. 763-772
-
-
Yu, G.1
Huang, R.2
Wang, Z.3