



Volume 42, Issue 5, 2006, Pages 1163-1175

A scaleable document clustering approach for large document corpora

Author keywords

Document clustering; Information retrieval

Indexed keywords

CLASSIFICATION (OF INFORMATION); DATA ACQUISITION; INFORMATION ANALYSIS; INFORMATION RETRIEVAL; QUALITY CONTROL; SEMANTICS

EID: 33646474444     PISSN: 03064573     EISSN: None     Source Type: Journal    
DOI: 10.1016/j.ipm.2005.10.003     Document Type: Article
Times cited: 16

References (26)
  • 2
    • 0032264186
    • Baker, L. D., & McCallum, A. K. (1998). Distributional clustering of words for text classification. In Proceedings of the 21st ACM international conference on research and development in information retrieval (SIGIR-98) (pp. 96-103).
  • 3
    • 0034796804
    • Bekkerman, R., El-Yaniv, R., Tishby, N., & Winter, Y. (2001). On feature distributional clustering for text categorization. In Proceedings of the 24th ACM international conference on research and development in information retrieval (SIGIR-01) (pp. 146-153).
  • 6
    • 0027029929
    • Cutting, D., Karger, D., Pedersen, J., & Tukey, J. (1992). Scatter/Gather: a cluster-based approach to browsing large document collections. In Proceedings of the 15th ACM international conference on research and development in information retrieval (SIGIR'92) (pp. 318-329).
  • 8
    • 2942723846
    • Dhillon, I. S., Mallela, S., & Kumar, R. (2003). Divisive information-theoretic feature clustering algorithm for text classification. Journal of Machine Learning Research, 3, 1265-1287.
  • 10
    • 33646532191
    • Dobrynin, V., Patterson, D., & Rooney, N. (2004). Contextual document clustering. In Proceedings of the 26th European conference on information retrieval research, LNCS Vol. 2997 (pp. 167-180). Springer.
  • 13
    • 0032091595
    • Guha, S., Rastogi, R., & Shim, K. (1998). CURE: An efficient clustering algorithm for large databases. In Proceedings of the ACM SIGMOD international conference on management of data (pp. 73-84).
  • 14
    • 0030381274
    • Hearst, M. A., & Pedersen, J. O. (1996). Reexamining the cluster hypothesis: scatter/gather on retrieval results. In Proceedings of the 19th international ACM SIGIR conference on research and development in information retrieval (SIGIR'96) (pp. 76-84).
  • 17
    • 0025952277
    • Lin, J. (1991). Divergence measures based on the Shannon entropy. IEEE Transactions on Information Theory, 37(1), 145-151.
  • 18
    • 8644243122
    • Liu, X., & Croft, W. B. (2004). Cluster-based retrieval using language models. In Proceedings of the 27th annual international conference on research and development in information retrieval (SIGIR-04) (pp. 186-193).
  • 19
    • 0034592784
    • McCallum, A., Nigam, K., & Ungar, L. H. (2000). Efficient clustering of high-dimensional data sets with application to reference matching. In Proceedings of the sixth ACM SIGKDD international conference on knowledge discovery and data mining (pp. 169-178).
  • 20
    • 0032262815
    • Mechkour, M., Harper, D. J., & Muresan, G. (1998). The webcluster project: using clustering for mediating access to the WWW. In Proceedings of the 21st ACM international conference on research and development in information retrieval (SIGIR-98) (pp. 357-358).
  • 21
    • 33646478558
    • Pereira, F., Tishby, N., & Lee, L. (1993). Distributional clustering of English words. In 31st annual meeting of the association for computational linguistics (pp. 183-190), Columbus, Ohio.
  • 22
    • 84880733494
    • Rose, T., Stevenson, M., & Whitehead, M. (2002). The Reuters corpus Vol. 1 - from yesterday's news to tomorrow's language resources. In Proceedings of the 3rd international conference on language resources and evaluation.
  • 23
    • 0002442796
    • Sebastiani, F. (2002). Machine learning in automated text categorization. ACM Computing Surveys, 34(1), 1-47.
  • 24
    • 33646486384
    • Slonim, N., & Tishby, N. (2000). Document clustering using word clusters via the information bottleneck method. In Proceedings of the 23rd annual international ACM SIGIR conference on research and development in information retrieval (SIGIR'00) (pp. 208-215).
  • 25
    • 0036993190
    • Slonim, N., Friedman, N., & Tishby, N. (2002). Unsupervised document classification using sequential information maximization. In Proceedings of the 25th annual international conference on research and development in information retrieval (SIGIR'02) (pp. 129-136).
  • 26
    • 33646479331
    • Tishby, N., Pereira, F., & Bialek, W. (1999). The information bottleneck method. In Proceedings of the 37th annual Allerton conference on communication, control, and computing (pp. 368-377).


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.