Volume , Issue , 2008, Pages 91-99

Learning to classify short and sparse text & web with hidden topics from large-scale data collections

Author keywords

Sparse text; Topic analysis; Web data analysis classification

Indexed keywords

CLASSIFIERS; DATA ACQUISITION; INFORMATION RETRIEVAL; INTERNET; KNOWLEDGE BASED SYSTEMS; LEARNING SYSTEMS; TEXT PROCESSING

EID: 57349117605     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1145/1367497.1367510     Document Type: Conference Paper
Times cited: 758

References (36)
  • 2
    • 0032264186
    • Distributional clustering of words for text classification
    • L. Baker and A. McCallum. Distributional clustering of words for text classification. Proc. ACM SIGIR, 1998.
    • (1998) Proc. ACM SIGIR
    • Baker, L.1    McCallum, A.2
  • 5
    • 0002652285
    • A maximum entropy approach to natural language processing
    • A. Berger, S. Della Pietra, and V. Della Pietra. A maximum entropy approach to natural language processing. Computational Linguistics, 22(1):39-71, 1996.
    • (1996) Computational Linguistics , vol.22 , Issue.1 , pp. 39-71
    • Berger, A.1    Della Pietra, S.2    Della Pietra, V.3
  • 6
    • 77951430107
    • Distributional word clusters vs. words for text categorization
    • R. Bekkerman, R. El-Yaniv, N. Tishby, and Y. Winter. Distributional word clusters vs. words for text categorization. JMLR, 3:1183-1208, 2003.
    • (2003) JMLR , vol.3 , pp. 1183-1208
    • Bekkerman, R.1    El-Yaniv, R.2    Tishby, N.3    Winter, Y.4
  • 7
    • 33745448357
    • A latent Dirichlet model for unsupervised entity resolution
    • I. Bhattacharya and L. Getoor. A latent Dirichlet model for unsupervised entity resolution. Proc. SIAM SDM, 2006.
    • (2006) Proc. SIAM SDM
    • Bhattacharya, I.1    Getoor, L.2
  • 8
    • 0141607824
    • Latent Dirichlet Allocation
    • D. Blei, A. Ng, and M. Jordan. Latent Dirichlet Allocation. JMLR, 3:993-1022, 2003.
    • (2003) JMLR , vol.3 , pp. 993-1022
    • Blei, D.1    Ng, A.2    Jordan, M.3
  • 10
    • 35348903881
    • Measuring semantic similarity between words using Web search engines
    • D. Bollegala, Y. Matsuo, and M. Ishizuka. Measuring semantic similarity between words using Web search engines. Proc. WWW, 2007.
    • (2007) Proc. WWW
    • Bollegala, D.1    Matsuo, Y.2    Ishizuka, M.3
  • 11
    • 0002993682
    • Combining labeled and unlabeled data with co-training
    • A. Blum and T. Mitchell. Combining labeled and unlabeled data with co-training. Proc. COLT, 1998.
    • (1998) Proc. COLT
    • Blum, A.1    Mitchell, T.2
  • 12
    • 1542377542
    • Text categorization by boosting automatically extracted concepts
    • L. Cai and T. Hofmann. Text categorization by boosting automatically extracted concepts. Proc. ACM SIGIR, 2003.
    • (2003) Proc. ACM SIGIR
    • Cai, L.1    Hofmann, T.2
  • 16
    • 33745063446
    • Concept decompositions for large sparse text data using clustering
    • I. Dhillon and D. Modha. Concept decompositions for large sparse text data using clustering. Machine Learning, 42(1-2):143-175, 2001.
    • (2001) Machine Learning , vol.42 , Issue.1-2 , pp. 143-175
    • Dhillon, I.1    Modha, D.2
  • 17
    • 84880915872
    • Computing semantic relatedness using Wikipedia-based explicit semantic analysis
    • E. Gabrilovich and S. Markovitch. Computing semantic relatedness using Wikipedia-based explicit semantic analysis. Proc. IJCAI, 2007.
    • (2007) Proc. IJCAI
    • Gabrilovich, E.1    Markovitch, S.2
  • 18
    • 0021518209
    • Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images
    • S. Geman and D. Geman. Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images. IEEE PAMI, 6:721-741, 1984.
    • (1984) IEEE PAMI , vol.6 , pp. 721-741
    • Geman, S.1    Geman, D.2
  • 20
    • 0000636553
    • Text categorization with SVMs: Learning with many relevant features
    • T. Joachims. Text categorization with SVMs: learning with many relevant features. Proc. ECML, 1998.
    • (1998) Proc. ECML
    • Joachims, T.1
  • 21
    • 50249144225
    • Parameter estimation for text analysis
    • Technical report
    • G. Heinrich. Parameter estimation for text analysis. Technical report, 2005.
    • (2005)
    • Heinrich, G.1
  • 22
    • 57349147662
    • Probabilistic LSA
    • T. Hofmann. Probabilistic LSA. Proc. UAI, 1999.
    • (1999) Proc. UAI
    • Hofmann, T.1
  • 23
    • 3042742744
    • Latent semantic models for collaborative filtering
    • T. Hofmann. Latent semantic models for collaborative filtering. ACM TOIS, 22(1):89-115, 2004.
    • (2004) ACM TOIS , vol.22 , Issue.1 , pp. 89-115
    • Hofmann, T.1
  • 25
    • 19944406145
    • A hierarchical monothetic document clustering algorithm for summarization and browsing search results
    • K. Kummamuru, R. Lotlikar, S. Roy, K. Singal, and R. Krishnapuram. A hierarchical monothetic document clustering algorithm for summarization and browsing search results. Proc. WWW, 2004.
    • (2004) Proc. WWW
    • Kummamuru, K.1    Lotlikar, R.2    Roy, S.3    Singal, K.4    Krishnapuram, R.5
  • 26
    • 33646887390
    • On the limited memory BFGS method for large-scale optimization
    • D. Liu and J. Nocedal. On the limited memory BFGS method for large-scale optimization. Mathematical Programming, 45:503-528, 1989.
    • (1989) Mathematical Programming , vol.45 , pp. 503-528
    • Liu, D.1    Nocedal, J.2
  • 27
    • 36348995566
    • Similarity measures for short segments of text
    • D. Metzler, S. Dumais, and C. Meek. Similarity measures for short segments of text. Proc. ECIR, 2007.
    • (2007) Proc. ECIR
    • Metzler, D.1    Dumais, S.2    Meek, C.3
  • 28
    • 1842751687
    • Expectation-propagation for the generative aspect model
    • T. Minka and J. Lafferty. Expectation-propagation for the generative aspect model. Proc. UAI, 2002.
    • (2002) Proc. UAI
    • Minka, T.1    Lafferty, J.2
  • 29
    • 0033886806
    • Text classification from labeled and unlabeled documents using EM
    • K. Nigam, A. McCallum, S. Thrun, and T. Mitchell. Text classification from labeled and unlabeled documents using EM. Machine Learning, 39(2-3):103-134, 2000.
    • (2000) Machine Learning , vol.39 , Issue.2-3 , pp. 103-134
    • Nigam, K.1    McCallum, A.2    Thrun, S.3    Mitchell, T.4
  • 30
    • 34250638291
    • A Web-based kernel function for measuring the similarity of short text snippets
    • M. Sahami and T. Heilman. A Web-based kernel function for measuring the similarity of short text snippets. Proc. WWW, 2006.
    • (2006) Proc. WWW
    • Sahami, M.1    Heilman, T.2
  • 32
    • 0002442796
    • Machine learning in automated text categorization
    • F. Sebastiani. Machine learning in automated text categorization. ACM Computing Surveys, 34(1):1-47, 2002.
    • (2002) ACM Computing Surveys , vol.34 , Issue.1 , pp. 1-47
    • Sebastiani, F.1
  • 33
    • 33750327222
    • LDA-based document models for ad-hoc retrieval
    • X. Wei and W. Croft. LDA-based document models for ad-hoc retrieval. Proc. ACM SIGIR, 2006.
    • (2006) Proc. ACM SIGIR
    • Wei, X.1    Croft, W.2
  • 34
    • 57349094335
    • Improving similarity measures for short segments of text
    • W. Yih and C. Meek. Improving similarity measures for short segments of text. Proc. AAAI, 2007.
    • (2007) Proc. AAAI
    • Yih, W.1    Meek, C.2
  • 35
    • 0003227299
    • Grouper: A dynamic clustering interface to Web search results
    • O. Zamir and O. Etzioni. Grouper: a dynamic clustering interface to Web search results. Proc. WWW, 1999.
    • (1999) Proc. WWW
    • Zamir, O.1    Etzioni, O.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.