Volume 1, 2011, Pages 93-101

Semi-Supervised SimHash for efficient document similarity search

Author keywords

[No Author keywords available]

Indexed keywords

DOCUMENT SIMILARITY; FEATURE WEIGHT; HASHING METHOD; HIGH DIMENSIONAL DATA; LARGE DATASETS; PRIOR KNOWLEDGE; QUERY DOCUMENTS; SEMI-SUPERVISED; STATE-OF-THE-ART METHODS

EID: 84859084716     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: None     Document Type: Conference Paper
Times cited: 27

References (23)
  • 1
    • 27144489164
    • A tutorial on support vector machines for pattern recognition
    • Christopher J.C. Burges. 1998. A tutorial on support vector machines for pattern recognition. Data Mining and Knowledge Discovery, 2(2):121-167. (Pubitemid 128695475)
    • (1998) Data Mining and Knowledge Discovery , vol.2 , Issue.2 , pp. 121-167
    • Burges, C.J.C.1
  • 3
    • 76749114248
    • An incremental clustering scheme for data deduplication
    • Gianni Costa, Giuseppe Manco and Riccardo Ortale. 2010. An incremental clustering scheme for data deduplication. Data Mining and Knowledge Discovery, 20(1):152-187.
    • (2010) Data Mining and Knowledge Discovery , vol.20 , Issue.1 , pp. 152-187
    • Costa, G.1    Manco, G.2    Ortale, R.3
  • 4
    • 0033293618
    • Finding related pages in the world wide web
    • Jeffrey Dean and Monika R. Henzinger. 1999. Finding Related Pages in the World Wide Web. Computer Networks, 31:1467-1479.
    • (1999) Computer Networks , vol.31 , pp. 1467-1479
    • Dean, J.1    Henzinger, M.R.2
  • 6
    • 2942731012
    • An extensive empirical study of feature selection metrics for text classification
    • George Forman. 2003. An extensive empirical study of feature selection metrics for text classification. The Journal of Machine Learning Research, 3:1289-1305.
    • (2003) The Journal of Machine Learning Research , vol.3 , pp. 1289-1305
    • Forman, G.1
  • 12
    • 33646887390
    • On the limited memory BFGS method for large scale optimization
    • Dong C. Liu and Jorge Nocedal. 1989. On the limited memory BFGS method for large scale optimization. Mathematical Programming, 45(1):503-528. (Pubitemid 20660315)
    • (1989) Mathematical Programming, Series B , vol.45 , Issue.3 , pp. 503-528
    • Liu, D.C.1    Nocedal, J.2
  • 14
    • 35348911985
    • Detecting near-duplicates for web crawling
    • DOI 10.1145/1242572.1242592, 16th International World Wide Web Conference, WWW2007
    • Gurmeet Singh Manku, Arvind Jain and Anish Das Sarma. 2007. Detecting near-duplicates for web crawling. In Proceedings of the 16th international conference on World Wide Web, pages 141-150. (Pubitemid 47582246)
    • (2007) 16th International World Wide Web Conference, WWW2007 , pp. 141-150
    • Manku, G.S.1    Jain, A.2    Das Sarma, A.3
  • 15
    • 84859089262
    • An introduction to information retrieval
    • Christopher D. Manning, Prabhakar Raghavan and Hinrich Schütze. 2002. An introduction to information retrieval. Spring.
    • (2002) Spring
    • Manning, C.D.1    Raghavan, P.2    Schütze, H.3
  • 18
    • 84956689194
    • Kernel Principal Component Analysis
    • Artificial Neural Networks - ICANN '97
    • Bernhard Schölkopf, Alexander Smola and Klaus-Robert Müller. 1997. Kernel principal component analysis. Advances in Kernel Methods - Support Vector Learning, pages 583-588. MIT. (Pubitemid 127140297)
    • (1997) Lecture Notes in Computer Science , Issue.1327 , pp. 583-588
    • Schoelkopf, B.1    Smola, A.J.2    Mueller, K.-R.3
  • 20
    • 40649129226
    • Towards a unified approach to document similarity search using manifold-ranking of blocks
    • DOI 10.1016/j.ipm.2007.07.012, PII S0306457307001562
    • Xiaojun Wan, Jianwu Yang and Jianguo Xiao. 2008. Towards a unified approach to document similarity search using manifold-ranking of blocks. Information Processing & Management, 44(3):1032-1048. (Pubitemid 351375314)
    • (2008) Information Processing and Management , vol.44 , Issue.3 , pp. 1032-1048
    • Wan, X.1    Yang, J.2    Xiao, J.3


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.