Volume , Issue , 2012, Pages 502-506

Building a 70 billion word corpus of English from ClueWeb

Author keywords

ClueWeb; Corpus; Encoding; English; Word sketch

Indexed keywords

ENCODING (SYMBOLS); INDEXING (MATERIALS WORKING); QUERY LANGUAGES;

EID: 84907013032     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: None     Document Type: Conference Paper
Times cited: 17

References (11)
  • 1
    • 84977940268
    • BootCaT: Bootstrapping corpora and terms from the web
    • Marco Baroni and Silvia Bernardini. 2004. BootCaT: Bootstrapping Corpora and Terms from the Web. In Proceedings of LREC, volume 4.
    • (2004) Proceedings of LREC , vol.4
    • Baroni, M.1    Bernardini, S.2
  • 2
    • 79956075292
    • Identifying and filtering near-duplicate documents
    • Springer
    • Andrei Broder. 2000. Identifying and Filtering Near-Duplicate Documents. In Combinatorial Pattern Matching, pages 1-10. Springer.
    • (2000) Combinatorial Pattern Matching , pp. 1-10
    • Broder, A.1


* This information was extracted by KISTI through analysis of Elsevier's SCOPUS database.