|
Volume , Issue , 2012, Pages 502-506
|
Building a 70 billion word corpus of English from ClueWeb
|
Author keywords
ClueWeb; Corpus; Encoding; English; Word sketch
|
Indexed keywords
ENCODING (SYMBOLS);
INDEXING (MATERIALS WORKING);
QUERY LANGUAGES;
CLUEWEB;
CORPUS;
ENGLISH;
LANGUAGE RESOURCES;
MANAGEMENT SYSTEMS;
NEAR-DUPLICATES;
PRE-PROCESSING STEP;
WORD SKETCH;
DATA HANDLING;
|
EID: 84907013032
PISSN: None
EISSN: None
Source Type: Conference Proceeding
DOI: None
Document Type: Conference Paper
|
Times cited: 17
|
References (11)
|