Volume , Issue , 2010, Pages 904-910

A corpus factory for many languages

Author keywords

[No Author keywords available]

Indexed keywords

EIGHT LANGUAGES; LARGE CORPORA; QUERY TOOLS; SWEDISH; VIETNAMESE

EID: 85037339072     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: None     Document Type: Conference Paper
Times cited: 56

References (21)
  • 1
    • 84977940268
    • Bootcat: Bootstrapping corpora and terms from the web
    • Marco Baroni and Silvia Bernardini. 2004. Bootcat: Bootstrapping corpora and terms from the web. In Proc. LREC, pages 1313-1316.
    • (2004) Proc. LREC , pp. 1313-1316
    • Baroni, M.1    Bernardini, S.2
  • 2
    • 70350681903
    • Large linguistically-processed web corpora for multiple languages
    • Marco Baroni and Adam Kilgarriff. 2006. Large linguistically-processed web corpora for multiple languages. In Proc. EACL, pages 87-90.
    • (2006) Proc. EACL , pp. 87-90
    • Baroni, M.1    Kilgarriff, A.2
  • 3
    • 57749208745
    • Distributions in text
    • Anke Lüdeling and Merja Kytö, editors, Mouton de Gruyter, Berlin
    • Marco Baroni. 2005. Distributions in text. In Anke Lüdeling and Merja Kytö, editors, Corpus linguistics: An international handbook. Mouton de Gruyter, Berlin.
    • (2005) Corpus Linguistics: An International Handbook
    • Baroni, M.1
  • 7
    • 78049312146
    • Introducing and evaluating "ukwac", a very large web-derived corpus of English
    • Marrakech, Morocco
    • A. Ferraresi, E. Zanchetta, M. Baroni, and S. Bernardini. 2008. Introducing and evaluating "ukwac", a very large web-derived corpus of English. In Proc. WAC4 Workshop at LREC, Marrakech, Morocco.
    • (2008) Proc. WAC4 Workshop at LREC
    • Ferraresi, A.1    Zanchetta, E.2    Baroni, M.3    Bernardini, S.4
  • 8
    • 20344402818
    • Building minority language corpora by learning to generate web search queries
    • Rayid Ghani, Rosie Jones, and Dunja Mladenic. 2005. Building minority language corpora by learning to generate web search queries. Knowledge and Information Systems, 7(1):56-83.
    • (2005) Knowledge and Information Systems , vol.7 , Issue.1 , pp. 56-83
    • Ghani, R.1    Jones, R.2    Mladenic, D.3
  • 9
    • 0345138552
    • Estimation of English and non-English language use on the WWW
    • Gregory Grefenstette and Julien Nioche. 2000. Estimation of English and non-English language use on the WWW. In Proc. RIAO, pages 237-246.
    • (2000) Proc. RIAO , pp. 237-246
    • Grefenstette, G.1    Nioche, J.2
  • 10
    • 33846277160
    • The American national corpus: More than the web can provide
    • Las Palmas
    • Nancy Ide, Randi Reppen, and Keith Suderman. 2002. The American National Corpus: More than the web can provide. In Proc. LREC, pages 839-844, Las Palmas.
    • (2002) Proc. LREC , pp. 839-844
    • Ide, N.1    Reppen, R.2    Suderman, K.3
  • 11
    • 0344276035
    • Automatically building a corpus for a minority language from the web
    • Rosie Jones and Rayid Ghani. 2000. Automatically building a corpus for a minority language from the web. In Proc. ACL Student Workshop, pages 29-36.
    • (2000) Proc. ACL Student Workshop , pp. 29-36
    • Jones, R.1    Ghani, R.2
  • 12
    • 0344154400
    • Using the web to obtain frequencies for unseen bigrams
    • Frank Keller and Mirella Lapata. 2003. Using the web to obtain frequencies for unseen bigrams. Computational Linguistics, 29(3):459-484.
    • (2003) Computational Linguistics , vol.29 , Issue.3 , pp. 459-484
    • Keller, F.1    Lapata, M.2
  • 13
    • 0345570091
    • Lexical profiling software and its lexicographic applications: A case study
    • Copenhagen
    • Adam Kilgarriff and Michael Rundell. 2002. Lexical profiling software and its lexicographic applications: a case study. In Proc. EURALEX, pages 807-818, Copenhagen.
    • (2002) Proc. EURALEX , pp. 807-818
    • Kilgarriff, A.1    Rundell, M.2
  • 17
    • 84909993491
    • Detecting co-derivative documents in large text collections
    • Marrakech, Morocco
    • Jan Pomikálek and Pavel Rychlý. 2008. Detecting co-derivative documents in large text collections. In Proc. LREC, Marrakech, Morocco.
    • (2008) Proc. LREC
    • Pomikálek, J.1    Rychlý, P.2
  • 19
    • 85075757425
    • Mining the web for bilingual text
    • Philip Resnik. 1999. Mining the web for bilingual text. In Proc. ACL, pages 527-534.
    • (1999) Proc. ACL , pp. 527-534
    • Resnik, P.1
  • 20
    • 41149154578
    • The crubadan project: Corpus building for under-resourced languages
    • Louvain-la-Neuve, Belgium
    • Kevin P. Scannell. 2007. The crubadan project: Corpus building for under-resourced languages. In Proc. WAC-3: Building and Exploring Web Corpora, Louvain-la-Neuve, Belgium.
    • (2007) Proc. WAC-3: Building and Exploring Web Corpora
    • Scannell, K.P.1
  • 21
    • 42649127636
    • Creating general-purpose corpora using automated search engine queries
    • Gedit
    • Serge Sharoff. 2006. Creating general-purpose corpora using automated search engine queries. In WaCky! Working papers on the Web as Corpus. Gedit.
    • (2006) WaCky! Working Papers on the Web as Corpus
    • Sharoff, S.1


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.