-
1
-
-
84977940268
-
BootCaT: Bootstrapping corpora and terms from the web
-
Marco Baroni and Silvia Bernardini. 2004. BootCaT: Bootstrapping corpora and terms from the web. In Proc. LREC, pages 1313-1316.
-
(2004)
Proc. LREC
, pp. 1313-1316
-
-
Baroni, M.1
Bernardini, S.2
-
2
-
-
70350681903
-
Large linguistically-processed web corpora for multiple languages
-
Marco Baroni and Adam Kilgarriff. 2006. Large linguistically-processed web corpora for multiple languages. In Proc. EACL, pages 87-90.
-
(2006)
Proc. EACL
, pp. 87-90
-
-
Baroni, M.1
Kilgarriff, A.2
-
3
-
-
57749208745
-
Distributions in text
-
Anke Lüdeling and Merja Kytö, editors. Mouton de Gruyter, Berlin
-
Marco Baroni. 2005. Distributions in text. In Anke Lüdeling and Merja Kytö, editors, Corpus linguistics: An international handbook. Mouton de Gruyter, Berlin.
-
(2005)
Corpus Linguistics: An International Handbook
-
-
Baroni, M.1
-
6
-
-
0010362121
-
Syntactic clustering of the web
-
Andrei Z. Broder, Steven C. Glassman, Mark S. Manasse, and Geoffrey Zweig. 1997. Syntactic clustering of the web. Computer Networks, 29(8-13):1157-1166.
-
(1997)
Computer Networks
, vol.29
, Issue.8-13
, pp. 1157-1166
-
-
Broder, A.Z.1
Glassman, S.C.2
Manasse, M.S.3
Zweig, G.4
-
7
-
-
78049312146
-
Introducing and evaluating "ukWaC", a very large web-derived corpus of English
-
Marrakech, Morocco
-
A. Ferraresi, E. Zanchetta, M. Baroni, and S. Bernardini. 2008. Introducing and evaluating "ukWaC", a very large web-derived corpus of English. In Proc. WAC4 Workshop at LREC, Marrakech, Morocco.
-
(2008)
Proc. WAC4 Workshop at LREC
-
-
Ferraresi, A.1
Zanchetta, E.2
Baroni, M.3
Bernardini, S.4
-
8
-
-
20344402818
-
Building minority language corpora by learning to generate web search queries
-
Rayid Ghani, Rosie Jones, and Dunja Mladenic. 2005. Building minority language corpora by learning to generate web search queries. Knowledge and Information Systems, 7(1):56-83.
-
(2005)
Knowledge and Information Systems
, vol.7
, Issue.1
, pp. 56-83
-
-
Ghani, R.1
Jones, R.2
Mladenic, D.3
-
9
-
-
0345138552
-
Estimation of English and non-English language use on the WWW
-
Gregory Grefenstette and Julien Nioche. 2000. Estimation of English and non-English language use on the WWW. In Proc. RIAO, pages 237-246.
-
(2000)
Proc. RIAO
, pp. 237-246
-
-
Grefenstette, G.1
Nioche, J.2
-
10
-
-
33846277160
-
The American National Corpus: More than the web can provide
-
Las Palmas
-
Nancy Ide, Randi Reppen, and Keith Suderman. 2002. The American National Corpus: More than the web can provide. In Proc. LREC, pages 839-844, Las Palmas.
-
(2002)
Proc. LREC
, pp. 839-844
-
-
Ide, N.1
Reppen, R.2
Suderman, K.3
-
11
-
-
0344276035
-
Automatically building a corpus for a minority language from the web
-
Rosie Jones and Rayid Ghani. 2000. Automatically building a corpus for a minority language from the web. In Proc. ACL Student Workshop, pages 29-36.
-
(2000)
Proc. ACL Student Workshop
, pp. 29-36
-
-
Jones, R.1
Ghani, R.2
-
12
-
-
0344154400
-
Using the web to obtain frequencies for unseen bigrams
-
Frank Keller and Mirella Lapata. 2003. Using the web to obtain frequencies for unseen bigrams. Computational Linguistics, 29(3):459-484.
-
(2003)
Computational Linguistics
, vol.29
, Issue.3
, pp. 459-484
-
-
Keller, F.1
Lapata, M.2
-
13
-
-
0345570091
-
Lexical profiling software and its lexicographic applications: A case study
-
Copenhagen
-
Adam Kilgarriff and Michael Rundell. 2002. Lexical profiling software and its lexicographic applications: a case study. In Proc. EURALEX, pages 807-818, Copenhagen.
-
(2002)
Proc. EURALEX
, pp. 807-818
-
-
Kilgarriff, A.1
Rundell, M.2
-
14
-
-
33750691522
-
The Sketch Engine
-
Lorient, France
-
Adam Kilgarriff, Pavel Rychly, Pavel Smrz, and David Tugwell. 2004. The Sketch Engine. In Proc. EURALEX, pages 105-116, Lorient, France.
-
(2004)
Proc. EURALEX
, pp. 105-116
-
-
Kilgarriff, A.1
Rychly, P.2
Smrz, P.3
Tugwell, D.4
-
17
-
-
84909993491
-
Detecting co-derivative documents in large text collections
-
Marrakech, Morocco
-
Jan Pomikálek and Pavel Rychlý. 2008. Detecting co-derivative documents in large text collections. In Proc. LREC, Marrakech, Morocco.
-
(2008)
Proc. LREC
-
-
Pomikálek, J.1
Rychlý, P.2
-
19
-
-
85075757425
-
Mining the web for bilingual text
-
Philip Resnik. 1999. Mining the web for bilingual text. In Proc. ACL, pages 527-534.
-
(1999)
Proc. ACL
, pp. 527-534
-
-
Resnik, P.1
-
20
-
-
41149154578
-
The Crúbadán project: Corpus building for under-resourced languages
-
Louvain-la-Neuve, Belgium
-
Kevin P. Scannell. 2007. The Crúbadán project: Corpus building for under-resourced languages. In Proc. WAC-3: Building and Exploring Web Corpora, Louvain-la-Neuve, Belgium.
-
(2007)
Proc. WAC-3: Building and Exploring Web Corpora
-
-
Scannell, K.P.1
-
21
-
-
42649127636
-
Creating general-purpose corpora using automated search engine queries
-
Gedit
-
Serge Sharoff. 2006. Creating general-purpose corpora using automated search engine queries. In WaCky! Working papers on the Web as Corpus. Gedit.
-
(2006)
WaCky! Working Papers on the Web as Corpus
-
-
Sharoff, S.1