-
2
-
-
70350686154
-
The wacky wide web: A collection of very large linguistically processed web-crawled corpora
-
M. Baroni, S. Bernardini, A. Ferraresi, and E. Zanchetta. 2009. The wacky wide web: A collection of very large linguistically processed web-crawled corpora. Journal of Language Resources and Evaluation, 43(3):209-226.
-
(2009)
Journal of Language Resources and Evaluation
, vol.43
, Issue.3
, pp. 209-226
-
-
Baroni, M.1
Bernardini, S.2
Ferraresi, A.3
Zanchetta, E.4
-
4
-
-
60950417505
-
How random is a corpus? The library metaphor
-
S. Evert. 2006. How random is a corpus? The library metaphor. Zeitschrift für Anglistik und Amerikanistik, 54(2):177-190.
-
(2006)
Zeitschrift für Anglistik und Amerikanistik
, vol.54
, Issue.2
, pp. 177-190
-
-
Evert, S.1
-
5
-
-
84869471345
-
A lightweight and efficient tool for cleaning web pages
-
Nicoletta Calzolari et al., editor Marrakech, Morocco. European Language Resources Association (ELRA)
-
S. Evert. 2008. A lightweight and efficient tool for cleaning web pages. In Nicoletta Calzolari et al., editor, Proceedings of the Sixth International Language Resources and Evaluation (LREC'08), Marrakech, Morocco. European Language Resources Association (ELRA). http://www.lrec-conf.org/proceedings/lrec2008/.
-
(2008)
Proceedings of the Sixth International Language Resources and Evaluation (LREC'08)
-
-
Evert, S.1
-
7
-
-
84904706989
-
From D-Coi to SoNaR: A reference corpus for Dutch
-
Nicoletta Calzolari et al., editor Marrakech, Morocco. European Language Resources Association (ELRA)
-
N. Oostdijk, M. Reynaert, P. Monachesi, G. Van Noord, R. Ordelman, I. Schuurman, and V. Vandeghinste. 2008. From D-Coi to SoNaR: a reference corpus for Dutch. In Nicoletta Calzolari et al., editor, Proceedings of the Sixth International Language Resources and Evaluation (LREC'08), Marrakech, Morocco. European Language Resources Association (ELRA). http://www.lrec-conf.org/proceedings/lrec2008/.
-
(2008)
Proceedings of the Sixth International Language Resources and Evaluation (LREC'08)
-
-
Oostdijk, N.1
Reynaert, M.2
Monachesi, P.3
Van Noord, G.4
Ordelman, R.5
Schuurman, I.6
Vandeghinste, V.7
-
8
-
-
49949085967
-
Non-interactive OCR post-correction for giga-scale digitization projects
-
A. Gelbukh, editor Lecture Notes in Computer Science Vol. 4919/2008 Berlin / Heidelberg. Springer
-
M. Reynaert. 2008. Non-interactive OCR post-correction for giga-scale digitization projects. In A. Gelbukh, editor, Proceedings of the Computational Linguistics and Intelligent Text Processing 9th International Conference, CICLing 2008. Lecture Notes in Computer Science Vol. 4919/2008, pages 617-630, Berlin / Heidelberg. Springer.
-
(2008)
Proceedings of the Computational Linguistics and Intelligent Text Processing 9th International Conference, CICLing 2008
, pp. 617-630
-
-
Reynaert, M.1
-
9
-
-
33748650310
-
Orthographic errors in web pages: Toward cleaner web corpora
-
C. Ringlstetter, K. Schulz, and S. Mihov. 2006. Orthographic errors in web pages: Toward cleaner web corpora. Computational Linguistics, 32(3):295-340.
-
(2006)
Computational Linguistics
, vol.32
, Issue.3
, pp. 295-340
-
-
Ringlstetter, C.1
Schulz, K.2
Mihov, S.3
-
10
-
-
84892150311
-
Interacting semantic layers of annotation in SoNaR, a reference corpus of contemporary written Dutch
-
Valletta, Malta
-
I. Schuurman, V. Hoste, and P. Monachesi. 2010. Interacting Semantic Layers of Annotation in SoNaR, a Reference Corpus of Contemporary Written Dutch. In Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC-2010), Valletta, Malta.
-
(2010)
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC-2010)
-
-
Schuurman, I.1
Hoste, V.2
Monachesi, P.3