-
1
-
-
85000897241
-
The SETimes.HR linguistically annotated corpus of Croatian
-
[Agić and Ljubešić2014]
-
[Agić and Ljubešić2014] Željko Agić and Nikola Ljubešić. 2014. The SETimes.HR linguistically annotated corpus of Croatian. In Proceedings of LREC 2014.
-
(2014)
Proceedings of LREC 2014
-
-
Agić, Željko1
Ljubešić, Nikola2
-
2
-
-
85121801634
-
Lemmatization and morphosyntactic tagging of Croatian and Serbian
-
[Agić et al.2013a]
-
[Agić et al.2013a] Željko Agić, Nikola Ljubešić, and Danijela Merkler. 2013a. Lemmatization and morphosyntactic tagging of Croatian and Serbian. In Proceedings of the 4th Biennial International Workshop on Balto-Slavic Natural Language Processing, pages 48–57, Sofia, Bulgaria, August. Association for Computational Linguistics.
-
(2013)
Proceedings of the 4th Biennial International Workshop on Balto-Slavic Natural Language Processing
, pp. 48-57
-
-
Agić, Željko1
Ljubešić, Nikola2
Merkler, Danijela3
-
4
-
-
85037352302
-
Cleaneval: a competition for cleaning web pages
-
[Baroni et al.2008]
-
[Baroni et al.2008] Marco Baroni, Francis Chantree, Adam Kilgarriff, and Serge Sharoff. 2008. Cleaneval: a competition for cleaning web pages. In Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC’08), Marrakech, Morocco. European Language Resources Association (ELRA).
-
(2008)
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC’08)
-
-
Baroni, Marco1
Chantree, Francis2
Kilgarriff, Adam3
Sharoff, Serge4
-
7
-
-
85106746987
-
HunPos: an open source trigram tagger
-
[Halácsy et al.2007]
-
[Halácsy et al.2007] Péter Halácsy, András Kornai, and Csaba Oravecz. 2007. HunPos: an open source trigram tagger. In Proceedings of the 45th Annual Meeting of the ACL on Interactive Poster and Demonstration Sessions, ACL’07, pages 209–212, Stroudsburg, PA, USA. Association for Computational Linguistics.
-
(2007)
Proceedings of the 45th Annual Meeting of the ACL on Interactive Poster and Demonstration Sessions, ACL’07
, pp. 209-212
-
-
Halácsy, Péter1
Kornai, András2
Oravecz, Csaba3
-
9
-
-
77950904942
-
Boilerplate detection using shallow text features
-
[Kohlschütter et al.2010]
-
[Kohlschütter et al.2010] Christian Kohlschütter, Peter Fankhauser, and Wolfgang Nejdl. 2010. Boilerplate detection using shallow text features. In Brian D. Davison, Torsten Suel, Nick Craswell, and Bing Liu, editors, WSDM, pages 441–450. ACM.
-
(2010)
WSDM
, pp. 441-450
-
-
Kohlschütter, Christian1
Fankhauser, Peter2
Nejdl, Wolfgang3
-
10
-
-
80052773421
-
hrWaC and slWac: Compiling Web Corpora for Croatian and Slovene
-
[Ljubešić and Erjavec2011]
-
[Ljubešić and Erjavec2011] Nikola Ljubešić and Tomaž Erjavec. 2011. hrWaC and slWac: Compiling Web Corpora for Croatian and Slovene. In Text, Speech and Dialogue - 14th International Conference, TSD 2011, Pilsen, Czech Republic, Lecture Notes in Computer Science, pages 395–402. Springer.
-
(2011)
Text, Speech and Dialogue - 14th International Conference, TSD 2011, Pilsen, Czech Republic, Lecture Notes in Computer Science
, pp. 395-402
-
-
Ljubešić, Nikola1
Erjavec, Tomaž2
-
11
-
-
85118481535
-
langid.py: An off-the-shelf language identification tool
-
[Lui and Baldwin2012]
-
[Lui and Baldwin2012] Marco Lui and Timothy Baldwin. 2012. langid.py: An off-the-shelf language identification tool. In ACL (System Demonstrations), pages 25–30.
-
(2012)
ACL (System Demonstrations)
, pp. 25-30
-
-
Lui, Marco1
Baldwin, Timothy2
-
15
-
-
84897949455
-
Efficient web crawling for large text corpora
-
[Suchomel and Pomikálek2012]
-
[Suchomel and Pomikálek2012] Vít Suchomel and Jan Pomikálek. 2012. Efficient web crawling for large text corpora. In Adam Kilgarriff and Serge Sharoff, editors, Proceedings of the seventh Web as Corpus Workshop (WAC7), pages 39–43, Lyon.
-
(2012)
Proceedings of the seventh Web as Corpus Workshop (WAC7)
, pp. 39-43
-
-
Suchomel, Vít1
Pomikálek, Jan2
-
16
-
-
84876815126
-
Efficient discrimination between closely related languages
-
[Tiedemann and Ljubešić2012]
-
[Tiedemann and Ljubešić2012] Jörg Tiedemann and Nikola Ljubešić. 2012. Efficient discrimination between closely related languages. In Proceedings of COLING 2012, pages 2619–2634, Mumbai, India.
-
(2012)
Proceedings of COLING 2012
, pp. 2619-2634
-
-
Tiedemann, Jörg1
Ljubešić, Nikola2