-
2
-
-
68849129046
-
Using foreign inclusion detection to improve parsing performance
-
Prague, Czech Republic
-
Beatrice Alex, Amit Dubey, and Frank Keller. 2007. Using foreign inclusion detection to improve parsing performance. In Proceedings of the Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning 2007 (EMNLPCoNLL 2007), pages 151-160, Prague, Czech Republic.
-
(2007)
Proceedings of the Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning 2007 (EMNLPCoNLL 2007)
, pp. 151-160
-
-
Alex, Beatrice1
Dubey, Amit2
Keller, Frank3
-
9
-
-
0028911698
-
Gauging similarity with n-grams: Language-independent categorization of text
-
Marc Damashek. 1995. Gauging similarity with n-grams: Language-independent categorization of text. Science, 267:843-848.
-
(1995)
Science
, vol.267
, pp. 843-848
-
-
Damashek, Marc1
-
13
-
-
0003984557
-
-
Technical Report MCCS 940-273, Computing Research Laboratory, New Mexico State University
-
Ted Dunning. 1994. Statistical identification of language. Technical Report MCCS 940-273, Computing Research Laboratory, New Mexico State University.
-
(1994)
Statistical identification of language
-
-
Dunning, Ted1
-
14
-
-
2942731012
-
An Extensive Empirical Study of Feature Selection Metrics for Text Classification
-
George Forman. 2003. An Extensive Empirical Study of Feature Selection Metrics for Text Classification. Journal of Machine Learning Research, 3(7-8):1289-1305.
-
(2003)
Journal of Machine Learning Research
, vol.3
, Issue.7-8
, pp. 1289-1305
-
-
Forman, George1
-
15
-
-
20344402818
-
Building Minority Language Corpora by Learning to Generate Web Search Queries
-
Rayid Ghani, Rosie Jones, and Dunja Mladenic. 2004. Building Minority Language Corpora by Learning to Generate Web Search Queries. Knowledge and Information Systems, 7(1):56-83.
-
(2004)
Knowledge and Information Systems
, vol.7
, Issue.1
, pp. 56-83
-
-
Ghani, Rayid1
Jones, Rosie2
Mladenic, Dunja3
-
17
-
-
49949150022
-
Language identification in the limit
-
E. Mark Gold. 1967. Language identification in the limit. Information and Control, 5:447-474.
-
(1967)
Information and Control
, vol.5
, pp. 447-474
-
-
Gold, E. Mark1
-
20
-
-
85039912916
-
Reconsidering language identification for written language resources
-
Genoa, Italy
-
Baden Hughes, Timothy Baldwin, Steven Bird, Jeremy Nicholson, and Andrew MacKinlay. 2006. Reconsidering language identification for written language resources. In Proceedings of the 5th International Conference on Language Resources and Evaluation (LREC 2006), pages 485-488, Genoa, Italy.
-
(2006)
Proceedings of the 5th International Conference on Language Resources and Evaluation (LREC 2006)
, pp. 485-488
-
-
Hughes, Baden1
Baldwin, Timothy2
Bird, Steven3
Nicholson, Jeremy4
MacKinlay, Andrew5
-
23
-
-
33750133228
-
Language identification based on string kernels
-
Beijing, China
-
Canasai Kruengkrai, Prapass Srichaivattana, Virach Sornlertlamvanich, and Hitoshi Isahara. 2005. Language identification based on string kernels. In Proceedings of the 5th International Symposium on Communications and Information Technologies (ISCIT-2005), pages 896-899, Beijing, China.
-
(2005)
Proceedings of the 5th International Symposium on Communications and Information Technologies (ISCIT-2005)
, pp. 896-899
-
-
Kruengkrai, Canasai1
Srichaivattana, Prapass2
Sornlertlamvanich, Virach3
Isahara, Hitoshi4
-
29
-
-
3843127500
-
Character N-Gram Tokenization for European Language Text Retrieval
-
Paul McNamee and James Mayfield. 2004. Character N-Gram Tokenization for European Language Text Retrieval. Information Retrieval, 7(1-2):73-97.
-
(2004)
Information Retrieval
, vol.7
, Issue.1-2
, pp. 73-97
-
-
McNamee, Paul1
Mayfield, James2
-
31
-
-
33744584654
-
Induction of Decision Trees
-
October
-
J.R. Quinlan. 1986. Induction of Decision Trees. Machine Learning, 1(1):81-106, October.
-
(1986)
Machine Learning
, vol.1
, Issue.1
, pp. 81-106
-
-
Quinlan, J.R.1
-
32
-
-
85037539156
-
The JRC-Acquis: A multilingual aligned parallel corpus with 20+ languages
-
Genoa, Italy
-
Ralf Steinberger, Bruno Pouliquen, Anna Widiger, Camelia Ignat, Tomaž Erjavec, Dan Tufis, and Dániel Varga. 2006. The JRC-Acquis: A multilingual aligned parallel corpus with 20+ languages. In Proceedings of the 5th International Conference on Language Resources and Evaluation (LREC'2006), Genoa, Italy.
-
(2006)
Proceedings of the 5th International Conference on Language Resources and Evaluation (LREC'2006)
-
-
Steinberger, Ralf1
Pouliquen, Bruno2
Widiger, Anna3
Ignat, Camelia4
Erjavec, Tomaž5
Tufis, Dan6
Varga, Dániel7
-
33
-
-
20344398381
-
-
Software
-
Gertjan van Noord. 1997. TextCat. Software available at http://odur.let.rug.nl/~vannoord/TextCat/.
-
(1997)
TextCat
-
-
van Noord, Gertjan1
-
34
-
-
84858381103
-
Applying NLP technologies to the collection and enrichment of language data on the web to aid linguistic research
-
Athens, Greece
-
Fei Xia and William Lewis. 2009. Applying NLP technologies to the collection and enrichment of language data on the web to aid linguistic research. In Proceedings of the EACL 2009 Workshop on Language Technology and Resources for Cultural Heritage, Social Sciences, Humanities, and Education (LaTeCH - SHELT&R 2009), pages 51-59, Athens, Greece.
-
(2009)
Proceedings of the EACL 2009 Workshop on Language Technology and Resources for Cultural Heritage, Social Sciences, Humanities, and Education (LaTeCH - SHELT&R 2009)
, pp. 51-59
-
-
Xia, Fei1
Lewis, William2
-
35
-
-
84858393694
-
Language ID in the context of harvesting language data off the web
-
Athens, Greece
-
Fei Xia, William Lewis, and Hoifung Poon. 2009. Language ID in the context of harvesting language data off the web. In Proceedings of the 12th Conference of the EACL (EACL 2009), pages 870-878, Athens, Greece.
-
(2009)
Proceedings of the 12th Conference of the EACL (EACL 2009)
, pp. 870-878
-
-
Xia, Fei1
Lewis, William2
Poon, Hoifung3