-
1
-
-
85015762649
-
Language Identification from Text Using n-Gram Based Cumulative Frequency Addition
-
CSIS, Pace University, May 7th, 2004
-
Ahmed B, Cha SH, Tappert C (2004). "Language Identification from Text Using n-Gram Based Cumulative Frequency Addition." In Proceedings of Student/Faculty Research Day, CSIS, Pace University, May 7th, 2004. URL http://www.csis.pace.edu/~ctappert/srd2004/paper12.pdf.
-
(2004)
In Proceedings of Student/Faculty Research Day
-
-
Ahmed, B.1
Cha, S.H.2
Tappert, C.3
-
3
-
-
84868195813
-
Data in Your Language: The ECI Multilingual Corpus 1
-
Nara, Japan
-
Armstrong-Warwick S, Thompson HS, McKelvie D, Petitpierre D (1994). "Data in Your Language: The ECI Multilingual Corpus 1." In Proceedings of the International Workshop on Sharable Natural Language Resources, pp. 97-106. Nara, Japan. URL http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.44.950.
-
(1994)
Proceedings of the International Workshop on Sharable Natural Language Resources
, pp. 97-106
-
-
Armstrong-Warwick, S.1
Thompson, H.S.2
McKelvie, D.3
Petitpierre, D.4
-
9
-
-
0003984557
-
-
Technical Report MCCS 94-273, Computing Research Lab (CRL), New Mexico State University. URL
-
Dunning T (1994). "Statistical Identification of Language." Technical Report MCCS 94-273, Computing Research Lab (CRL), New Mexico State University. URL http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.48.1958.
-
(1994)
Statistical Identification of Language
-
-
Dunning, T.1
-
10
-
-
0038217041
-
The Distribution of N-Grams
-
Egghe L (2000). "The Distribution of N-Grams." Scientometrics, 47(2), 237-252.
-
(2000)
Scientometrics
, vol.47
, Issue.2
, pp. 237-252
-
-
Egghe, L.1
-
11
-
-
46749110035
-
Text Mining Infrastructure in R
-
Feinerer I, Hornik K, Meyer D (2008). "Text Mining Infrastructure in R." Journal of Statistical Software, 25(5), 1-54. URL http://www.jstatsoft.org/v25/i05/.
-
(2008)
Journal of Statistical Software
, vol.25
, Issue.5
, pp. 1-54
-
-
Feinerer, I.1
Hornik, K.2
Meyer, D.3
-
13
-
-
75149156482
-
Extending Zipf's Law to n-Grams for Large Corpora
-
Ha LQ, Hanna P, Ming J, Smith FJ (2009). "Extending Zipf's Law to n-Grams for Large Corpora." Artificial Intelligence Review, 32(1-4), 101-113.
-
(2009)
Artificial Intelligence Review
, vol.32
, Issue.1-4
, pp. 101-113
-
-
Ha, L.Q.1
Hanna, P.2
Ming, J.3
Smith, F.J.4
-
14
-
-
85134891245
-
Language Identification for the Automatic Grapheme-to-Phoneme Conversion of Foreign Words in a German Text-to-Speech System
-
(First European Conference on Speech Communication and Technology)
-
Henrich P (1989). "Language Identification for the Automatic Grapheme-to-Phoneme Conversion of Foreign Words in a German Text-to-Speech System." In EUROSPEECH-1989 (First European Conference on Speech Communication and Technology), pp. 2220-2223. URL http://www.isca-speech.org/archive/eurospeech_1989/e89_2220.html.
-
(1989)
EUROSPEECH-1989
, pp. 2220-2223
-
-
Henrich, P.1
-
16
-
-
0345555473
-
A Language Identification Table
-
Ingle NC (1976). "A Language Identification Table." The Incorporated Linguist, 15(4), 98-101.
-
(1976)
The Incorporated Linguist
, vol.15
, Issue.4
, pp. 98-101
-
-
Ingle, N.C.1
-
18
-
-
58149462758
-
A Machine Learning Approach for Arabic Text Classification Using n-Gram Frequency Statistics
-
doi:10.1016/j.joi.2008.11.005
-
Khreisat L (2009). "A Machine Learning Approach for Arabic Text Classification Using n-Gram Frequency Statistics." Journal of Informetrics, 3(1), 72-77. doi:10.1016/j.joi.2008.11.005.
-
(2009)
Journal of Informetrics
, vol.3
, Issue.1
, pp. 72-77
-
-
Khreisat, L.1
-
19
-
-
48349136970
-
Language Identification: How to Distinguish Similar Languages
-
In V Lužar-Stifter, VH Dobrić (eds.), SRCE University Computing Centre, Zagreb. URL
-
Ljubešić N, Mikelić N, Boras D (2007). "Language Identification: How to Distinguish Similar Languages." In V Lužar-Stifter, VH Dobrić (eds.), Proceedings of the 29th International Conference on Information Technology Interfaces, pp. 541-546. SRCE University Computing Centre, Zagreb. URL http://www.nljubesic.net/main/publications_files/ljubesic07-language.pdf.
-
(2007)
Proceedings of the 29th International Conference on Information Technology Interfaces
, pp. 541-546
-
-
Ljubešić, N.1
Mikelić, N.2
Boras, D.3
-
20
-
-
67650083674
-
-
Technical report, Cavendish Laboratory, Cambridge, The Inference Group. URL
-
Murray IA (2002). "Probabilistic Language Modelling." Technical report, Cavendish Laboratory, Cambridge, The Inference Group. URL http://www.inference.phy.cam.ac.uk/is/papers/langreport.pdf.
-
(2002)
Probabilistic Language Modelling
-
-
Murray, I.A.1
-
21
-
-
24744447069
-
Multiple Discriminant Analysis in Linguistic Problems
-
Stockholm
-
Mustonen S (1965). "Multiple Discriminant Analysis in Linguistic Problems." Statistical Methods in Linguistics, 4, 37-44. Stockholm.
-
(1965)
Statistical Methods in Linguistics
, vol.4
, pp. 37-44
-
-
Mustonen, S.1
-
23
-
-
84863304598
-
-
R Core Team, R Foundation for Statistical Computing, Vienna, Austria. ISBN 3-900051-07-0
-
R Core Team (2012). R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria. ISBN 3-900051-07-0, URL http://www.R-project.org/.
-
(2012)
R: A Language and Environment for Statistical Computing
-
-
-
24
-
-
41149154578
-
The Crúbadán Project: Corpus building for under-resourced languages
-
In C Fairon, H Naets, A Kilgarriff, GM de Schryver (eds.), Presses universitaires de Louvain, Louvain-la-Neuve, Belgium. URL
-
Scannell KP (2007). "The Crúbadán Project: Corpus building for under-resourced languages." In C Fairon, H Naets, A Kilgarriff, GM de Schryver (eds.), Building and Exploring Web Corpora: Proceedings of the 3rd Web as Corpus Workshop, volume 4 of Cahiers du Cental, pp. 5-15. Presses universitaires de Louvain, Louvain-la-Neuve, Belgium. URL http://borel.slu.edu/pub/wac3.pdf.
-
(2007)
Building and Exploring Web Corpora: Proceedings of the 3rd Web as Corpus Workshop, Volume 4 of Cahiers Du Cental
, pp. 5-15
-
-
Scannell, K.P.1
-
25
-
-
0004137163
-
Language Identification: Examining the Issues
-
Las Vegas, Nevada, U.S.A. URL
-
Sibun P, Reynar JC (1996). "Language Identification: Examining the Issues." In 5th Symposium on Document Analysis and Information Retrieval, pp. 125-135. Las Vegas, Nevada, U.S.A. URL http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.52.4524.
-
(1996)
5th Symposium on Document Analysis and Information Retrieval
, pp. 125-135
-
-
Sibun, P.1
Reynar, J.C.2
-
26
-
-
78651342601
-
Study of Some Distance Measures for Language and Encoding Identification
-
Sydney, July 2006
-
Singh AK (2006). "Study of Some Distance Measures for Language and Encoding Identification." In Proceedings in the Workshop on Linguistic Distances, Sydney, July 2006, pp. 63-72. URL http://acl.ldc.upenn.edu/W/W06/W06-1109.pdf.
-
(2006)
Proceedings in the Workshop on Linguistic Distances
, pp. 63-72
-
-
Singh, A.K.1
-
27
-
-
85119187881
-
Natural Language Identification Using Corpus-Based Models
-
Souter C, Churcher G, Hayes J, Hughes J, Johnson S (1994). "Natural Language Identification Using Corpus-Based Models." Hermes Journal of Linguistics, 13, 183-203. URL http://download2.hermes.asb.dk/archive/FreeH/H13_15.pdf.
-
(1994)
Hermes Journal of Linguistics
, vol.13
, pp. 183-203
-
-
Souter, C.1
Churcher, G.2
Hayes, J.3
Hughes, J.4
Johnson, S.5
-
28
-
-
20344398381
-
-
van Noord G (1997). "TextCat." URL http://odur.let.rug.nl/~vannoord/TextCat.
-
(1997)
TextCat
-
-
van Noord, G.1
-
29
-
-
84855721130
-
-
Wikipedia, accessed 2013-01-15
-
Wikipedia (2013a). "n-Gram - Wikipedia, The Free Encyclopedia." URL http://en.wikipedia.org/wiki/N-gram, accessed 2013-01-15.
-
(2013)
N-Gram - Wikipedia, the Free Encyclopedia
-
-
-
31
-
-
84855721130
-
-
Wikipedia, accessed 2013-01-15
-
Wikipedia (2013c). "XPath - Wikipedia, The Free Encyclopedia." URL http://en.wikipedia.org/wiki/XPath, accessed 2013-01-15.
-
(2013)
XPath - Wikipedia, the Free Encyclopedia
-
-