Kenneth Beesley. 1988. Language identifier: A computer program for automatic natural-language identification of on-line text. In Proceedings of the 29th ATA Annual Conference, pages 47-54.
Gerrit Reinier Botha and Etienne Barnard. 2007. Factors that affect the accuracy of text-based language identification. In Proceedings of PRASA 2007, pages 7-10.
Stanley F. Chen and Joshua Goodman. 1999. An empirical study of smoothing techniques for language modeling. Computer Speech & Language, 13(4):359-393.
Ron Cole, Joseph Mariani, Hans Uszkoreit, Giovanni Varile, Annie Zaenen, Victor Zue, and Antonio Zampolli, editors. 1997. Survey of the State of the Art in Human Language Technology. Cambridge University Press. Web Edition in PDF.
Joaquim Ferreira da Silva and Gabriel Pereira Lopes. 2006. Identification of document language is not yet a completely solved problem. In Proceedings of CIMCA'06, pages 212-219.
Marc Damashek. 1995. Gauging similarity with n-grams: Language-independent categorization of text. Science, 267(5199):843-849.
Jacques Duchateau, Kris Demuynck, and Patrick Wambacq. 2002. Confidence scoring based on backward language models. In Proceedings of ICASSP 2002, Volume 1, pages 221-224.
I. J. Good. 1953. The population frequencies of species and the estimation of population parameters. Biometrika, 40:237-264.
Joshua Goodman and Jianfeng Gao. 2000. Language model size reduction by pruning and clustering. In Proceedings of ICSLP, pages 16-20.
Arthur S. House and Edward P. Neuburg. 1977. Toward automatic identification of the language of an utterance. I. Preliminary methodological considerations. Journal of the Acoustical Society of America, 62(3):708-713.
Slava M. Katz. 1987. Estimation of probabilities from sparse data for the language model component of a speech recognizer. IEEE Transactions on Acoustics, Speech, and Signal Processing, 35:400-401.
Reinhard Kneser and Hermann Ney. 1995. Improved backing-off for m-gram language modeling. In Proceedings of ICASSP-95, pages 181-184.
Paul McNamee. 2005. Language identification: A solved problem suitable for undergraduate instruction. Journal of Computing Sciences in Colleges, 20(3):94-101.
Hermann Ney, Ute Essen, and Reinhard Kneser. 1994. On structuring probabilistic dependence in stochastic language modelling. Computer Speech and Language, 8(1):1-38.
Radim Řehůřek and Milan Kolkus. 2009. Language identification on the web: Extending the dictionary method. In Proceedings of CICLing 2009, pages 357-368.
Penelope Sibun and Jeffrey C. Reynar. 1996. Language identification: Examining the issues. In Proceedings of SDAIR'96, pages 125-135.
Vesa Siivola, Teemu Hirsimäki, and Sami Virpioja. 2007. On growing and pruning Kneser-Ney smoothed n-gram models. IEEE Transactions on Audio, Speech & Language Processing, 15(5):1617-1624.
Andreas Stolcke. 2002. SRILM - An extensible language modeling toolkit. In Proceedings of ICSLP, pages 901-904. http://www.speech.sri.com/projects/srilm/.
William John Teahan. 2000. Text classification and segmentation using minimum cross-entropy. In Proceedings of RIAO'00, pages 943-961.
Chengxiang Zhai and John Lafferty. 2001. A study of smoothing methods for language models applied to ad hoc information retrieval. In Proceedings of SIGIR'01, pages 334-342.