-
4
-
-
23744496316
-
Text classification and multilinguism: Getting at words via n-grams of characters
-
July
-
I. Biskri and S. Delisle. Text classification and multilinguism: Getting at words via n-grams of characters. In Proceedings of SCI-2002, 6th World Multiconference on Systemics, Cybernetics and Informatics, volume 5, pages 110-115, July 2002.
-
(2002)
Proceedings of SCI-2002, 6th World Multiconference on Systemics, Cybernetics and Informatics
, vol.5
, pp. 110-115
-
-
Biskri, I.1
Delisle, S.2
-
6
-
-
0002636321
-
N-gram-based text categorization
-
Las Vegas, Nevada, U.S.A.
-
W. B. Cavnar and J. M. Trenkle. N-gram-based text categorization. In Proceedings of SDAIR-94, the 3rd Annual Symposium on Document Analysis and Information Retrieval, pages 161-175, Las Vegas, Nevada, U.S.A. 1994.
-
(1994)
Proceedings of SDAIR-94, the 3rd Annual Symposium on Document Analysis and Information Retrieval
, pp. 161-175
-
-
Cavnar, W.B.1
Trenkle, J.M.2
-
7
-
-
0032663163
-
Mining the Web's link structure
-
August
-
S. Chakrabarti, B. E. Dom, S. R. Kumar, P. Raghavan, S. Rajagopalan, A. Tomkins, D. Gibson, and J. Kleinberg. Mining the Web's link structure. Computer, 32(8):60-67, August 1999.
-
(1999)
Computer
, vol.32
, Issue.8
, pp. 60-67
-
-
Chakrabarti, S.1
Dom, B.E.2
Kumar, S.R.3
Raghavan, P.4
Rajagopalan, S.5
Tomkins, A.6
Gibson, D.7
Kleinberg, J.8
-
8
-
-
85024115120
-
An empirical study of smoothing techniques for language modeling
-
A. Joshi and M. Palmer, editors, San Francisco. Morgan Kaufmann Publishers
-
S. F. Chen and J. Goodman. An empirical study of smoothing techniques for language modeling. In A. Joshi and M. Palmer, editors, Proceedings of ACL-96, the 34th Annual Meeting of the Association for Computational Linguistics, pages 310-318, San Francisco, 1996. Morgan Kaufmann Publishers.
-
(1996)
Proceedings of ACL-96, the 34th Annual Meeting of the Association for Computational Linguistics
, pp. 310-318
-
-
Chen, S.F.1
Goodman, J.2
-
10
-
-
0028911698
-
Gauging similarity with n-grams: Language independent categorization of text
-
M. Damashek. Gauging similarity with n-grams: Language independent categorization of text. Science, 267(5199):843-848, 1995.
-
(1995)
Science
, vol.267
, Issue.5199
, pp. 843-848
-
-
Damashek, M.1
-
11
-
-
0003984557
-
Statistical identification of language
-
New Mexico State University
-
T. Dunning. Statistical identification of language. Technical Report MCCS 94-273, New Mexico State University, 1994.
-
(1994)
Technical Report
, vol.MCCS 94-273
-
-
Dunning, T.1
-
12
-
-
0000803388
-
The population frequencies of species and the estimation of population parameters
-
I. J. Good. The population frequencies of species and the estimation of population parameters. Biometrika, 40:237-264, 1953.
-
(1953)
Biometrika
, vol.40
, pp. 237-264
-
-
Good, I.J.1
-
14
-
-
33644534588
-
Language identification for the automatic grapheme-to-phoneme conversion of foreign words in a German text-to-speech system
-
September
-
P. Henrich. Language identification for the automatic grapheme-to-phoneme conversion of foreign words in a German text-to-speech system. In Proceedings of Eurospeech 1989, European Speech Communication and Technology, pages 220-223, September 1989.
-
(1989)
Proceedings of Eurospeech 1989, European Speech Communication and Technology
, pp. 220-223
-
-
Henrich, P.1
-
15
-
-
33644547428
-
Information space based on HTML structure
-
E. M. Voorhees and D. K. Harman, editors, Department of Commerce of National Institute of Standards and Technology
-
C. Hill. Information space based on HTML structure. In E. M. Voorhees and D. K. Harman, editors, Proceedings of TREC-9, the 9th Text REtrieval Conference. Department of Commerce of National Institute of Standards and Technology, 2000.
-
(2000)
Proceedings of TREC-9, the 9th Text REtrieval Conference
-
-
Hill, C.1
-
17
-
-
0005180705
-
An information-theoretic definition of similarity
-
Morgan Kaufmann, San Francisco, CA
-
D. Lin. An information-theoretic definition of similarity. In Proceedings of ICML-98, the 15th International Conference on Machine Learning, pages 296-304. Morgan Kaufmann, San Francisco, CA, 1998.
-
(1998)
Proceedings of ICML-98, the 15th International Conference on Machine Learning
, pp. 296-304
-
-
Lin, D.1
-
20
-
-
3843127500
-
Character n-gram tokenization for European language text retrieval
-
April
-
P. McNamee and J. Mayfield. Character n-gram tokenization for European language text retrieval. Information Retrieval, 7, April 2004.
-
(2004)
Information Retrieval
, vol.7
-
-
McNamee, P.1
Mayfield, J.2
-
21
-
-
0003268207
-
Performance and scalability of a large-scale n-gram based information retrieval system
-
E. Miller, D. Shen, J. Liu, and C. Nicholas. Performance and scalability of a large-scale n-gram based information retrieval system. Journal of Digital Information, 1(21), 2000.
-
(2000)
Journal of Digital Information
, vol.1
, Issue.21
-
-
Miller, E.1
Shen, D.2
Liu, J.3
Nicholas, C.4
-
23
-
-
1542340317
-
N-gram term weighting: A comparative analysis
-
National Security Agency Technical, January
-
C. Pearce and B. Rye. N-gram term weighting: A comparative analysis. Technical Report TR-R52-001-98, National Security Agency Technical, January 1998.
-
(1998)
Technical Report
, vol.TR-R52-001-98
-
-
Pearce, C.1
Rye, B.2
-
26
-
-
85119187881
-
Natural language identification using corpus-based models
-
C. Souter, G. Churcher, J. Hayes, J. Hughes, and S. Johnson. Natural language identification using corpus-based models. Hermes Journal of Linguistics, 13:183-203, 1994.
-
(1994)
Hermes Journal of Linguistics
, vol.13
, pp. 183-203
-
-
Souter, C.1
Churcher, G.2
Hayes, J.3
Hughes, J.4
Johnson, S.5