Volume , Issue SPEC. ISS., 2003, Pages 104-110

A Repetition Based Measure for Verification of Text Collections and for Text Categorization

Author keywords

Cross entropy; Language modeling; Text categorization; Text compression

Indexed keywords

ALGORITHMS; ARRAYS; COMPUTATIONAL METHODS; COMPUTER SOFTWARE SELECTION AND EVALUATION; DATA STRUCTURES; FORMAL LANGUAGES; NATURAL LANGUAGE PROCESSING SYSTEMS; STANDARDS;

EID: 1542377547     PISSN: 01635840     EISSN: None     Source Type: Journal    
DOI: 10.1145/860454.860456     Document Type: Conference Paper
Times cited: 69

References (22)
  • 1
    • 0001868131
    • Reducing multiclass to binary: A unifying approach for margin classifiers
    • Morgan Kaufmann, San Francisco, CA
    • E. L. Allwein, R. E. Schapire, and Y. Singer. Reducing multiclass to binary: A unifying approach for margin classifiers. In Proc. 17th International Conf. on Machine Learning, pages 9-16. Morgan Kaufmann, San Francisco, CA, 2000.
    • (2000) Proc. 17th International Conf. on Machine Learning , pp. 9-16
    • Allwein, E.L.1    Schapire, R.E.2    Singer, Y.3
  • 2
    • 0003573193
    • A block-sorting lossless data compression algorithm
    • Digital SRC, Palo Alto
    • M. Burrows and D. J. Wheeler. A block-sorting lossless data compression algorithm. Technical Report 124, Digital SRC, Palo Alto, 1994.
    • (1994) Technical Report , vol.124
    • Burrows, M.1    Wheeler, D.J.2
  • 3
    • 0008965347
    • On the learnability and design of output codes for multiclass problems
    • K. Crammer and Y. Singer. On the learnability and design of output codes for multiclass problems. In Computational Learning Theory, pages 35-46, 2000.
    • (2000) Computational Learning Theory , pp. 35-46
    • Crammer, K.1    Singer, Y.2
  • 4
    • 0142253939
    • A theoretical and experimental study on the construction of suffix arrays in external memory
    • A. Crauser and P. Ferragina. A theoretical and experimental study on the construction of suffix arrays in external memory. Algorithmica, 32(1):1-35, 2002.
    • (2002) Algorithmica , vol.32 , Issue.1 , pp. 1-35
    • Crauser, A.1    Ferragina, P.2
  • 6
    • 0034506014
    • Opportunistic data structures with applications
    • IEEE Comput. Soc. Press, Los Alamitos, CA
    • P. Ferragina and G. Manzini. Opportunistic data structures with applications. In 41st Ann. Symp. on Found. of Comput. Sc., pages 390-398. IEEE Comput. Soc. Press, Los Alamitos, CA, 2000.
    • (2000) 41st Ann. Symp. on Found. of Comput. Sc. , pp. 390-398
    • Ferragina, P.1    Manzini, G.2
  • 7
    • 0033894701
    • Text categorization using compression models
    • Snowbird, US. IEEE Computer Society Press
    • E. Frank, C. Chui, and I. H. Witten. Text categorization using compression models. In Proceedings of DCC-00, IEEE DCC, pages 200-209, Snowbird, US, 2000. IEEE Computer Society Press.
    • (2000) Proceedings of DCC-00, IEEE DCC , pp. 200-209
    • Frank, E.1    Chui, C.2    Witten, I.H.3
  • 9
    • 1542340393
    • Disputed Authorship Resolution through Using Relative Empirical Entropy for Markov Chains of Letters in Human Language Text
    • D. Khmelev. Disputed Authorship Resolution through Using Relative Empirical Entropy for Markov Chains of Letters in Human Language Text. J. of Quantitative Linguistics, 7(3):201-207, 2000.
    • (2000) J. of Quantitative Linguistics , vol.7 , Issue.3 , pp. 201-207
    • Khmelev, D.1
  • 10
    • 25944434925
    • Verification of text collections for text categorization and natural language processing
    • School of Informatics, Univ. of Wales, Bangor
    • D. V. Khmelev and W. J. Teahan. Verification of text collections for text categorization and natural language processing. Technical Report AIIA 03.1, School of Informatics, Univ. of Wales, Bangor, 2003.
    • (2003) Technical Report , vol.AIIA 03.1
    • Khmelev, D.V.1    Teahan, W.J.2
  • 12
    • 0027681165
    • Suffix arrays: A new method for on-line string searches
    • U. Manber and G. Myers. Suffix arrays: a new method for on-line string searches. SIAM J. Comput., 22(5):935-948, 1993.
    • (1993) SIAM J. Comput. , vol.22 , Issue.5 , pp. 935-948
    • Manber, U.1    Myers, G.2
  • 13
    • 84948481845
    • An algorithm for suffix stripping
    • M. Porter. An algorithm for suffix stripping. Program, 14(3):130-137, 1980.
    • (1980) Program , vol.14 , Issue.3 , pp. 130-137
    • Porter, M.1
  • 16
    • 0013254819
    • Duplicate detection in the Reuters collection
    • Department of Computer Science, Univ. of Glasgow
    • M. Sanderson. Duplicate detection in the Reuters collection. Technical report, Department of Computer Science, Univ. of Glasgow, 1997.
    • (1997) Technical Report
    • Sanderson, M.1
  • 17
    • 84856043672
    • A mathematical theory of communication
    • 1948
    • C. E. Shannon. A mathematical theory of communication. Bell System Technical Journal, 27:379-423, 623-656, 1948.
    • (1948) Bell System Technical Journal , vol.27 , pp. 379-423
    • Shannon, C.E.1
  • 18
    • 0035769083
    • Improving the efficiency of the PPM algorithm
    • D. Shkarin. Improving the efficiency of the PPM algorithm. Problems of Information Transmission, 37(3):226-235, 2001.
    • (2001) Problems of Information Transmission , vol.37 , Issue.3 , pp. 226-235
    • Shkarin, D.1
  • 19
    • 1542310280
    • Text classification and segmentation using minimum cross-entropy
    • Paris, France
    • W. J. Teahan. Text classification and segmentation using minimum cross-entropy. In Proc. RIAO'2000, volume 2, pages 943-961, Paris, France, 2000.
    • (2000) Proc. RIAO'2000 , vol.2 , pp. 943-961
    • Teahan, W.J.1
  • 20
    • 1242280857
    • Using compression-based language models for text categorization
    • Carnegie Mellon Univ., May
    • W. J. Teahan and D. J. Harper. Using compression-based language models for text categorization. In Workshop on Lang. Modeling and Inform. Retrieval, pages 83-88. Carnegie Mellon Univ., May 2001.
    • (2001) Workshop on Lang. Modeling and Inform. Retrieval , pp. 83-88
    • Teahan, W.J.1    Harper, D.J.2
  • 21


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.