



Volume 1, Issue 3, 2002, Pages 269-278

A Language and Character Set Determination Method Based on N-gram Statistics

Author keywords

Algorithms; character set; corpus based analysis; Languages; local language site; N gram; natural languages; Text categorization; Unicode



EID: 80155181779     PISSN: 15300226     EISSN: 15583430     Source Type: Journal    
DOI: 10.1145/772755.772759     Document Type: Article
Times cited: 33

References (6)
  • 2
    • 0033894701
    • Text categorization using compression models
    • (Snowbird, UT, March 2000). IEEE Computer Society
    • Frank, E., Chui, C. and Witten, I. H. 2000. Text categorization using compression models. In Proceedings of the IEEE Data Compression Conference (Snowbird, UT, March 2000). IEEE Computer Society, 276-288
    • (2000) Proceedings of the IEEE Data Compression Conference , pp. 276-288
    • Frank, E.1    Chui, C.2    Witten, I.H.3
  • 3
    • 0003612818
    • Foundations of Statistical Natural Language Processing
    • The MIT Press
    • Manning, C. D. and Schütze, H. 1999. Foundations of Statistical Natural Language Processing. The MIT Press
    • (1999)
    • Manning, C.D.1    Schütze, H.2
  • 4
    • 85025379084
    • Natural language determination using correlation between common words
    • U.S. Patent No. 6,023,670
    • Martino, M. J. et al. 2000. Natural language determination using correlation between common words. U.S. Patent No. 6,023,670
    • Martino, M.J.1
  • 5
    • 84874005368
    • Natural language determination using partial words
    • U.S. Patent No. 6,216,102
    • Martino, M. J. et al. 2001. Natural language determination using partial words. U.S. Patent No. 6,216,102
    • Martino, M.J.1


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.