Volume 6815, 2008

Measuring the impact of character recognition errors on downstream text analysis

Author keywords

Error classification; Natural language processing; Optical character recognition; Text analysis

Indexed keywords

ERROR ANALYSIS; NATURAL LANGUAGE PROCESSING SYSTEMS; TEXT PROCESSING

EID: 41149164144     PISSN: 0277786X     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1117/12.767131     Document Type: Conference Paper
Times cited: 5

References (21)
  • 4
    • 0030151440
    • K. Taghva, J. Borsack, and A. Condit, "Effects of OCR errors on ranking and feedback using the vector space model," Information Processing and Management 32(3), pp. 317-327, 1996.
  • 5
    • 0002849652
    • K. Taghva, J. Borsack, and A. Condit, "Evaluation of model-based retrieval effectiveness with OCR text," ACM Transactions on Information Systems 14, pp. 64-93, January 1996.
  • 9
    • 41149113389
    • "Tesseract open source OCR engine," November 2007. http://sourceforge.net/projects/tesseract-ocr.
  • 16
    • 41149174659
    • "Project Gutenberg," November 2007. http://www.gutenberg.net/.
  • 18
    • 84886884351
    • J. Foster, "Treebanks gone bad: Generating a treebank of ungrammatical English," in Proceedings of the Workshop on Analytics for Noisy Unstructured Text Data, (Hyderabad, India), January 2007. http://research.ihost.com/and2007/cd/Proceedings.files/p39.pdf.
  • 20
    • 33644548762
    • M. Cannon, J. Hochberg, and P. Kelly, "Quality assessment and restoration of typewritten document images," Tech. Rep. LA-UR 99-1233, Los Alamos National Laboratory, 1999.


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.