2009, Pages 115-122

A survey of types of text noise and techniques to handle noisy text

Author keywords

Information extraction; Information retrieval; Natural language processing; Noisy text; Text mining

Indexed keywords

APPLICATION AREA; AUTOMATIC SPEECH RECOGNITION; COMPUTER PROCESSING; DIGITAL TEXT; HUMAN USE; INFORMATION EXTRACTION; MACHINE TRANSLATIONS; MESSAGE BOARDS; NATURAL LANGUAGE PROCESSING; NEWSGROUPS; PROCESSING SIGNAL; REAL-WORLD NOISE; TEXT DOCUMENT; TEXT MINING; WEB PAGE

EID: 70450191125     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1145/1568296.1568315     Document Type: Conference Paper
Times cited: 72

References (35)
  • 5
    • 70349458373
    • Confidence metrics based on n-gram language model backoff behaviors
    • U. Berdy, C. Uhrik, and W. Ward. Confidence metrics based on n-gram language model backoff behaviors. In Proc. EUROSPEECH, pages 2771-2774, 1997.
    • (1997) Proc. EUROSPEECH , pp. 2771-2774
    • Berdy, U.1    Uhrik, C.2    Ward, W.3
  • 6
    • 85044611587
    • The mathematics of statistical machine translation: Parameter estimation
    • P. F. Brown, V. J. Pietra, S. A. D. Pietra, and R. L. Mercer. The mathematics of statistical machine translation: Parameter estimation. Computational Linguistics, 19:263-311, 1993.
    • (1993) Computational Linguistics , vol.19 , pp. 263-311
    • Brown, P.F.1    Pietra, V.J.2    Pietra, S.A.D.3    Mercer, R.L.4
  • 12
    • 0141702347
    • Optimizing SVMs for complex call classification
    • P. Haffner, G. Tur, and J. Wright. Optimizing SVMs for complex call classification. In Proc. of ICASSP, 2003.
    • (2003) Proc. of ICASSP
    • Haffner, P.1    Tur, G.2    Wright, J.3
  • 16
    • 0031619914
    • Rejection of out-of-vocabulary words using phoneme confidence likelihood
    • T. Jitsuhiro, S. Takahashi, and K. Aikawa. Rejection of out-of-vocabulary words using phoneme confidence likelihood. In Proc. ICASSP98, pages 217-220, 1998.
    • (1998) Proc. ICASSP98 , pp. 217-220
    • Jitsuhiro, T.1    Takahashi, S.2    Aikawa, K.3
  • 17
    • 85135146711
    • Estimating confidence using word lattices
    • T. Kemp and T. Schaaf. Estimating confidence using word lattices. In Proceedings of EuroSpeech, pages 827-830, 1997.
    • (1997) Proceedings of EuroSpeech , pp. 827-830
    • Kemp, T.1    Schaaf, T.2
  • 18
    • 25144482876
    • Automatic filtering of bilingual corpora for statistical machine translation
    • S. Khadivi and H. Ney. Automatic filtering of bilingual corpora for statistical machine translation. In NLDB, pages 263-274, 2005.
    • (2005) NLDB , pp. 263-274
    • Khadivi, S.1    Ney, H.2
  • 21
    • 0026979939
    • Techniques for automatically correcting words in text
    • K. Kukich. Techniques for automatically correcting words in text. ACM Comput. Surv., 24(4):377-439, 1992.
    • (1992) ACM Comput. Surv. , vol.24 , Issue.4 , pp. 377-439
    • Kukich, K.1
  • 22
    • 84943654381
    • Patterns of search: Analyzing and modeling web query refinement
    • Secaucus, NJ, USA, Springer-Verlag New York, Inc
    • T. Lau and E. Horvitz. Patterns of search: analyzing and modeling web query refinement. In UM '99: Proceedings of the seventh international conference on User modeling, pages 119-128, Secaucus, NJ, USA, 1999. Springer-Verlag New York, Inc.
    • (1999) UM '99: Proceedings of the Seventh International Conference on User Modeling , pp. 119-128
    • Lau, T.1    Horvitz, E.2
  • 34
    • 0030151440
    • Effects of OCR errors on ranking and feedback using the vector space model
    • K. Taghva, J. Borsack, and A. Condit. Effects of OCR errors on ranking and feedback using the vector space model. Information Processing and Management, 32(3):317-327, 1996.
    • (1996) Information Processing and Management , vol.32 , Issue.3 , pp. 317-327
    • Taghva, K.1    Borsack, J.2    Condit, A.3


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.