Volume 12, Issue SUPPL. 3, 2011, Pages

Building a biomedical tokenizer using the token lattice design pattern and the adapted Viterbi algorithm

Author keywords

[No Author keywords available]

Indexed keywords

BIOMEDICAL DOMAIN; CLASSIFICATION APPROACH; CLASSIFICATION TASKS; DESIGN PATTERNS; LANGUAGE PROCESSING; MACHINE LEARNING APPROACHES; REGULAR EXPRESSIONS; RULE-BASED TECHNIQUES;

EID: 79958107612     PISSN: None     EISSN: 14712105     Source Type: Journal    
DOI: 10.1186/1471-2105-12-S3-S1     Document Type: Article
Times cited: (17)

References (28)
  • 1
    • 0003847769
    • Speech and Language Processing
    • Prentice Hall
    • Jurafsky D, Martin JH. Speech and Language Processing. 2009, Prentice Hall.
    • (2009)
    • Jurafsky, D.1    Martin, J.H.2
  • 2
    • 0347109463
    • Tokenization as the initial phase in NLP
    • Morristown, NJ, USA: Association for Computational Linguistics
    • Webster JJ, Kit C. Tokenization as the initial phase in NLP. Proceedings of the 14th conference on Computational linguistics 1992, 1106-1110. Morristown, NJ, USA: Association for Computational Linguistics.
    • (1992) Proceedings of the 14th conference on Computational linguistics , pp. 1106-1110
    • Webster, J.J.1    Kit, C.2
  • 4
    • 34848845892
    • An empirical study of tokenization strategies for biomedical information retrieval
    • Jiang J, Zhai C. An empirical study of tokenization strategies for biomedical information retrieval. Inf. Retr. 2007, 10(4-5):341-363.
    • (2007) Inf. Retr. , vol.10 , Issue.4-5 , pp. 341-363
    • Jiang, J.1    Zhai, C.2
  • 6
    • 78651527746
    • A preliminary look into the use of named entity information for bioscience text tokenization
    • HLT-NAACL '04, Morristown, NJ, USA: Association for Computational Linguistics
    • Arens R. A preliminary look into the use of named entity information for bioscience text tokenization. Proceedings of the Student Research Workshop at HLT-NAACL 2004 on XX 2004, 37-42. HLT-NAACL '04, Morristown, NJ, USA: Association for Computational Linguistics. http://portal.acm.org/citation.cfm?id=1614038.1614045
    • (2004) Proceedings of the Student Research Workshop at HLT-NAACL 2004 on XX , pp. 37-42
    • Arens, R.1
  • 7
    • 78651413290
    • A Comparison of 13 Tokenizers on MEDLINE
    • The Lister Hill National Center for Biomedical Communications
    • He Y, Kayaalp M. A Comparison of 13 Tokenizers on MEDLINE. Tech. Rep. LHNCBC-TR-2006-003 2006, The Lister Hill National Center for Biomedical Communications.
    • (2006) Tech. Rep. LHNCBC-TR-2006-003
    • He, Y.1    Kayaalp, M.2
  • 11
    • 36148995526
    • Text preparation through extended tokenization
    • WIT Press/Computational Mechanics Publications, Zanasi, A and Brebbia, CA and Ebecken, NFF
    • Hassler M, Fliedl G. Text preparation through extended tokenization. Data Mining VII: Data, Text and Web Mining and Their Business Applications 2006, 37:13-21. WIT Press/Computational Mechanics Publications, Zanasi, A and Brebbia, CA and Ebecken, NFF.
    • (2006) Data Mining VII: Data, Text and Web Mining and Their Business Applications , vol.37 , pp. 13-21
    • Hassler, M.1    Fliedl, G.2
  • 13
    • 56149091247
    • An unsupervised machine learning approach to segmentation of clinician-entered free text
    • 2655800, 18693949
    • Wrenn JO, Stetson PD, Johnson SB. An unsupervised machine learning approach to segmentation of clinician-entered free text. AMIA Annu Symp Proc 2007, 811-5. 2655800, 18693949.
    • (2007) AMIA Annu Symp Proc , pp. 811-815
    • Wrenn, J.O.1    Stetson, P.D.2    Johnson, S.B.3
  • 15
    • 50049112346
    • Pattern Oriented Software Architecture: On Patterns and Pattern Languages
    • John Wiley & Sons
    • Buschmann F, Henney K, Schmidt DC. Pattern Oriented Software Architecture: On Patterns and Pattern Languages. 2007, John Wiley & Sons.
    • (2007)
    • Buschmann, F.1    Henney, K.2    Schmidt, D.C.3
  • 16
    • 33845487544
    • Unsupervised multilingual sentence boundary detection
    • Kiss T, Strunk J. Unsupervised multilingual sentence boundary detection. Computational Linguistics 2006, 32(4):485-525.
    • (2006) Computational Linguistics , vol.32 , Issue.4 , pp. 485-525
    • Kiss, T.1    Strunk, J.2
  • 17
    • 34249852033
    • Building a large annotated corpus of English: the Penn Treebank
    • Marcus MP, Marcinkiewicz MA, Santorini B. Building a large annotated corpus of English: the Penn Treebank. Comput. Linguist. 1993, 19(2):313-330.
    • (1993) Comput. Linguist. , vol.19 , Issue.2 , pp. 313-330
    • Marcus, M.P.1    Marcinkiewicz, M.A.2    Santorini, B.3
  • 19
    • 0004232136
    • Introduction to Lattices and Order
    • Cambridge University Press, 2nd ed.
    • Davey BA, Priestley HA. Introduction to Lattices and Order. 2002, Cambridge University Press, 2nd ed.
    • (2002)
    • Davey, B.A.1    Priestley, H.A.2
  • 20
    • 0345877442
    • Critical tokenization and its properties
    • Guo J. Critical tokenization and its properties. Comput. Linguist. 1997, 23(4):569-596.
    • (1997) Comput. Linguist. , vol.23 , Issue.4 , pp. 569-596
    • Guo, J.1
  • 22
    • 79952365971
    • SNOMED Clinical Terms - User Guide
    • The International Health Terminology Standards Development Organisation
    • The International Health Terminology Standards Development Organisation SNOMED Clinical Terms - User Guide. 2009, The International Health Terminology Standards Development Organisation.
    • (2009)
  • 24
    • 84888282864
    • 2011, http://lexsrv3.nlm.nih.gov/LexSysGroup/Projects/textTools/current/index.html
    • (2011)
  • 25
    • 5044241106
    • MedPost: a part-of-speech tagger for bioMedical text
    • 10.1093/bioinformatics/bth227, 15073016
    • Smith L, Rindflesch T, Wilbur WJ. MedPost: a part-of-speech tagger for bioMedical text. Bioinformatics 2004, 20(14):2320-1. 10.1093/bioinformatics/bth227, 15073016.
    • (2004) Bioinformatics , vol.20 , Issue.14 , pp. 2320-2321
    • Smith, L.1    Rindflesch, T.2    Wilbur, W.J.3
  • 27
    • 0003993831
    • Statistics for the Engineering and Computer Sciences
    • Dellen Publishing Company
    • Mendenhall W, Sincich T. Statistics for the Engineering and Computer Sciences. 1984, Dellen Publishing Company.
    • (1984)
    • Mendenhall, W.1    Sincich, T.2
  • 28
    • 0015482049
    • On the criteria to be used in decomposing systems into modules
    • Parnas DL. On the criteria to be used in decomposing systems into modules. Commun. ACM 1972, 15(12):1053-1058.
    • (1972) Commun. ACM , vol.15 , Issue.12 , pp. 1053-1058
    • Parnas, D.L.1


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.