Volume 52, Issue 1, 2012, Pages 51-62

Improved chemical text mining of patents with infinite dictionaries and automatic spelling correction

Author keywords

[No Author keywords available]

Indexed keywords

NATURAL LANGUAGE PROCESSING SYSTEMS; ONTOLOGY; PATENTS AND INVENTIONS

EID: 84858040414     PISSN: 15499596     EISSN: 1549960X     Source Type: Journal    
DOI: 10.1021/ci200463r     Document Type: Article
Times cited: 19

References (56)
  • 1
    • 84858062048
    • accessed November 11
    • CAS Registry Numbers; http://www.cas.org/index.html (accessed November 11, 2011).
    • (2011) CAS Registry Numbers
  • 2
    • 77955164979
    • accessed November 11
    • Thomson Reuters; http://thomsonreuters.com/products-services/science/science-products/scientific-research/drug-discovery-development/chemistry-research/ (accessed November 11, 2011).
    • (2011) Thomson Reuters
  • 4
    • 78049444157
    • The cinderella of biological data integration: Addressing some of the challenges of entity and relationship mining from patent sources
    • Lambrix, P., Kemp, G., Eds.; Springer: Berlin, Heidelberg, Germany
    • Suriyawongkul, I.; Southan, C.; Muresan, S. The Cinderella of Biological Data Integration: Addressing Some of the Challenges of Entity and Relationship Mining from Patent Sources. In Data Integration in the Life Sciences; Lambrix, P., Kemp, G., Eds.; Springer: Berlin, Heidelberg, Germany, 2010; pp 106-121.
    • (2010) Data Integration in the Life Sciences , pp. 106-121
    • Suriyawongkul, I.1    Southan, C.2    Muresan, S.3
  • 7
    • 0003955767
    • International Union of Pure and Applied Chemists (IUPAC); Blackwell Scientific Publications: London, U.K
    • International Union of Pure and Applied Chemists (IUPAC). A Guide to IUPAC Nomenclature of Organic Compounds, Recommendations 1993; Blackwell Scientific Publications: London, U.K., 1993.
    • (1993) A Guide to IUPAC Nomenclature of Organic Compounds, Recommendations 1993
  • 8
    • 84858043570
    • Naming and Indexing of Chemical Substances for Chemical Abstracts. (accessed November 11
    • Naming and Indexing of Chemical Substances for Chemical Abstracts. Appendix IV of CA Index Guide, Chemical Abstracts Service (CAS); http://www.cas.org/ASSETS/58D34DD3892142D18F5C3B0A004D3A0C/indexguideapp.pdf (accessed November 11, 2011).
    • (2011) Appendix IV of CA Index Guide, Chemical Abstracts Service (CAS)
  • 11
    • 0023965741
    • SMILES, a chemical language and information system. 1. Introduction to methodology and encoding rules
    • Weininger, D. SMILES, a chemical language and information system. 1. Introduction to methodology and encoding rules. J. Chem. Inf. Comput. Sci. 1988, 28, 31-36. (Pubitemid 18574254)
    • (1988) Journal of Chemical Information and Computer Sciences , vol.28 , Issue.1 , pp. 31-36
    • Weininger, D.1
  • 19
    • 11344294871
    • Extraction of information from the text of chemical patents. 1. Identification of specific chemical names
    • Kemp, N.; Lynch, M. Extraction of Information from the Text of Chemical Patents. 1. Identification of Specific Chemical Names. J. Chem. Inf. Comput. Sci. 1998, 38, 544-551. (Pubitemid 128571182)
    • (1998) Journal of Chemical Information and Computer Sciences , vol.38 , Issue.4 , pp. 544-551
    • Kemp, N.1    Lynch, M.2
  • 21
    • 84858028925
    • version 2.0.2; OpenEye Scientific Software: Santa Fe, NM; (accessed November 11
    • Lexichem, version 2.0.2; OpenEye Scientific Software: Santa Fe, NM; http://www.eyesopen.com/lexichem-tk (accessed November 11, 2011).
    • (2011) Lexichem
  • 22
    • 79953164876
    • CambridgeSoft, Cambridge MA (accessed November 11
    • Struct=Name; CambridgeSoft, Cambridge, MA; http://www.cambridgesoft.com/software/details/?ds=5 (accessed November 11, 2011).
    • (2011) Struct=Name
  • 23
    • 84858047529
    • ACD/Labs: Toronto, Canada; (accessed November 11
    • ACD/Name; ACD/Labs: Toronto, Canada; http://www.acdlabs.com/products/draw-nom/nom/name/ (accessed November 11, 2011).
    • (2011) ACD/Name
  • 24
    • 84858043573
    • ChemAxon: Budapest, Hungary; (accessed November 11
    • Name <> Structure; ChemAxon: Budapest, Hungary; http://www.chemaxon.com/products/name-to-structure/ (accessed November 11, 2011).
    • (2011) Name <> Structure
  • 25
    • 84858062696
    • Bio-Rad Laboratories: Irvine CA; (accessed November 11
    • IUPAC NameIt and DrawIt; Bio-Rad Laboratories: Irvine, CA; http://www3.bio-rad.com/pages/SAD/docs/95925-IUPAC-DS.pdf#zoom=75% (accessed November 11, 2011).
    • (2011) IUPAC NameIt and DrawIt
  • 26
    • 84858026169
    • ChemInnovation Software Inc.: San Diego, CA; (accessed November 11
    • NameExpert; ChemInnovation Software Inc.: San Diego, CA; http://www.cheminnovation.com/products/nameexpert.asp (accessed November 11, 2011).
    • (2011) NameExpert
  • 27
    • 84858043571
    • InfoChem: Munich Germany; (accessed November 11
    • ICN2S; InfoChem: Munich, Germany; http://infochem.de/mining/icn2s.shtml (accessed November 11, 2011).
    • (2011) ICN2S
  • 28
    • 84858009477
    • NCI: Bethesda, MD; (accessed November 11
    • NCI Open Database Compounds; NCI: Bethesda, MD; http://cactus.nci.nih.gov/download/nci/ (accessed November 11, 2011).
    • (2011) NCI Open Database Compounds
  • 29
    • 84858062695
    • Key Organics: London U.K.; (accessed November 11
    • KeyOrganics; Key Organics: London, U.K.; http://www.keyorganics.co.uk/Downloads (accessed November 11, 2011).
    • (2011) KeyOrganics
  • 30
    • 84858009530
    • Thermo Fisher Scientific: Waltham, MA; (accessed November 11
    • Maybridge; Thermo Fisher Scientific: Waltham, MA; http://www.maybridge.com (accessed November 11, 2011).
    • (2011) Maybridge
  • 31
    • 0026979939
    • Techniques for automatically correcting words in text
    • DOI 10.1145/146370.146380
    • Kukich, K. Techniques for Automatically Correcting Words in Text. ACM Comput. Surv. 1992, 24, 377-439. (Pubitemid 23687641)
    • (1992) ACM Computing Surveys , vol.24 , Issue.4 , pp. 377-439
    • Kukich, K.1
  • 34
    • 84976776121
    • Spelling correction in scientific and scholarly text
    • Pollock, J. J.; Zamora, A. Spelling correction in scientific and scholarly text. Commun. ACM 1984, 27, 358-368.
    • (1984) Commun. ACM , vol.27 , pp. 358-368
    • Pollock, J.J.1    Zamora, A.2
  • 35
    • 0042550988
    • Computer translation of IUPAC systematic organic chemical nomenclature. 6. (Semi)automatic name correction
    • Kirby, G. H.; Lord, M. R.; Rayner, J. D. Computer translation of IUPAC systematic organic chemical nomenclature. 6. (Semi)automatic name correction. J. Chem. Inf. Comput. Sci. 1991, 31, 153-160.
    • (1991) J. Chem. Inf. Comput. Sci. , vol.31 , pp. 153-160
    • Kirby, G.H.1    Lord, M.R.2    Rayner, J.D.3
  • 39
    • 84945709825
    • Trie memory
    • Fredkin, E. Trie memory. Commun. ACM 1960, 3, 490-499.
    • (1960) Commun. ACM , vol.3 , pp. 490-499
    • Fredkin, E.1
  • 40
    • 84974678647
    • A taxonomy of algorithms for constructing minimal acyclic deterministic finite automata
    • Automata Implementation
    • Watson, B. A Taxonomy of Algorithms for Constructing Minimal Acyclic Deterministic Finite Automata. In Automata Implementation; Boldt, O., Jürgensen, H., Eds.; Springer: Berlin, Heidelberg, Germany, 2001; pp 174-182. (Pubitemid 33359655)
    • (2001) Lecture Notes in Computer Science , Issue.2214 , pp. 174-182
    • Watson, B.W.1
  • 43
    • 0004449398
    • Three models for the description of language
    • Chomsky, N. Three models for the description of language. IRE Trans. Inf. Theory 1956, 2, 113-124.
    • (1956) IRE Trans. Inf. Theory , vol.2 , pp. 113-124
    • Chomsky, N.1
  • 44
    • 0002692959
    • Error-tolerant finite-state recognition with applications to morphological analysis and spelling correction
    • Oflazer, K. Error-tolerant finite-state recognition with applications to morphological analysis and spelling correction. Comput. Linguist. 1996, 22, 73-89.
    • (1996) Comput. Linguist. , vol.22 , pp. 73-89
    • Oflazer, K.1
  • 45
    • 84941869105
    • A technique for computer detection and correction of spelling errors
    • Damerau, F. J. A technique for computer detection and correction of spelling errors. Commun. ACM 1964, 7, 171-176.
    • (1964) Commun. ACM , vol.7 , pp. 171-176
    • Damerau, F.J.1
  • 46
    • 84899829959
    • A formal basis for the heuristic determination of minimum cost paths
    • Hart, P. E.; Nilsson, N. J.; Raphael, B. A Formal Basis for the Heuristic Determination of Minimum Cost Paths. IEEE Trans. Syst. Sci. Cybern. 1968, 4, 100-107.
    • (1968) IEEE Trans. Syst. Sci. Cybern. , vol.4 , pp. 100-107
    • Hart, P.E.1    Nilsson, N.J.2    Raphael, B.3
  • 47
    • 0345566149
    • A guided tour to approximate string matching
    • Navarro, G. A guided tour to approximate string matching. ACM Computing Surveys (CSUR) 2001, 33, 31-88. (Pubitemid 33768480)
    • (2001) ACM Computing Surveys , vol.33 , Issue.1 , pp. 31-88
    • Navarro, G.1
  • 48
    • 0026458378
    • Amino acid substitution matrices from protein blocks
    • Henikoff, S.; Henikoff, J. G. Amino acid substitution matrices from protein blocks. Proc. Natl. Acad. Sci. U.S.A. 1992, 89, 10915-10919.
    • (1992) Proc. Natl. Acad. Sci. U.S.A. , vol.89 , pp. 10915-10919
    • Henikoff, S.1    Henikoff, J.G.2
  • 49
    • 84858041591
    • accessed November 11
    • International Nonproprietary Names (INN); http://www.who.int/medicines/services/inn/en/ (accessed November 11, 2011).
    • (2011) International Nonproprietary Names (INN)
  • 50
    • 84858023424
    • (accessed November 11
    • United States Adopted Names; http://www.ama-assn.org/ama/pub/physician-resources/medical-science/united-states-adoptednames-council.page (accessed November 11, 2011).
    • (2011) United States Adopted Names
  • 51
    • 84864127959
    • accessed November 11
    • British Approved Names; http://www.pharmacopoeia.gov.uk/publications/british-approved-names.php (accessed November 11, 2011).
    • (2011) British Approved Names
  • 52
    • 65249104941
    • Foreign language translation of chemical nomenclature by computer
    • Sayle, R. Foreign Language Translation of Chemical Nomenclature by Computer. J. Chem. Inf. Model. 2009, 49, 519-530.
    • (2009) J. Chem. Inf. Model. , vol.49 , pp. 519-530
    • Sayle, R.1
  • 54
    • 84858062700
    • ChemAxon's European User Group Meeting, May 17-18, Budapest, Hungary, accessed November 11, 2011
    • Muresan, S. Automated spelling correction to improve recall rates of name-to-structure tools for chemical text mining, ChemAxon's European User Group Meeting, May 17-18, 2011, Budapest, Hungary; http://www.chemaxon.com/library/automated-spellingcorrection-to-improve-recall-rates-of-name-to-structure-tools-forchemical-text-mining/ (accessed November 11, 2011).
    • (2011) Automated Spelling Correction to Improve Recall Rates of Name-to-Structure Tools for Chemical Text Mining
    • Muresan, S.1
  • 55
    • 39449105778
    • Predicting key example compounds in competitors' patent applications using structural information alone
    • DOI 10.1021/ci7002686
    • Hattori, K.; Wakabayashi, H.; Tamaki, K. Predicting key example compounds in competitors' patent applications using structural information alone. J. Chem. Inf. Model. 2008, 48, 135-142. (Pubitemid 351271058)
    • (2008) Journal of Chemical Information and Modeling , vol.48 , Issue.1 , pp. 135-142
    • Hattori, K.1    Wakabayashi, H.2    Tamaki, K.3


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.