Volume , Issue , 2009, Pages 47-55

Efficient record-level wrapper induction

Author keywords

Broom representation; Information extraction; Web record; Wrapper

Indexed keywords

DIFFERENT DOMAINS; EXTRACTION ACCURACY; INFORMATION EXTRACTION; INFORMATION NEED; ONLINE INFORMATION SYSTEMS; PERSONAL PROFILE; SEMANTIC STRUCTURES; SOCIAL UTILITY; WEB INFORMATION; WEB RECORD; WEB-PAGE; WRAPPER INDUCTION; WRAPPER SYSTEM

EID: 74549208580     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1145/1645953.1645962     Document Type: Conference Paper
Times cited: 39

References (23)
  • 1
    • 74549119410
    • http://www.w3.org/dom/.
  • 2
    • 1142303684
    • Extracting structured data from web pages
    • A. Arasu and H. Garcia-Molina. Extracting structured data from web pages. In SIGMOD, pages 337-348.
    • SIGMOD , pp. 337-348
    • Arasu, A.1    Garcia-Molina, H.2
  • 3
    • 85042021254
    • Iepad: Information extraction based on pattern discovery
    • C.-H. Chang and S.-C. Lui. Iepad: information extraction based on pattern discovery. In WWW-2001, pages 681-688.
    • (2001) WWW , pp. 681-688
    • Chang, C.-H.1    Lui, S.-C.2
  • 5
    • 77953046656
    • A flexible learning system for wrapping tables and lists in html documents
    • W. W. Cohen, M. Hurst, and L. S. Jensen. A flexible learning system for wrapping tables and lists in html documents. In WWW-2002, pages 232-241.
    • (2002) WWW , pp. 232-241
    • Cohen, W.W.1    Hurst, M.2    Jensen, L.S.3
  • 6
    • 84944327150
    • Roadrunner: Towards automatic data extraction from large web sites
    • V. Crescenzi, G. Mecca, and P. Merialdo. Roadrunner: Towards automatic data extraction from large web sites. In VLDB-2001, pages 109-118.
    • (2001) VLDB , pp. 109-118
    • Crescenzi, V.1    Mecca, G.2    Merialdo, P.3
  • 8
    • 30544447615
    • Thresher: Automating the unwrapping of semantic content from the world wide web
    • A. Hogue and D. Karger. Thresher: automating the unwrapping of semantic content from the world wide web. In WWW-2005, pages 86-95.
    • (2005) WWW , pp. 86-95
    • Hogue, A.1    Karger, D.2
  • 9
    • 0032309862
    • Generating finite-state transducers for semi-structured data extraction from the web
    • C.-N. Hsu and M.-T. Dung. Generating finite-state transducers for semi-structured data extraction from the web. Information Systems, Special Issue on Semistructured Data, 23(8):521-538, 1998.
    • (1998) Information Systems, Special Issue on Semistructured Data , vol.23 , Issue.8 , pp. 521-538
    • Hsu, C.-N.1    Dung, M.-T.2
  • 10
    • 0001776223
    • Wrapper induction for information extraction
    • N. Kushmerick, D. S. Weld, and R. B. Doorenbos. Wrapper induction for information extraction. In IJCAI-1997, pages 729-737, 1997.
    • (1997) IJCAI-1997 , pp. 729-737
    • Kushmerick, N.1    Weld, D.S.2    Doorenbos, R.B.3
  • 13
    • 77952333945
    • Mining data records in web pages
    • B. Liu, R. Grossman, and Y. Zhai. Mining data records in web pages. In SIGKDD-2003, pages 601-606.
    • (2003) SIGKDD , pp. 601-606
    • Liu, B.1    Grossman, R.2    Zhai, Y.3
  • 14
    • 0033893885
    • Xwrap: An xml-enabled wrapper construction system for web information sources
    • L. Liu, C. Pu, and W. Han. Xwrap: an xml-enabled wrapper construction system for web information sources. In ICDE-2000, pages 611-621, 2000.
    • (2000) ICDE-2000 , pp. 611-621
    • Liu, L.1    Pu, C.2    Han, W.3
  • 16
    • 38549134414
    • Object-level vertical search
    • Z. Nie, J.-R. Wen, and W.-Y. Ma. Object-level vertical search. In CIDR-2007, pages 235-246.
    • (2007) CIDR , pp. 235-246
    • Nie, Z.1    Wen, J.-R.2    Ma, W.-Y.3
  • 17
    • 4644340823
    • Automatic web news extraction using tree edit distance
    • D. C. Reis, P. B. Golgher, A. S. Silva, and A. F. Laender. Automatic web news extraction using tree edit distance. In WWW-2004, pages 502-511.
    • (2004) WWW , pp. 502-511
    • Reis, D.C.1    Golgher, P.B.2    Silva, A.S.3    Laender, A.F.4
  • 18
    • 74549194383
    • Automation in information extraction and data integration (tutorial)
    • S. Sarawagi. Automation in information extraction and data integration (tutorial). In VLDB-2002.
    • VLDB-2002
    • Sarawagi, S.1
  • 19
    • 84880476173
    • Data extraction and label assignment for web databases
    • J. Wang and F. H. Lochovsky. Data extraction and label assignment for web databases. In WWW-2003, pages 187-196.
    • (2003) WWW , pp. 187-196
    • Wang, J.1    Lochovsky, F.H.2
  • 20
    • 3543147086
    • Recent trends in hierarchic document clustering: A critical review
    • P. Willett. Recent trends in hierarchic document clustering: a critical review. Information Processing and Management, 24(5):577-597, 1988.
    • (1988) Information Processing and Management , vol.24 , Issue.5 , pp. 577-597
    • Willett, P.1
  • 21
    • 33744511796
    • Fully automatic wrapper generation for search engines
    • H. Zhao, W. Meng, Z. Wu, V. Raghavan, and C. Yu. Fully automatic wrapper generation for search engines. In WWW-2005, pages 66-75.
    • (2005) WWW , pp. 66-75
    • Zhao, H.1    Meng, W.2    Wu, Z.3    Raghavan, V.4    Yu, C.5
  • 22
    • 65449126734
    • Pictor: An interactive system for importing data from a website
    • S. Zheng, M. R. Scott, R. Song, and J.-R. Wen. Pictor: An interactive system for importing data from a website. In SIGKDD-2008, 2008.
    • (2008) SIGKDD
    • Zheng, S.1    Scott, M.R.2    Song, R.3    Wen, J.-R.4
  • 23
    • 36849062139
    • Joint optimization of wrapper generation and template detection
    • S. Zheng, R. Song, D. Wu, and J.-R. Wen. Joint optimization of wrapper generation and template detection. In SIGKDD-2007.
    • SIGKDD-2007
    • Zheng, S.1    Song, R.2    Wu, D.3    Wen, J.-R.4


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.