Volume 3, Issue 1, 2010, Pages 578-587

Exploiting content redundancy for web information extraction

Author keywords

[No Author keywords available]

Indexed keywords

REDUNDANCY; WEB CRAWLER;

EID: 84055203904     PISSN: None     EISSN: 21508097     Source Type: Conference Proceeding    
DOI: 10.14778/1920841.1920915     Document Type: Article
Times cited: 25

References (25)
  • 1
    • 12244298488
    • Mining reference tables for automatic text segmentation
    • E. Agichtein and V. Ganti. Mining reference tables for automatic text segmentation. In SIGKDD, 2004.
    • (2004) SIGKDD
    • Agichtein, E.1    Ganti, V.2
  • 2
    • 48349101047
    • Snowball: extracting relations from large plain-text collections
    • E. Agichtein and L. Gravano. Snowball: extracting relations from large plain-text collections. In ACM DL, 2000.
    • (2000) ACM DL
    • Agichtein, E.1    Gravano, L.2
  • 3
    • 9444260615
    • Fast algorithms for mining association rules
    • R. Agrawal and R. Srikant. Fast algorithms for mining association rules. In SIGMOD, 1994.
    • (1994) SIGMOD
    • Agrawal, R.1    Srikant, R.2
  • 4
    • 77952372966
    • Adaptive duplicate detection using learnable string similarity measures
    • M. Bilenko and R. Mooney. Adaptive duplicate detection using learnable string similarity measures. In SIGKDD, 2003.
    • (2003) SIGKDD
    • Bilenko, M.1    Mooney, R.2
  • 5
    • 0034832365
    • Automatic segmentation of text into structured records
    • V. Borkar, K. Deshmukh, and S. Sarawagi. Automatic segmentation of text into structured records. In SIGMOD, 2001.
    • (2001) SIGMOD
    • Borkar, V.1    Deshmukh, K.2    Sarawagi, S.3
  • 6
    • 0003146263
    • Extracting Patterns and Relations from the World Wide Web
    • S. Brin. Extracting Patterns and Relations from the World Wide Web. In WebDB, 1998.
    • (1998) WebDB
    • Brin, S.1
  • 7
    • 33749597967
    • A primitive operator for similarity joins in data cleaning
    • S. Chaudhuri, V. Ganti, and R. Kaushik. A primitive operator for similarity joins in data cleaning. In ICDE, 2006.
    • (2006) ICDE
    • Chaudhuri, S.1    Ganti, V.2    Kaushik, R.3
  • 8
    • 0032091575
    • Integration of heterogeneous databases without common domains using queries based on textual similarity
    • W. Cohen. Integration of heterogeneous databases without common domains using queries based on textual similarity. In SIGMOD, 1998.
    • (1998) SIGMOD
    • Cohen, W.1
  • 9
    • 84944327150
    • Roadrunner: Towards automatic data extraction from large web sites
    • V. Crescenzi, G. Mecca, and P. Merialdo. Roadrunner: Towards automatic data extraction from large web sites. In VLDB, 2001.
    • (2001) VLDB
    • Crescenzi, V.1    Mecca, G.2    Merialdo, P.3
  • 11
    • 84865659127
    • Extracting data records from the web using tag path clustering
    • G. Miao et al. Extracting data records from the web using tag path clustering. In WWW, 2009.
    • (2009) WWW
    • Miao, G.1
  • 12
    • 77953053369
    • The volume and evolution of web page templates
    • D. Gibson, K. Punera, and A. Tomkins. The volume and evolution of web page templates. In WWW, 2005.
    • (2005) WWW
    • Gibson, D.1    Punera, K.2    Tomkins, A.3
  • 13
    • 72049101978
    • Incorporating site-level knowledge to extract structured data from web forums
    • J. Yang et al. Incorporating site-level knowledge to extract structured data from web forums. In WWW, 2009.
    • (2009) WWW
    • Yang, J.1
  • 14
    • 33749623896
    • Simultaneous record detection and attribute labeling in web data extraction
    • J. Zhu et al. Simultaneous record detection and attribute labeling in web data extraction. In SIGKDD, 2006.
    • (2006) SIGKDD
    • Zhu, J.1
  • 16
    • 0001776223
    • Wrapper induction for information extraction
    • N. Kushmerick, D. S. Weld, and R. Doorenbos. Wrapper induction for information extraction. In IJCAI, 1997.
    • (1997) IJCAI
    • Kushmerick, N.1    Weld, D.S.2    Doorenbos, R.3
  • 17
    • 84880467474
    • Text joins in an RDBMS for web data integration
    • L. Gravano et al. Text joins in an RDBMS for web data integration. In WWW, 2003.
    • (2003) WWW
    • Gravano, L.1
  • 18
    • 0031187745
    • Block edit models for approximate string matching
    • D. Lopresti and A. Tomkins. Block edit models for approximate string matching. Theoretical Computer Science, 181(1), 1997.
    • (1997) Theoretical Computer Science , vol.181 , Issue.1
    • Lopresti, D.1    Tomkins, A.2
  • 22
    • 1142279457
    • Robust and efficient fuzzy match for online data cleaning
    • S. Chaudhuri et al. Robust and efficient fuzzy match for online data cleaning. In SIGMOD, 2003.
    • (2003) SIGMOD
    • Chaudhuri, S.1
  • 23
    • 0242456811
    • Interactive deduplication using active learning
    • S. Sarawagi and A. Bhamidipaty. Interactive deduplication using active learning. In SIGKDD, 2002.
    • (2002) SIGKDD
    • Sarawagi, S.1    Bhamidipaty, A.2
  • 25
    • 33744821948
    • Web data extraction based on partial tree alignment
    • Y. Zhai and B. Liu. Web data extraction based on partial tree alignment. In WWW, 2005.
    • (2005) WWW
    • Zhai, Y.1    Liu, B.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.