2011, Pages 437-446

Highly efficient algorithms for structural clustering of large websites

Author keywords

Information extraction; Minimum description length; Structural clustering

Indexed keywords

INFORMATION EXTRACTION; MINIMUM DESCRIPTION LENGTH; SCALABLE ALGORITHMS; STRUCTURAL CLUSTERING

EID: 84861065882     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1145/1963405.1963468     Document Type: Conference Paper
Times cited: 45

References (26)
  • 1
    • 36849071950
    • Xproj: A framework for projected structural clustering of XML documents
    • C. C. Aggarwal, N. Ta, J. Wang, J. Feng, and M. Zaki. Xproj: a framework for projected structural clustering of XML documents. In KDD, pages 46-55, 2007.
    • (2007) KDD , pp. 46-55
    • Aggarwal, C.C.1    Ta, N.2    Wang, J.3    Feng, J.4    Zaki, M.5
  • 2
    • 70350652136
    • XPath-wrapper induction by generating tree traversal patterns
    • T. Anton. XPath-wrapper induction by generating tree traversal patterns. In LWA, pages 126-133, 2005.
    • (2005) LWA , pp. 126-133
    • Anton, T.1
  • 3
    • 84944318551
    • Visual web information extraction with Lixto
    • R. Baumgartner, S. Flesca, and G. Gottlob. Visual web information extraction with Lixto. In VLDB, pages 119-128, 2001.
    • (2001) VLDB , pp. 119-128
    • Baumgartner, R.1    Flesca, S.2    Gottlob, G.3
  • 4
    • 84934335743
    • Inapproximability results for bounded variants of optimization problems
    • M. Chlebík and J. Chlebíková. Inapproximability results for bounded variants of optimization problems. Fundamentals of Computation Theory, 2751:123-145, 2003.
    • (2003) Fundamentals of Computation Theory , vol.2751 , pp. 123-145
    • Chlebík, M.1    Chlebíková, J.2
  • 5
    • 35048867075
    • A tree-based approach to clustering XML documents by structure
    • G. Costa, G. Manco, R. Ortale, and A. Tagarelli. A tree-based approach to clustering XML documents by structure. In PKDD, pages 137-148, 2004.
    • (2004) PKDD , pp. 137-148
    • Costa, G.1    Manco, G.2    Ortale, R.3    Tagarelli, A.4
  • 6
    • 84944327150
    • Roadrunner: Towards automatic data extraction from large web sites
    • V. Crescenzi, G. Mecca, and P. Merialdo. Roadrunner: Towards automatic data extraction from large web sites. In VLDB, pages 109-118, 2001.
    • (2001) VLDB , pp. 109-118
    • Crescenzi, V.1    Mecca, G.2    Merialdo, P.3
  • 9
    • 29144484106
    • A methodology for clustering XML documents by structure
    • T. Dalamagas, T. Cheng, K.-J. Winkel, and T. Sellis. A methodology for clustering XML documents by structure. Inf. Syst., 31(3):187-228, 2006.
    • (2006) Inf. Syst. , vol.31 , Issue.3 , pp. 187-228
    • Dalamagas, T.1    Cheng, T.2    Winkel, K.-J.3    Sellis, T.4
  • 10
    • 70849104261
    • Robust web extraction: An approach based on a probabilistic tree-edit model
    • N. Dalvi, P. Bohannon, and F. Sha. Robust web extraction: An approach based on a probabilistic tree-edit model. In SIGMOD, pages 335-348, 2009.
    • (2009) SIGMOD , pp. 335-348
    • Dalvi, N.1    Bohannon, P.2    Sha, F.3
  • 12
    • 77954301186
    • Harvesting relational tables from lists on the web
    • H. Elmeleegy, J. Madhavan, and A. Y. Halevy. Harvesting relational tables from lists on the web. PVLDB, 2(1):1078-1089, 2009.
    • (2009) PVLDB , vol.2 , Issue.1 , pp. 1078-1089
    • Elmeleegy, H.1    Madhavan, J.2    Halevy, A.Y.3
  • 13
    • 41849126735
    • Clustering template based web documents
    • T. Gottron. Clustering template based web documents. In ECIR, pages 40-51, 2008.
    • (2008) ECIR , pp. 40-51
    • Gottron, T.1
  • 15
    • 84055203904
    • Exploiting content redundancy for web information extraction
    • P. Gulhane, R. Rastogi, S. Sengamedu, and A. Tengli. Exploiting content redundancy for web information extraction. In VLDB, 2010.
    • (2010) VLDB
    • Gulhane, P.1    Rastogi, R.2    Sengamedu, S.3    Tengli, A.4
  • 16
    • 79952384867
    • Answering table augmentation queries from unstructured lists on the web
    • R. Gupta and S. Sarawagi. Answering table augmentation queries from unstructured lists on the web. In VLDB, 2009.
    • (2009) VLDB
    • Gupta, R.1    Sarawagi, S.2
  • 17
    • 0002985122
    • Wrapping web data into XML
    • W. Han, D. Buttler, and C. Pu. Wrapping web data into XML. SIGMOD Record, 30(3):33-38, 2001.
    • (2001) SIGMOD Record , vol.30 , Issue.3 , pp. 33-38
    • Han, W.1    Buttler, D.2    Pu, C.3
  • 18
    • 0032309862
    • Generating finite-state transducers for semi-structured data extraction from the web
    • C.-N. Hsu and M.-T. Dung. Generating finite-state transducers for semi-structured data extraction from the web. Information Systems, 23(8):521-538, 1998.
    • (1998) Information Systems , vol.23 , Issue.8 , pp. 521-538
    • Hsu, C.-N.1    Dung, M.-T.2
  • 20
    • 0001776223
    • Wrapper induction for information extraction
    • N. Kushmerick, D. S. Weld, and R. B. Doorenbos. Wrapper induction for information extraction. In IJCAI, pages 729-737, 1997.
    • (1997) IJCAI , pp. 729-737
    • Kushmerick, N.1    Weld, D.S.2    Doorenbos, R.B.3
  • 21
    • 0037481024
    • Xclust: Clustering XML schemas for effective integration
    • M. L. Lee, L. H. Yang, W. Hsu, and X. Yang. Xclust: clustering XML schemas for effective integration. In CIKM, pages 292-299, 2002.
    • (2002) CIKM , pp. 292-299
    • Lee, M.L.1    Yang, L.H.2    Hsu, W.3    Yang, X.4
  • 22
  • 26
    • 0002763572
    • Building light-weight wrappers for legacy web data-sources using W4F
    • A. Sahuguet and F. Azavant. Building light-weight wrappers for legacy web data-sources using W4F. In VLDB, pages 738-741, 1999.
    • (1999) VLDB , pp. 738-741
    • Sahuguet, A.1    Azavant, F.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.