2008, Pages 66-71

Bridging the gap: From multi document template detection to single document content extraction

Author keywords

Content extraction; Template clustering; Template detection; Web mining

Indexed keywords

MULTIMEDIA SYSTEMS; SIGNAL DETECTION; VISUAL COMMUNICATION; WEBSITES

EID: 62649093456     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: None     Document Type: Conference Paper
Times cited: 8

References (15)
  • 1
    • 77953052174
    • Template detection via data mining and its applications
    • New York, NY, USA, ACM Press
    • Z. Bar-Yossef and S. Rajagopalan. Template detection via data mining and its applications. In Proc. 11th Int. Conf. on WWW, pages 580-591, New York, NY, USA, 2002. ACM Press.
    • (2002) Proc. 11th Int. Conf. on WWW, pp. 580-591
    • Bar-Yossef, Z.; Rajagopalan, S.
  • 2
    • 10944246083
    • On the complexity of schema inference from web pages in the presence of nullable data attributes
    • New York, NY, USA, ACM Press
    • G. Yang, I. V. Ramakrishnan, and M. Kifer. On the complexity of schema inference from web pages in the presence of nullable data attributes. In Proc. 12th Int. Conf. on Information and Knowledge Management, pages 224-231, New York, NY, USA, 2003. ACM Press.
    • (2003) Proc. 12th Int. Conf. on Information and Knowledge Management, pp. 224-231
    • Yang, G.; Ramakrishnan, I.V.; Kifer, M.
  • 4
    • 26844469211
    • Automatic extraction of informative blocks from webpages
    • New York, NY, USA, ACM Press
    • S. Debnath, P. Mitra, and C. L. Giles. Automatic extraction of informative blocks from webpages. In Proc. 2005 ACM Symp. on Applied Computing, pages 1722-1726, New York, NY, USA, 2005. ACM Press.
    • (2005) Proc. 2005 ACM Symp. on Applied Computing, pp. 1722-1726
    • Debnath, S.; Mitra, P.; Giles, C.L.
  • 6
    • 4644340823
    • Automatic web news extraction using tree edit distance
    • New York, NY, USA, ACM Press
    • D. C. Reis, P. B. Golgher, A. S. Silva, and A. F. Laender. Automatic web news extraction using tree edit distance. In Proc. 13th Int. Conf. on WWW, pages 502-511, New York, NY, USA, 2004. ACM Press.
    • (2004) Proc. 13th Int. Conf. on WWW, pp. 502-511
    • Reis, D.C.; Golgher, P.B.; Silva, A.S.; Laender, A.F.
  • 7
    • 35348883378
    • Page-level template detection via isotonic smoothing
    • New York, NY, USA, ACM Press
    • D. Chakrabarti, R. Kumar, and K. Punera. Page-level template detection via isotonic smoothing. In Proc. 16th Int. Conf. on WWW, pages 61-70, New York, NY, USA, 2007. ACM Press.
    • (2007) Proc. 16th Int. Conf. on WWW, pp. 61-70
    • Chakrabarti, D.; Kumar, R.; Punera, K.
  • 9
    • 84880498138
    • DOM-based content extraction of HTML documents
    • New York, NY, USA, ACM Press
    • S. Gupta, G. Kaiser, D. Neistadt, and P. Grimm. DOM-based content extraction of HTML documents. In Proc. 12th Int. Conf. on WWW, pages 207-214, New York, NY, USA, 2003. ACM Press.
    • (2003) Proc. 12th Int. Conf. on WWW, pp. 207-214
    • Gupta, S.; Kaiser, G.; Neistadt, D.; Grimm, P.
  • 10
    • 26944496810
    • Identifying content blocks from web documents
    • Foundations of Intelligent Systems
    • S. Debnath, P. Mitra, and C. L. Giles. Identifying content blocks from web documents. In Foundations of Intelligent Systems, LNCS, pages 285-293, 2005.
    • (2005) LNCS, pp. 285-293
    • Debnath, S.; Mitra, P.; Giles, C.L.
  • 14
    • 12744279236
    • A short survey of document structure similarity algorithms
    • CSREA Press
    • D. Buttler. A short survey of document structure similarity algorithms. In Proc. Int. Conf. on Internet Computing, pages 3-9. CSREA Press, 2004.
    • (2004) Proc. Int. Conf. on Internet Computing, pp. 3-9
    • Buttler, D.


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.