[1] http://www.w3.org/dom/.
[2] A. Arasu and H. Garcia-Molina. Extracting structured data from web pages. In SIGMOD, pages 337-348.
[3] C.-H. Chang and S.-C. Lui. IEPAD: Information extraction based on pattern discovery. In WWW-2001, pages 681-688.
[5] W. W. Cohen, M. Hurst, and L. S. Jensen. A flexible learning system for wrapping tables and lists in HTML documents. In WWW-2002, pages 232-241.
[6] V. Crescenzi, G. Mecca, and P. Merialdo. RoadRunner: Towards automatic data extraction from large web sites. In VLDB-2001, pages 109-118.
[7] S. Flesca, G. Manco, E. Masciari, E. Rende, and A. Tagarelli. Web wrapper induction: A brief survey. AI Communications, 17:57-61, 2004.
[8] A. Hogue and D. Karger. Thresher: Automating the unwrapping of semantic content from the world wide web. In WWW-2005, pages 86-95.
[9] C.-N. Hsu and M.-T. Dung. Generating finite-state transducers for semi-structured data extraction from the web. Information Systems, Special Issue on Semistructured Data, 23(8):521-538, 1998.
[10] N. Kushmerick, D. S. Weld, and R. B. Doorenbos. Wrapper induction for information extraction. In IJCAI-1997, pages 729-737, 1997.
[11] A. H. F. Laender, B. A. Ribeiro-Neto, A. S. da Silva, and J. S. Teixeira. A brief survey of web data extraction tools. SIGMOD Record, 31(2):84-93, 2002.
[13] B. Liu, R. Grossman, and Y. Zhai. Mining data records in web pages. In SIGKDD-2003, pages 601-606.
[14] L. Liu, C. Pu, and W. Han. XWRAP: An XML-enabled wrapper construction system for web information sources. In ICDE-2000, pages 611-621, 2000.
[16] Z. Nie, J.-R. Wen, and W.-Y. Ma. Object-level vertical search. In CIDR-2007, pages 235-246.
[17] D. C. Reis, P. B. Golgher, A. S. Silva, and A. F. Laender. Automatic web news extraction using tree edit distance. In WWW-2004, pages 502-511.
[18] S. Sarawagi. Automation in information extraction and data integration (tutorial). In VLDB-2002.
[19] J. Wang and F. H. Lochovsky. Data extraction and label assignment for web databases. In WWW-2003, pages 187-196.
[20] P. Willett. Recent trends in hierarchic document clustering: A critical review. Information Processing and Management, 24(5):577-597, 1988.
[21] H. Zhao, W. Meng, Z. Wu, V. Raghavan, and C. Yu. Fully automatic wrapper generation for search engines. In WWW-2005, pages 66-75.
[22] S. Zheng, M. R. Scott, R. Song, and J.-R. Wen. Pictor: An interactive system for importing data from a website. In SIGKDD-2008, 2008.
[23] S. Zheng, R. Song, D. Wu, and J.-R. Wen. Joint optimization of wrapper generation and template detection. In SIGKDD-2007.