-
1
-
-
1142303684
-
Extracting structured data from web pages
-
New York, NY, USA, ACM Press
-
A. Arasu and H. Garcia-Molina. Extracting structured data from web pages. In SIGMOD '03: Proceedings of the 2003 ACM SIGMOD international conference on Management of data, pages 337-348, New York, NY, USA, 2003. ACM Press.
-
(2003)
SIGMOD '03: Proceedings of the 2003 ACM SIGMOD international conference on Management of data
, pp. 337-348
-
-
Arasu, A.1
Garcia-Molina, H.2
-
2
-
-
0038589165
-
The anatomy of a large-scale hypertextual web search engine
-
S. Brin and L. Page. The anatomy of a large-scale hypertextual web search engine. Comput. Netw. ISDN Syst., 30(1-7):107-117, 1998.
-
(1998)
Comput. Netw. ISDN Syst
, vol.30
, Issue.1-7
, pp. 107-117
-
-
Brin, S.1
Page, L.2
-
3
-
-
33645316810
-
Link-based similarity measures for the classification of web documents
-
P. Calado, M. Cristo, M. A. Gonçalves, E. S. de Moura, B. Ribeiro-Neto, and N. Ziviani. Link-based similarity measures for the classification of web documents. J. Am. Soc. Inf. Sci. Technol., 57(2):208-221, 2006.
-
(2006)
J. Am. Soc. Inf. Sci. Technol
, vol.57
, Issue.2
, pp. 208-221
-
-
Calado, P.1
Cristo, M.2
Gonçalves, M.A.3
de Moura, E.S.4
Ribeiro-Neto, B.5
Ziviani, N.6
-
4
-
-
34247196659
-
A comparative study of citations and links in document classification
-
New York, NY, USA, ACM Press
-
T. Couto, M. Cristo, M. A. Gonçalves, P. Calado, N. Ziviani, E. Moura, and B. Ribeiro-Neto. A comparative study of citations and links in document classification. In JCDL '06: Proceedings of the 6th ACM/IEEE-CS joint conference on Digital libraries, pages 75-84, New York, NY, USA, 2006. ACM Press.
-
(2006)
JCDL '06: Proceedings of the 6th ACM/IEEE-CS joint conference on Digital libraries
, pp. 75-84
-
-
Couto, T.1
Cristo, M.2
Gonçalves, M.A.3
Calado, P.4
Ziviani, N.5
Moura, E.6
Ribeiro-Neto, B.7
-
5
-
-
84944327150
-
Roadrunner: Towards automatic data extraction from large web sites
-
San Francisco, CA, USA, Morgan Kaufmann Publishers Inc
-
V. Crescenzi, G. Mecca, and P. Merialdo. Roadrunner: Towards automatic data extraction from large web sites. In VLDB '01: Proceedings of the 27th International Conference on Very Large Data Bases, pages 109-118, San Francisco, CA, USA, 2001. Morgan Kaufmann Publishers Inc.
-
(2001)
VLDB '01: Proceedings of the 27th International Conference on Very Large Data Bases
, pp. 109-118
-
-
Crescenzi, V.1
Mecca, G.2
Merialdo, P.3
-
6
-
-
33745697585
-
A knowledge-based approach to citation extraction
-
New York, NY, USA, IEEE Systems, Man, and Cybernetics Society
-
M.-Y. Day, T.-H. Tsai, C.-L. Sung, C.-W. Lee, S.-H. Wu, C.-S. Ong, and W.-L. Hsu. A knowledge-based approach to citation extraction. In IRI '05: Proceedings of the 2005 IEEE International Conference on Information Reuse and Integration, pages 50-55, New York, NY, USA, 2005. IEEE Systems, Man, and Cybernetics Society.
-
(2005)
IRI '05: Proceedings of the 2005 IEEE International Conference on Information Reuse and Integration
, pp. 50-55
-
-
Day, M.-Y.1
Tsai, T.-H.2
Sung, C.-L.3
Lee, C.-W.4
Wu, S.-H.5
Ong, C.-S.6
Hsu, W.-L.7
-
7
-
-
0033225222
-
Conceptual-model-based data extraction from multiple-record web pages
-
D. W. Embley, D. M. Campbell, Y. S. Jiang, S. W. Liddle, D. W. Lonsdale, Y.-K. Ng, and R. D. Smith. Conceptual-model-based data extraction from multiple-record web pages. Data Knowl. Eng., 31(3):227-251, 1999.
-
(1999)
Data Knowl. Eng
, vol.31
, Issue.3
, pp. 227-251
-
-
Embley, D.W.1
Campbell, D.M.2
Jiang, Y.S.3
Liddle, S.W.4
Lonsdale, D.W.5
Ng, Y.-K.6
Smith, R.D.7
-
10
-
-
84941274546
-
Automatic document metadata extraction using support vector machines
-
IEEE Computer Society
-
H. Han, C. L. Giles, E. Manavoglu, H. Zha, Z. Zhang, and E. A. Fox. Automatic document metadata extraction using support vector machines. In ACM/IEEE Joint Conference on Digital Libraries, JCDL 2003, pages 37-48. IEEE Computer Society, 2003.
-
(2003)
ACM/IEEE Joint Conference on Digital Libraries, JCDL 2003
, pp. 37-48
-
-
Han, H.1
Giles, C.L.2
Manavoglu, E.3
Zha, H.4
Zhang, Z.5
Fox, E.A.6
-
11
-
-
0032309862
-
Generating finite-state transducers for semi-structured data extraction from the web
-
C.-N. Hsu and M.-T. Dung. Generating finite-state transducers for semi-structured data extraction from the web. Inf. Syst., 23(9):521-538, 1998.
-
(1998)
Inf. Syst
, vol.23
, Issue.9
, pp. 521-538
-
-
Hsu, C.-N.1
Dung, M.-T.2
-
12
-
-
27544505132
-
Automatic extraction of titles from general documents using machine learning
-
Tools & techniques: supporting classification
-
Y. Hu, H. Li, Y. Cao, D. Meyerzon, and Q. Zheng. Automatic extraction of titles from general documents using machine learning. In JCDL '05: Proceedings of the 5th ACM/IEEE-CS Joint Conference on Digital Libraries, Tools & techniques: supporting classification, pages 145-154, 2005.
-
(2005)
JCDL '05: Proceedings of the 5th ACM/IEEE-CS Joint Conference on Digital Libraries
, pp. 145-154
-
-
Hu, Y.1
Li, H.2
Cao, Y.3
Meyerzon, D.4
Zheng, Q.5
-
13
-
-
0034172374
-
Wrapper induction: Efficiency and expressiveness
-
N. Kushmerick. Wrapper induction: efficiency and expressiveness. Artif. Intell., 118(1-2):15-68, 2000.
-
(2000)
Artif. Intell
, vol.118
, Issue.1-2
, pp. 15-68
-
-
Kushmerick, N.1
-
14
-
-
0036466676
-
Debye - data extraction by example
-
A. H. F. Laender, B. A. Ribeiro-Neto, and A. S. da Silva. Debye - data extraction by example. Data Knowl. Eng., 40(2):121-154, 2002.
-
(2002)
Data Knowl. Eng
, vol.40
, Issue.2
, pp. 121-154
-
-
Laender, A.H.F.1
Ribeiro-Neto, B.A.2
da Silva, A.S.3
-
15
-
-
0037806547
-
A brief survey of web data extraction tools
-
A. H. F. Laender, B. A. Ribeiro-Neto, A. S. da Silva, and J. S. Teixeira. A brief survey of web data extraction tools. SIGMOD Record, 31(2):84-93, 2002.
-
(2002)
SIGMOD Record
, vol.31
, Issue.2
, pp. 84-93
-
-
Laender, A.H.F.1
Ribeiro-Neto, B.A.2
da Silva, A.S.3
Teixeira, J.S.4
-
16
-
-
0032640910
-
Digital libraries and autonomous citation indexing
-
S. Lawrence, C. L. Giles, and K. Bollacker. Digital libraries and autonomous citation indexing. Computer, 32(6):67-71, 1999.
-
(1999)
Computer
, vol.32
, Issue.6
, pp. 67-71
-
-
Lawrence, S.1
Giles, C.L.2
Bollacker, K.3
-
17
-
-
36348994653
-
Are your citations clean? new scenarios and challenges in maintaining digital libraries
-
To appear in
-
D. Lee, J. Kang, P. Mitra, C. L. Giles, and B.-W. On. Are your citations clean? new scenarios and challenges in maintaining digital libraries. To appear in Communications of the ACM, 2007.
-
(2007)
Communications of the ACM
-
-
Lee, D.1
Kang, J.2
Mitra, P.3
Giles, C.L.4
On, B.-W.5
-
18
-
-
77952333945
-
Mining data records in web pages
-
New York, NY, USA, ACM Press
-
B. Liu, R. Grossman, and Y. Zhai. Mining data records in web pages. In KDD '03: Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining, pages 601-606, New York, NY, USA, 2003. ACM Press.
-
(2003)
KDD '03: Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining
, pp. 601-606
-
-
Liu, B.1
Grossman, R.2
Zhai, Y.3
-
19
-
-
33947169439
-
Labrador: Efficiently publishing relational databases on the web by using keyword-based query interfaces
-
Article in Press, Corrected Proof
-
F. Mesquita, A. S. da Silva, E. S. de Moura, P. Calado, and A. H. F. Laender. Labrador: Efficiently publishing relational databases on the web by using keyword-based query interfaces. Information Processing & Management, 2007. Article in Press, Corrected Proof.
-
(2007)
Information Processing & Management
-
-
Mesquita, F.1
da Silva, A.S.2
de Moura, E.S.3
Calado, P.4
Laender, A.H.F.5
-
20
-
-
0035587215
-
Hierarchical wrapper induction for semistructured information sources
-
I. Muslea, S. Minton, and C. A. Knoblock. Hierarchical wrapper induction for semistructured information sources. Autonomous Agents and Multi-Agent Systems, 4(1-2):93-114, 2001.
-
(2001)
Autonomous Agents and Multi-Agent Systems
, vol.4
, Issue.1-2
, pp. 93-114
-
-
Muslea, I.1
Minton, S.2
Knoblock, C.A.3
-
21
-
-
23844500283
-
-
Developing practical automatic metadata assignment and evaluation tools for internet resources
-
G. W. Paynter. Developing practical automatic metadata assignment and evaluation tools for internet resources. In M. Marlino, T. Sumner, and F. M. Shipman III, editors, ACM/IEEE Joint Conference on Digital Libraries, JCDL 2005, Denver, CO, USA, June 7-11, 2005, Proceedings, pages 291-300. ACM, 2005.
-
-
-
-
22
-
-
4644340823
-
Automatic web news extraction using tree edit distance
-
New York, NY, USA, ACM Press
-
D. C. Reis, P. B. Golgher, A. S. Silva, and A. F. Laender. Automatic web news extraction using tree edit distance. In WWW '04: Proceedings of the 13th international conference on World Wide Web, pages 502-511, New York, NY, USA, 2004. ACM Press.
-
(2004)
WWW '04: Proceedings of the 13th international conference on World Wide Web
, pp. 502-511
-
-
Reis, D.C.1
Golgher, P.B.2
Silva, A.S.3
Laender, A.F.4
-
23
-
-
0032624184
-
Learning information extraction rules for semi-structured and free text
-
S. Soderland. Learning information extraction rules for semi-structured and free text. Machine Learning, 34(1-3):233-272, 1999.
-
(1999)
Machine Learning
, vol.34
, Issue.1-3
, pp. 233-272
-
-
Soderland, S.1
-
24
-
-
4944255256
-
-
Metaextract: an NLP system to automatically assign metadata
-
O. Yilmazel, C. M. Finneran, and E. D. Liddy. Metaextract: an NLP system to automatically assign metadata. In JCDL '04: Proceedings of the 4th ACM/IEEE-CS Joint Conference on Digital Libraries, Collaboration and group work, pages 241-242, 2004.
-
-
-