-
1
-
-
37349086786
-
Extracting lists of data records from semi-structured web pages
-
Manuel Álvarez, Alberto Pan, Juan Raposo, Fernando Bellas and Fidel Cacheda, Extracting lists of data records from semi-structured web pages, Data Knowl Eng 64(2) (2008), 491-509.
-
(2008)
Data Knowl Eng
, vol.64
, Issue.2
, pp. 491-509
-
-
Álvarez, M.1
Pan, A.2
Raposo, J.3
Bellas, F.4
Cacheda, F.5
-
2
-
-
33746744370
-
The Lixto project: Exploring new frontiers of web data extraction
-
Springer-Verlag Berlin Heidelberg
-
Julien Carme, Michal Ceresna, Oliver Frölich, Georg Gottlob, Tamir Hassan, Marcus Herzog, Wolfgang Holzinger and Bernhard Krüpl, The Lixto project: Exploring new frontiers of web data extraction, in: BNCOD 2006, Springer-Verlag Berlin Heidelberg, 2006, pp. 1-15.
-
(2006)
BNCOD 2006
, pp. 1-15
-
-
Carme, J.1
Ceresna, M.2
Frölich, O.3
Gottlob, G.4
Hassan, T.5
Herzog, M.6
Holzinger, W.7
Krüpl, B.8
-
3
-
-
2442546444
-
Probe, cluster, and discover: Focused extraction of qa-pagelets from the deep web
-
James Caverlee, Ling Liu and David Buttler, Probe, cluster, and discover: Focused extraction of qa-pagelets from the deep web, in: ICDE, 2004, pp. 103-115.
-
(2004)
ICDE
, pp. 103-115
-
-
Caverlee, J.1
Liu, L.2
Buttler, D.3
-
4
-
-
33748336500
-
A survey of web information extraction systems
-
Chia-Hui Chang, Mohammed Kayed, Moheb Ramzy Girgis and Khaled Shaalan, A survey of web information extraction systems, IEEE Transactions on Knowledge and Data Engineering 18(10) (2006), 1411-1428.
-
(2006)
IEEE Transactions on Knowledge and Data Engineering
, vol.18
, Issue.10
, pp. 1411-1428
-
-
Chang, C.-H.1
Kayed, M.2
Girgis, M.R.3
Shaalan, K.4
-
5
-
-
0002607026
-
Bayesian classification (AutoClass): Theory and results
-
American Association for Artificial Intelligence USA
-
Peter Cheeseman and John Stutz, Bayesian classification (AutoClass): Theory and results, in: Advances in Knowledge Discovery and Data Mining, American Association for Artificial Intelligence USA, 1996, pp. 153-180.
-
(1996)
Advances in Knowledge Discovery and Data Mining
, pp. 153-180
-
-
Cheeseman, P.1
Stutz, J.2
-
7
-
-
4644340823
-
-
Automatic web news extraction using tree edit distance
-
Davi de Castro Reis, Paulo Braz Golgher, Altigran Soares da Silva and Alberto H.F. Laender, Automatic web news extraction using tree edit distance, in: WWW, 2004, pp. 502-511.
-
-
-
-
9
-
-
38149070921
-
Automatic data record detection in web pages
-
Xiaoying Gao, Le Phong Bao Vuong and Mengjie Zhang, Automatic data record detection in web pages, in: KSEM, 2007, pp. 349-361.
-
(2007)
KSEM
, pp. 349-361
-
-
Gao, X.1
Vuong, L.P.B.2
Zhang, M.3
-
10
-
-
4644285655
-
Automatic pattern construction for web information extraction
-
Xiaoying Gao, Mengjie Zhang and Peter Andreae, Automatic pattern construction for web information extraction, International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems 12(4) (2004), 447-470.
-
(2004)
International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems
, vol.12
, Issue.4
, pp. 447-470
-
-
Gao, X.1
Zhang, M.2
Andreae, P.3
-
11
-
-
50249174308
-
Overview of autofeed: An unsupervised learning system for generating webfeeds
-
Bora Gazen and Steven Minton, Overview of autofeed: An unsupervised learning system for generating webfeeds, in: AAAI, 2006.
-
(2006)
AAAI
-
-
Gazen, B.1
Minton, S.2
-
12
-
-
84893405732
-
Data clustering: A review
-
A.K. Jain, M.N. Murty and P.J. Flynn, Data clustering: a review, ACM Computing Surveys 31(3) (1999), 264-323.
-
(1999)
ACM Computing Surveys
, vol.31
, Issue.3
, pp. 264-323
-
-
Jain, A.K.1
Murty, M.N.2
Flynn, P.J.3
-
15
-
-
0037806547
-
-
A brief survey of web data extraction tools
-
Alberto H.F. Laender, Berthier A. Ribeiro-Neto, Altigran S. da Silva and Juliana S. Teixeira, A brief survey of web data extraction tools, SIGMOD Record 31(2) (2002).
-
-
-
-
19
-
-
50249091388
-
-
Repository of online information sources used in information extraction tasks, 2005
-
Ion Muslea, Repository of online information sources used in information extraction tasks. www.isi.edu/info-agents/rise/repository.html, 2005.
-
Ion Muslea
-
-
-
21
-
-
0032684968
-
A hierarchical approach to wrapper induction
-
ACM Press, Seattle, WA, USA
-
Ion Muslea, Steve Minton and Craig Knoblock, A hierarchical approach to wrapper induction, in: Oren Etzioni, Jörg P. Müller and Jeffrey M. Bradshaw, eds, Proceedings of the Third International Conference on Autonomous Agents (Agents'99), Seattle, WA, USA, 1999. ACM Press, pp. 190-197.
-
(1999)
Proceedings of the Third International Conference on Autonomous Agents (Agents'99)
, pp. 190-197
-
-
Muslea, I.1
Minton, S.2
Knoblock, C.3
-
23
-
-
30344457312
-
Stavies: A system for information extraction from unknown web data sources through automatic web wrapper generation using clustering techniques
-
Nikolaos Papadakis, Dimitrios Skoutas, Konstantinos Raftopoulos and Theodora A. Varvarigou, Stavies: A system for information extraction from unknown web data sources through automatic web wrapper generation using clustering techniques, IEEE Trans Knowl Data Eng 17(12) (2005), 1638-1652.
-
(2005)
IEEE Trans Knowl Data Eng
, vol.17
, Issue.12
, pp. 1638-1652
-
-
Papadakis, N.1
Skoutas, D.2
Raftopoulos, K.3
Varvarigou, T.A.4
-
25
-
-
0016572913
-
A vector space model for automatic indexing
-
G. Salton, A. Wong and C.S. Yang, A vector space model for automatic indexing, Commun ACM 18(11) (1975), 613-620.
-
(1975)
Commun ACM
, vol.18
, Issue.11
, pp. 613-620
-
-
Salton, G.1
Wong, A.2
Yang, C.S.3
-
26
-
-
0019887799
-
Identification of common molecular subsequences
-
T.F. Smith and M.S. Waterman, Identification of common molecular subsequences, Journal of Molecular Biology 147 (1981), 195-197.
-
(1981)
Journal of Molecular Biology
, vol.147
, pp. 195-197
-
-
Smith, T.F.1
Waterman, M.S.2
-
27
-
-
33747656968
-
-
Adaptive information extraction
-
Jordi Turmo, Alicia Ageno and Neus Català, Adaptive information extraction, ACM Computing Surveys 38(2) (2006).
-
-
-
-
29
-
-
42549093993
-
Data extraction from semi-structured web pages by clustering
-
Le Phong Bao Vuong, Xiaoying Gao and Mengjie Zhang, Data extraction from semi-structured web pages by clustering, in: Web Intelligence, 2006, pp. 374-377.
-
(2006)
Web Intelligence
, pp. 374-377
-
-
Vuong, L.P.B.1
Gao, X.2
Zhang, M.3
-
31
-
-
33744821948
-
Web data extraction based on partial tree alignment
-
ACM, New York, NY, USA
-
Yanhong Zhai and Bing Liu, Web data extraction based on partial tree alignment, in: WWW '05: Proceedings of the 14th international conference on World Wide Web, New York, NY, USA, 2005. ACM, pp. 76-85.
-
(2005)
WWW '05: Proceedings of the 14th international conference on World Wide Web
, pp. 76-85
-
-
Zhai, Y.1
Liu, B.2