1. Adelberg, B.: NoDoSE-a tool for semi-automatically extracting structured and semistructured data from text documents. SIGMOD Rec. 27(2), 283-294 (1998)
2. Baumgartner, R., Flesca, S., Gottlob, G.: Visual Web Information Extraction with Lixto. In: Proceedings of the 27th International Conference on Very Large Data Bases, pp. 119-128. Morgan Kaufmann Publishers, San Francisco (2001)
3. Baumgartner, R., Gatterbauer, W., Gottlob, G.: Web data extraction system. In: Encyclopedia of Database Systems, pp. 3465-3471. Springer (2009)
4. Crescenzi, V., Mecca, G., Merialdo, P.: RoadRunner: Towards automatic data extraction from large web sites. In: Proceedings of the International Conference on Very Large Data Bases, pp. 109-118 (2001)
6. Finkel, J., Grenager, T., Manning, C.: Incorporating non-local information into information extraction systems by Gibbs sampling. In: Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics, pp. 363-370. Association for Computational Linguistics (2005)
7. Geibel, P., Pustylnikov, O., Mehler, A., Gust, H., Kühnberger, K.-U.: Classification of documents based on the structure of their DOM trees. In: Ishikawa, M., Doya, K., Miyamoto, H., Yamakawa, T. (eds.) ICONIP 2007, Part II. LNCS, vol. 4985, pp. 779-788. Springer, Heidelberg (2008)
8. Kohlschütter, C., Fankhauser, P., Nejdl, W.: Boilerplate detection using shallow text features. In: Proceedings of the Third ACM International Conference on Web Search and Data Mining, WSDM 2010, pp. 441-450. ACM, New York (2010)
9. Kushmerick, N.: Wrapper induction: Efficiency and expressiveness. Artificial Intelligence 118(1), 15-68 (2000)
10. Laender, A., Ribeiro-Neto, B., Da Silva, A., Teixeira, J.: A brief survey of web data extraction tools. ACM SIGMOD Record 31(2), 84-93 (2002)
11. Liu, L., Pu, C., Han, W.: XWrap: An extensible wrapper construction system for internet information. In: Proceedings of the 16th International Conference on Data Engineering (ICDE 2000), San Diego, CA, pp. 611-621. IEEE (2000)
12. Muslea, I., Minton, S., Knoblock, C.: Hierarchical wrapper induction for semistructured information sources. Autonomous Agents and Multi-Agent Systems 4(1), 93-114 (2001)
13. Oita, M., Senellart, P.: Archiving data objects using Web feeds. In: Proceedings of International Web Archiving Workshop, Vienna, Austria, pp. 31-41 (2010)
14. Pennock, M., Davis, R.: ArchivePress: A Really Simple Solution to Archiving Blog Content. In: Sixth International Conference on Preservation of Digital Objects (iPRES 2009), California Digital Library, San Francisco, USA (October 2009)
17. Yates, A., Cafarella, M., Banko, M., Etzioni, O., Broadhead, M., Soderland, S.: TextRunner: Open information extraction on the web. In: Proceedings of Human Language Technologies: The Annual Conference of the North American Chapter of the Association for Computational Linguistics, pp. 25-26 (2007)