-
3
-
-
33745518684
-
Disambiguating web appearances of people in a social network
-
Bekkerman, R., and McCallum, A. Disambiguating Web Appearances of People in a Social Network. In Proc. of the WWW, 2005.
-
(2005)
Proc. of the WWW
-
-
Bekkerman, R.1
McCallum, A.2
-
4
-
-
29844458352
-
Iterative record linkage for cleaning and integration
-
Bhattacharya, I., and Getoor, L. Iterative record linkage for cleaning and integration. In DMKD, 2004.
-
(2004)
DMKD
-
-
Bhattacharya, I.1
Getoor, L.2
-
5
-
-
83055165536
-
Collective information extraction with relational Markov networks
-
Bunescu, R. C., and Mooney, R. J. Collective information extraction with relational Markov networks. In Proc. of ACL, 2004.
-
(2004)
Proc. of ACL
-
-
Bunescu, R.C.1
Mooney, R.J.2
-
6
-
-
0035000412
-
A fully automated object extraction system for the world wide web
-
Buttler, D., Liu, L., and Pu, C. A Fully Automated Object Extraction System for the World Wide Web. In Proc. of IEEE ICDCS, 2001.
-
(2001)
Proc. of IEEE ICDCS
-
-
Buttler, D.1
Liu, L.2
Pu, C.3
-
7
-
-
8644236286
-
VIPS: A vision-based page segmentation algorithm
-
Cai, D., Yu, S., Wen, J.-R., and Ma, W.-Y. VIPS: a Vision-based Page Segmentation Algorithm. Microsoft Technical Report, MSR-TR-2003-79, 2003.
-
(2003)
Microsoft Technical Report, MSR-TR-2003-79
-
-
Cai, D.1
Yu, S.2
Wen, J.-R.3
Ma, W.-Y.4
-
8
-
-
8644267730
-
Block-based web search
-
Cai, D., Yu, S., Wen, J. R., and Ma, W. Y. Block-based web search. In ACM SIGIR Conference, 2004.
-
(2004)
ACM SIGIR Conference
-
-
Cai, D.1
Yu, S.2
Wen, J.R.3
Ma, W.Y.4
-
10
-
-
2342568689
-
IEPAD: Information extraction based on pattern discovery
-
Chang, C.-H., and Liu, S.-L. IEPAD: Information Extraction Based on Pattern Discovery. In Proc. of WWW, 2001.
-
(2001)
Proc. of WWW
-
-
Chang, C.-H.1
Liu, S.-L.2
-
12
-
-
12244290581
-
Exploiting dictionaries in named entity extraction: Combining semi-markov extraction processes and data integration methods
-
Cohen, W. W., and Sarawagi, S. Exploiting Dictionaries in Named Entity Extraction: Combining Semi-Markov Extraction Processes and Data Integration Methods. In Proc. of SIGKDD, 2004.
-
(2004)
Proc. of SIGKDD
-
-
Cohen, W.W.1
Sarawagi, S.2
-
14
-
-
84944327150
-
ROADRUNNER: Towards automatic data extraction from large web sites
-
Crescenzi, V., Mecca, G., and Merialdo, P. ROADRUNNER: Towards Automatic Data Extraction from Large Web Sites. In Proc. of VLDB, 2001.
-
(2001)
Proc. of VLDB
-
-
Crescenzi, V.1
Mecca, G.2
Merialdo, P.3
-
15
-
-
18744368587
-
Object matching for information integration: A profiler-based approach
-
Doan, A., Lu, Y., Lee, Y., and Han, J. Object matching for information integration: a profiler-based approach. In IIWeb, 2003.
-
(2003)
IIWeb
-
-
Doan, A.1
Lu, Y.2
Lee, Y.3
Han, J.4
-
17
-
-
29844452555
-
Reference Reconciliation in Complex Information Spaces
-
Dong, X., Halevy, A., and Madhavan, J. Reference Reconciliation in Complex Information Spaces. In Proc. of SIGMOD, 2005.
-
(2005)
Proc. of SIGMOD
-
-
Dong, X.1
Halevy, A.2
Madhavan, J.3
-
19
-
-
84880504449
-
Searching the workplace web
-
Fagin, R., Kumar, R., McCurley, K., Novak, J., Sivakumar, D., Tomlin, J., and Williamson, D. Searching the Workplace Web. In Proceedings of the Twelfth International World Wide Web Conference, 2003.
-
(2003)
Proceedings of the Twelfth International World Wide Web Conference
-
-
Fagin, R.1
Kumar, R.2
McCurley, K.3
Novak, J.4
Sivakumar, D.5
Tomlin, J.6
Williamson, D.7
-
20
-
-
0032119668
-
The hierarchical hidden Markov model: Analysis and applications
-
Fine, S., Singer, Y., and Tishby, N. The hierarchical hidden Markov model: Analysis and applications. Machine Learning, 32:41-62, 1998.
-
(1998)
Machine Learning
, vol.32
, pp. 41-62
-
-
Fine, S.1
Singer, Y.2
Tishby, N.3
-
21
-
-
22944476409
-
Multi-level boundary classification for information extraction
-
Finn, A., and Kushmerick, N. Multi-level boundary classification for information extraction. In Proc. ECML, 2004.
-
(2004)
Proc. ECML
-
-
Finn, A.1
Kushmerick, N.2
-
22
-
-
19844380495
-
The Google file system
-
Ghemawat, S., Gobioff, H., and Leung, S.-T. The Google File System. In Proc. of SOSP, 2003.
-
(2003)
Proc. of SOSP
-
-
Ghemawat, S.1
Gobioff, H.2
Leung, S.-T.3
-
24
-
-
84880467474
-
Text Joins in an RDBMS for Web Data Integration
-
Gravano, L., Ipeirotis, P., Koudas, N., and Srivastava, D. Text Joins in an RDBMS for Web Data Integration. In Proc. of WWW, 2003.
-
(2003)
Proc. of WWW
-
-
Gravano, L.1
Ipeirotis, P.2
Koudas, N.3
Srivastava, D.4
-
25
-
-
1142279460
-
XRANK: Ranked keyword search over XML documents
-
Guo, L., Shao, F., Botev, C., and Shanmugasundaram, J. XRANK: Ranked keyword search over XML documents. In ACM SIGMOD, 2003.
-
(2003)
ACM SIGMOD
-
-
Guo, L.1
Shao, F.2
Botev, C.3
Shanmugasundaram, J.4
-
26
-
-
4944235920
-
Two supervised learning approaches for name disambiguation in author citations
-
Han, H., Giles, L., Zha, H., Li, C. and Tsioutsiouliklis, K. Two supervised learning approaches for name disambiguation in author citations. In JCDL 2004.
-
(2004)
JCDL
-
-
Han, H.1
Giles, L.2
Zha, H.3
Li, C.4
Tsioutsiouliklis, K.5
-
30
-
-
0034172374
-
Wrapper induction: Efficiency and expressiveness
-
Kushmerick, N. Wrapper induction: efficiency and expressiveness. Artificial Intelligence, 118:15-68, 2000.
-
(2000)
Artificial Intelligence
, vol.118
, pp. 15-68
-
-
Kushmerick, N.1
-
31
-
-
0142192295
-
Conditional random fields: Probabilistic models for segmenting and labelling sequence data
-
Lafferty, J., McCallum, A., and Pereira, F. Conditional random fields: Probabilistic models for segmenting and labelling sequence data. In Proc. of ICML, 2001.
-
(2001)
Proc. of ICML
-
-
Lafferty, J.1
McCallum, A.2
Pereira, F.3
-
32
-
-
0030656747
-
Dempster-Shafer's theory of evidence applied to structured documents: Modeling uncertainty
-
Lalmas, M. Dempster-Shafer's Theory of Evidence Applied to Structured Documents: Modeling Uncertainty. In Proceedings of SIGIR, 1997.
-
(1997)
Proceedings of SIGIR
-
-
Lalmas, M.1
-
33
-
-
2142760102
-
Cleaning the spurious links in data
-
Mar-Apr.
-
Lee, M., Hsu, W., and Kothari, V. Cleaning the spurious links in data. IEEE Intelligent Systems, Mar-Apr, 2004.
-
(2004)
IEEE Intelligent Systems
-
-
Lee, M.1
Hsu, W.2
Kothari, V.3
-
34
-
-
3142742483
-
Using the structure of web sites for automatic segmentation of tables
-
Lerman, K., Getoor, L., Minton, S., and Knoblock, C. Using the Structure of Web Sites for Automatic Segmentation of Tables. In Proc. of ACM SIGMOD, 2004.
-
(2004)
Proc. of ACM SIGMOD
-
-
Lerman, K.1
Getoor, L.2
Minton, S.3
Knoblock, C.4
-
37
-
-
0034592784
-
Efficient clustering of high-dimensional data sets with application to reference matching
-
McCallum, A., Nigam, K., and Ungar, L. Efficient Clustering of High-Dimensional Data Sets with Application to Reference Matching. In Proc. of SIGKDD, 2000.
-
(2000)
Proc. of SIGKDD
-
-
McCallum, A.1
Nigam, K.2
Ungar, L.3
-
38
-
-
0032656760
-
Estimating the usefulness of search engines
-
Meng, M., Liu, K., Yu, C., Wu, W., and Rishe, N. Estimating the usefulness of search engines. In ICDE Conference, 1999.
-
(1999)
ICDE Conference
-
-
Meng, M.1
Liu, K.2
Yu, C.3
Wu, W.4
Rishe, N.5
-
39
-
-
33645967226
-
Exploiting secondary sources for unsupervised record linkage
-
Michalowski, M., Thakkar, S., and Knoblock, C. Exploiting secondary sources for unsupervised record linkage. In IIWeb, 2004.
-
(2004)
IIWeb
-
-
Michalowski, M.1
Thakkar, S.2
Knoblock, C.3
-
40
-
-
0035587215
-
Hierarchical wrapper induction for semi-structured information sources
-
2001
-
Muslea, I., Minton, S., and Knoblock, C. A. Hierarchical Wrapper Induction for Semi-structured Information Sources. Autonomous Agents and Multi-Agent Systems, 4(1/2), 2001.
-
(2001)
Autonomous Agents and Multi-Agent Systems
, vol.4
, Issue.1-2
-
-
Muslea, I.1
Minton, S.2
Knoblock, C.A.3
-
41
-
-
0001868006
-
A mutually beneficial integration of data mining and information extraction
-
Nahm, U. Y., and Mooney, R. J. A Mutually Beneficial Integration of Data Mining and Information Extraction. In Proc. of AAAI, 2001.
-
(2001)
Proc. of AAAI
-
-
Nahm, U.Y.1
Mooney, R.J.2
-
42
-
-
33749546453
-
Object-level Ranking: Bringing Order to Web Objects
-
Nie, Z., Zhang, Y., Wen, J.-R., and Ma, W.-Y. Object-level Ranking: Bringing Order to Web Objects. In Proc. WWW, 2005.
-
(2005)
Proc. WWW
-
-
Nie, Z.1
Zhang, Y.2
Wen, J.-R.3
Ma, W.-Y.4
-
43
-
-
33749618104
-
Extracting objects from the web
-
Nie, Z., Wu, F., Wen, J.-R., and Ma, W.-Y. Extracting Objects from the Web. In Proc. of ICDE, 2006.
-
(2006)
Proc. of ICDE
-
-
Nie, Z.1
Wu, F.2
Wen, J.-R.3
Ma, W.-Y.4
-
44
-
-
1542287497
-
Combining document representations for known item search
-
Ogilvie, P., and Callan, J. Combining Document Representations for Known-Item Search. In Proceedings of SIGIR, 2003.
-
(2003)
Proceedings of SIGIR
-
-
Ogilvie, P.1
Callan, J.2
-
45
-
-
34247276592
-
An effective approach to entity resolution problem using quasi-clique and its application to digital libraries
-
On, B., Elmacioglu, E., Lee, D., Kang, J., and Pei, J. An Effective Approach to Entity Resolution Problem Using Quasi-Clique and its Application to Digital Libraries. In JCDL, 2006.
-
(2006)
JCDL
-
-
On, B.1
Elmacioglu, E.2
Lee, D.3
Kang, J.4
Pei, J.5
-
46
-
-
0003780986
-
The pagerank citation ranking: Bringing order to the web
-
Page, L., Brin, S., Motwani, R., and Winograd, T. The PageRank Citation Ranking: Bringing Order to the Web. Technical Report, Stanford Digital Library Technologies Project, 1998.
-
(1998)
Technical Report, Stanford Digital Library Technologies Project
-
-
Page, L.1
Brin, S.2
Motwani, R.3
Winograd, T.4
-
48
-
-
34047192804
-
Semi-markov conditional random fields for information extraction
-
Sarawagi, S., and Cohen, W. W. Semi-Markov Conditional Random Fields for Information Extraction. In Proc. of NIPS, 2004.
-
(2004)
Proc. of NIPS
-
-
Sarawagi, S.1
Cohen, W.W.2
-
49
-
-
84880805498
-
Hierarchical hidden markov models for information extraction
-
Skounakis, M., Craven, M., and Ray, S. Hierarchical Hidden Markov Models for Information Extraction. In Proc. of IJCAI, 2003.
-
(2003)
Proc. of IJCAI
-
-
Skounakis, M.1
Craven, M.2
Ray, S.3
-
50
-
-
18744381159
-
Learning block importance models for webpages
-
Song, R., Liu, H., Wen, J. R., and Ma, W. Y. Learning Block Importance Models for Webpages. In Proc. of WWW, 2004.
-
(2004)
Proc. of WWW
-
-
Song, R.1
Liu, H.2
Wen, J.R.3
Ma, W.Y.4
-
51
-
-
33745618477
-
C-Store: A column oriented DBMS
-
Stonebraker, M., et al., C-Store: A Column Oriented DBMS. In Proc. of VLDB, pages 553-564, 2005.
-
(2005)
Proc. of VLDB
, pp. 553-564
-
-
Stonebraker, M.1
-
52
-
-
14344253846
-
Dynamic conditional random fields: Factorized probabilistic models for labeling and segmenting sequence data
-
Sutton, C., Rohanimanesh, K., and McCallum, A. Dynamic Conditional Random Fields: Factorized Probabilistic Models for Labeling and Segmenting Sequence Data. In Proc. ICML, 2004.
-
(2004)
Proc. ICML
-
-
Sutton, C.1
Rohanimanesh, K.2
McCallum, A.3
-
53
-
-
29244441994
-
An integrated, conditional model of information extraction and coreference with application to citation matching
-
Wellner, B., McCallum, A., Peng, F., and Hay, M. An Integrated, Conditional Model of Information Extraction and Coreference with Application to Citation Matching. In Proc. of UAI, 2004.
-
(2004)
Proc. of UAI
-
-
Wellner, B.1
McCallum, A.2
Peng, F.3
Hay, M.4
-
54
-
-
0042504728
-
Retrieving webpages using content, links, urls and anchors
-
Westerveld, T., Kraaij, W., and Hiemstra, D. Retrieving Webpages using Content, Links, URLs and Anchors. In The Tenth Text REtrieval Conference (TREC2001), 2001.
-
(2001)
The Tenth Text REtrieval Conference (TREC2001)
-
-
Westerveld, T.1
Kraaij, W.2
Hiemstra, D.3
-
55
-
-
84974660374
-
Effective retrieval of structured documents
-
Wilkinson, R. Effective Retrieval of Structured Documents. In Proceedings of SIGIR, 1994.
-
(1994)
Proceedings of SIGIR
-
-
Wilkinson, R.1
-
56
-
-
18744372602
-
Link Fusion: A unified link analysis framework for multitype inter-related data objects
-
Xi, W., Zhang, B., Chen, Z., Lu, Y., Yan, S., and Ma, W.-Y. Link Fusion: A Unified Link Analysis Framework for Multitype Inter-related Data Objects. In Proc. of WWW, 2004.
-
(2004)
Proc. of WWW
-
-
Xi, W.1
Zhang, B.2
Chen, Z.3
Lu, Y.4
Yan, S.5
Ma, W.-Y.6
-
57
-
-
0032275565
-
Effective retrieval with distributed collections
-
Xu, J., and Callan, J. Effective retrieval with distributed collections. In Proceedings of SIGIR, 1998.
-
(1998)
Proceedings of SIGIR
-
-
Xu, J.1
Callan, J.2
-
58
-
-
84858649428
-
A hierarchical markov random field model for figure-ground segregation
-
September
-
Yu, S. X., Lee, T., and Kanade, T. A Hierarchical Markov Random Field Model for Figure-Ground Segregation. Third International Workshop on Energy Minimization Methods in Computer Vision and Pattern Recognition, September, 2001.
-
(2001)
Third International Workshop on Energy Minimization Methods in Computer Vision and Pattern Recognition
-
-
Yu, S.X.1
Lee, T.2
Kanade, T.3
-
59
-
-
33744821948
-
Web data extraction based on partial tree alignment
-
Zhai, Y., and Liu, B. Web Data Extraction Based on Partial Tree Alignment. In Proc. of WWW, 2005.
-
(2005)
Proc. of WWW
-
-
Zhai, Y.1
Liu, B.2
-
60
-
-
33744899132
-
Fully automatic wrapper generation for search engines
-
Zhao, H., Meng, W., Wu, Z., Raghavan, V., and Yu, C. Fully Automatic Wrapper Generation for Search Engines. In Proc. of WWW, 2005.
-
(2005)
Proc. of WWW
-
-
Zhao, H.1
Meng, W.2
Wu, Z.3
Raghavan, V.4
Yu, C.5
-
61
-
-
31844452562
-
2D conditional random fields for Web information extraction
-
Zhu, J., Nie, Z., Wen, J.-R., Zhang, B., and Ma, W.-Y. 2D Conditional Random Fields for Web Information Extraction. In Proc. of ICML, 2005.
-
(2005)
Proc. of ICML
-
-
Zhu, J.1
Nie, Z.2
Wen, J.-R.3
Zhang, B.4
Ma, W.-Y.5
-
62
-
-
33749623896
-
Simultaneous record detection and attribute labeling in web data extraction
-
Zhu, J., Nie, Z., Wen, J.-R., Zhang, B., and Ma, W.-Y. Simultaneous Record Detection and Attribute Labeling in Web Data Extraction. In Proc. of SIGKDD, 2006.
-
(2006)
Proc. of SIGKDD
-
-
Zhu, J.1
Nie, Z.2
Wen, J.-R.3
Zhang, B.4
Ma, W.-Y.5