-
2
-
-
34547475354
-
Siphoning hidden-web data through keyword-based interfaces
-
Barbosa, L., & Freire, J. (2004). Siphoning hidden-web data through keyword-based interfaces. In Proceedings of SBBD, 2004.
-
(2004)
Proceedings of SBBD, 2004
-
-
Barbosa, L.1
Freire, J.2
-
3
-
-
34250665891
-
Random sampling from a search engine's index
-
Bar-Yossef, Z., & Gurevich, M. (2006). Random sampling from a search engine's index. In Proceedings of WWW, 2006, pp. 367-376.
-
(2006)
Proceedings of WWW, 2006
, pp. 367-376
-
-
Bar-Yossef, Z.1
Gurevich, M.2
-
6
-
-
20444387298
-
A technique for measuring the relative size and overlap of public Web search engines
-
Bharat, K., & Broder, A. (1998). A technique for measuring the relative size and overlap of public Web search engines. In Proceedings of WWW, 1998, pp. 379-388.
-
(1998)
Proceedings of WWW, 1998
, pp. 379-388
-
-
Bharat, K.1
Broder, A.2
-
7
-
-
84949490837
-
Can we correctly estimate the total number of pages in Google for a specific language?
-
Bolshakov, I. A., & Galicia-Haro, S. N. (2003). Can we correctly estimate the total number of pages in Google for a specific language? CICLing 2003, pp. 415-419.
-
(2003)
CICLing
, vol.2003
, pp. 415-419
-
-
Bolshakov, I.A.1
Galicia-Haro, S.N.2
-
8
-
-
34547629212
-
Estimating corpus size via queries
-
Broder, A., Fontoura, M., Josifovski, V., Kumar, R., Motwani, R., Nabar, S., et al. (2006). Estimating corpus size via queries. In CIKM'06, pp. 594-603.
-
(2006)
CIKM'06
, pp. 594-603
-
-
Broder, A.1
Fontoura, M.2
Josifovski, V.3
Kumar, R.4
Motwani, R.5
Nabar, S.6
Et al.7
-
9
-
-
0002104204
-
Query-based sampling of text databases
-
Callan, J., & Connell, M. (2001). Query-based sampling of text databases. ACM Transactions on Information Systems, 19(2), 97-130.
-
(2001)
ACM Transactions On Information Systems
, vol.19
, Issue.2
, pp. 97-130
-
-
Callan, J.1
Connell, M.2
-
10
-
-
2442546444
-
Probe, cluster, and discover: Focused extraction of QA-pagelets from the deep web
-
Caverlee, J., Liu, L., & Buttler, D. (2004). Probe, cluster, and discover: Focused extraction of QA-pagelets from the deep web. In Proceedings of ICDE 2004, pp. 103-114.
-
(2004)
Proceedings of ICDE 2004
, pp. 103-114
-
-
Caverlee, J.1
Liu, L.2
Buttler, D.3
-
12
-
-
84944327150
-
RoadRunner: Towards automatic data extraction from large web sites
-
Crescenzi, V., Mecca, G., & Merialdo, P. (2001). RoadRunner: Towards automatic data extraction from large web sites. In Proceedings of VLDB 2001, pp. 109-118.
-
(2001)
Proceedings of VLDB 2001
, pp. 109-118
-
-
Crescenzi, V.1
Mecca, G.2
Merialdo, P.3
-
13
-
-
0000774305
-
The multiple-recapture census: I. Estimation of a closed population
-
Darroch, J. N. (1958). The multiple-recapture census: I. Estimation of a closed population. Biometrika, 45(3/4), 343-359.
-
(1958)
Biometrika
, vol.45
, Issue.3-4
, pp. 343-359
-
-
Darroch, J.N.1
-
14
-
-
12244289051
-
How large is the World Wide Web?
-
Springer
-
Dobra, A., & Fienberg, S. (2004). How large is the World Wide Web? Web Dynamics, Springer, pp. 23-44.
-
(2004)
Web Dynamics
, pp. 23-44
-
-
Dobra, A.1
Fienberg, S.2
-
15
-
-
77953071782
-
The indexable web is more than 11.5 billion pages
-
Gulli, A., & Signorini, A. (2005). The indexable web is more than 11.5 billion pages. In Proceedings of WWW 2005, pp. 902-903.
-
(2005)
Proceedings of WWW, 2005
, pp. 902-903
-
-
Gulli, A.1
Signorini, A.2
-
16
-
-
0002812751
-
Sampling-based estimation of the number of distinct values of an attribute
-
Haas, P. J., Naughton, J. F., Seshadri, S., & Stokes, L. (1995). Sampling-based estimation of the number of distinct values of an attribute. In Proceedings of VLDB 1995, pp. 311-322.
-
(1995)
Proceedings of VLDB 1995
, pp. 311-322
-
-
Haas, P.J.1
Naughton, J.F.2
Seshadri, S.3
Stokes, L.4
-
18
-
-
0018442360
-
A unified approach to limit theorems for urn models
-
Holst, L. (1979). A unified approach to limit theorems for urn models. Journal of Applied Probability, 16(1), 154-162.
-
(1979)
Journal of Applied Probability
, vol.16
, Issue.1
, pp. 154-162
-
-
Holst, L.1
-
19
-
-
0034818750
-
Probe, count, and classify: Categorizing hidden web databases
-
Ipeirotis, P. G., Gravano, L., & Sahami, M. (2001). Probe, count, and classify: Categorizing hidden web databases. In Proceedings of SIGMOD'01.
-
(2001)
Proceedings of SIGMOD'01
-
-
Ipeirotis, P.G.1
Gravano, L.2
Sahami, M.3
-
20
-
-
0002781191
-
Accurately and reliably extracting data from the web: A machine learning approach
-
Knoblock, C. A., Lerman, K., Minton, S., & Muslea, I. (2000). Accurately and reliably extracting data from the web: A machine learning approach. IEEE Data Engineering Bulletin, 23(4), 33-41.
-
(2000)
IEEE Data Engineering Bulletin
, vol.23
, Issue.4
, pp. 33-41
-
-
Knoblock, C.A.1
Lerman, K.2
Minton, S.3
Muslea, I.4
-
22
-
-
62949086234
-
Extracting data behind web forms
-
Liddle, S. W., Embley, D. W., Scott, D. T., & Yau, S. H. (2002). Extracting data behind web forms. In Advanced Conceptual Modeling Techniques, pp. 402-413.
-
(2002)
Advanced Conceptual Modeling Techniques
, pp. 402-413
-
-
Liddle, S.W.1
Embley, D.W.2
Scott, D.T.3
Yau, S.H.4
-
23
-
-
0037818401
-
Discovering the representative of a search engine
-
Liu, K., Yu, C., & Meng, W. (2002). Discovering the representative of a search engine. In Proceedings of CIKM'02, pp. 652-654.
-
(2002)
Proceedings of CIKM'02
, pp. 652-654
-
-
Liu, K.1
Yu, C.2
Meng, W.3
-
24
-
-
70349243672
-
Efficient estimation of the size of text deep web data source
-
Lu, J. (2008). Efficient estimation of the size of text deep web data source. In Proceedings of CIKM 2008, pp. 1485-1486.
-
(2008)
Proceedings of CIKM 2008
, pp. 1485-1486
-
-
Lu, J.1
-
25
-
-
62949239222
-
An approach to deep web crawling by sampling
-
Lu, J., Wang, Y., Liang, J., Chen, J., & Liu, J. (2008). An approach to deep web crawling by sampling. In Proceedings of Web Intelligence, pp. 718-724.
-
(2008)
Proceedings of Web Intelligence
, pp. 718-724
-
-
Lu, J.1
Wang, Y.2
Liang, J.3
Chen, J.4
Liu, J.5
-
26
-
-
34547614446
-
Efficient, automatic web resource harvesting
-
Nelson, M. L., Smith, J. A., & del Campo, I. G. (2006). Efficient, automatic web resource harvesting. In Proceedings of WIDM'06, pp. 43-50.
-
(2006)
Proceedings of WIDM'06
, pp. 43-50
-
-
Nelson, M.L.1
Smith, J.A.2
del Campo, I.G.3
-
27
-
-
27544458897
-
Downloading textual hidden web content through keyword queries
-
Ntoulas, A., Zerfos, P., & Cho, J. (2005). Downloading textual hidden web content through keyword queries. In Proceedings of JCDL, 2005, pp. 100-109.
-
(2005)
Proceedings of JCDL, 2005
, pp. 100-109
-
-
Ntoulas, A.1
Zerfos, P.2
Cho, J.3
-
28
-
-
0025659654
-
Statistical inference for capture-recapture experiments. The Wildlife Society
-
Pollock, K. H., Nichols, J. D., Brownie, C., & Hines, J. E. (1990). Statistical inference for capture-recapture experiments. The Wildlife Society. Wildlife Monographs, 107, 3-97.
-
(1990)
Wildlife Monographs
, vol.107
, pp. 3-97
-
-
Pollock, K.H.1
Nichols, J.D.2
Brownie, C.3
Hines, J.E.4
-
31
-
-
10444223864
-
DEQUE: Querying the deep web
-
Shestakov, D., Bhowmick, S. S., & Lim, E.-P. (2005). DEQUE: Querying the deep web. Journal of Data & Knowledge Engineering, 52(3), 273-311.
-
(2005)
Journal of Data & Knowledge Engineering
, vol.52
, Issue.3
, pp. 273-311
-
-
Shestakov, D.1
Bhowmick, S.S.2
Lim, E.-P.3
-
32
-
-
33750285514
-
Capturing collection size for distributed non-cooperative retrieval
-
Shokouhi, M., Zobel, J., Scholer, F., & Tahaghoghi, S. M. M. (2006). Capturing collection size for distributed non-cooperative retrieval. In Proceedings of SIGIR'06, pp. 316-323.
-
(2006)
Proceedings of SIGIR'06
, pp. 316-323
-
-
Shokouhi, M.1
Zobel, J.2
Scholer, F.3
Tahaghoghi, S.M.M.4
-
33
-
-
1542347745
-
Relevant document distribution estimation method for resource selection
-
Si, L., & Callan, J. (2003). Relevant document distribution estimation method for resource selection. In Proceedings of SIGIR'03.
-
(2003)
Proceedings of SIGIR'03
-
-
Si, L.1
Callan, J.2
-
34
-
-
36448993278
-
Evaluating sampling methods for uncooperative collections
-
Thomas, P., & Hawking, D. (2007). Evaluating sampling methods for uncooperative collections. In Proceedings of SIGIR, 2007.
-
(2007)
Proceedings of SIGIR, 2007
-
-
Thomas, P.1
Hawking, D.2
-
35
-
-
35248897586
-
Experiments with document archive size detection
-
25th European conference on IR research
-
Wu, S., Gibb, F., & Crestani, F. (2003). Experiments with document archive size detection. 25th European conference on IR research, pp. 294-304.
-
(2003)
25th European Conference on IR Research
, pp. 294-304
-
-
Wu, S.1
Gibb, F.2
Crestani, F.3
-
36
-
-
33749617417
-
Query selection techniques for efficient crawling of structured web sources
-
Wu, P., Wen, J.-R., Liu, H., & Ma, W.-Y. (2006). Query selection techniques for efficient crawling of structured web sources. In Proceedings of ICDE, 2006, pp. 47-56.
-
(2006)
Proceedings of ICDE, 2006
, pp. 47-56
-
-
Wu, P.1
Wen, J.-R.2
Liu, H.3
Ma, W.-Y.4
-
37
-
-
36448957566
-
Estimating collection size with logistic regression
-
Xu, J., Wu, S., & Li, X. (2007). Estimating collection size with logistic regression. In Proceedings of SIGIR'07, pp. 789-790.
-
(2007)
Proceedings of SIGIR'07
, pp. 789-790
-
-
Xu, J.1
Wu, S.2
Li, X.3