SCOPUS Information Retrieval Platform - View Article

Information Retrieval

Volume 13, Issue 1, 2010, Pages 70-95

Estimating deep web data source size by capture-recapture method

(2) Lu, Jianguo a,b; Li, Dingding a

a UNIVERSITY OF WINDSOR (Canada)

b NANJING UNIVERSITY (China)

Author keywords

Capture recapture; Deep web; Estimators

Indexed keywords

EID: 76349085418 PISSN: 13864564 EISSN: 15737659 Source Type: Journal
DOI: 10.1007/s10791-009-9107-y Document Type: Article

Times cited: (25)

References (37)

1
- 84884044458
- Princeton University Press
- Amstrup, S. C., McDonald, T. L., & Manly, B. F. J. (2005). Handbook of capture-recapture analysis. Princeton University Press.
- (2005) Handbook of Capture-Recapture Analysis
- Amstrup, S.C.¹ McDonald, T.L.² Manly, B.F.J.³

2
- 34547475354
- Siphoning hidden-web data through keyword-based interfaces
- Barbosa, L., & Freire, J. (2004). Siphoning hidden-web data through keyword-based interfaces. In Proceedings of SBBD, 2004.
- (2004) Proceedings of SBBD, 2004
- Barbosa, L.¹ Freire, J.²

3
- 34250665891
- Random sampling from a search engine's index
- Bar-Yossef, Z., & Gurevich, M. (2006). Random sampling from a search engine's index. In Proceedings of WWW, 2006, pp. 367-376.
- (2006) Proceedings of WWW, 2006 , pp. 367-376
- Bar-Yossef, Z.¹ Gurevich, M.²

4
- 35348840330
- Efficient search engine measurements
- Bar-Yossef, Z., & Gurevich, M. (2007). Efficient search engine measurements. In Proceedings of WWW, 2007, pp. 401-410.
- (2007) Proceedings of WWW, 2007 , pp. 401-410
- Bar-Yossef, Z.¹ Gurevich, M.²

5
- 0003259187
- The deep web: Surfacing hidden value
- Bergman, M. K. (2001). The deep web: Surfacing hidden value. The Journal of Electronic Publishing, 7(1).
- (2001) The Journal of Electronic Publishing , vol.7 , Issue.1
- Bergman, M.K.¹

6
- 20444387298
- A technique for measuring the relative size and overlap of public Web search engines
- Bharat, K., & Broder, A. (1998). A technique for measuring the relative size and overlap of public Web search engines. In Proceedings of WWW, 1998, pp. 379-388.
- (1998) Proceedings of WWW, 1998 , pp. 379-388
- Bharat, K.¹ Broder, A.²

7
- 84949490837
- Can we correctly estimate the total number of pages in Google for a specific language?
- Bolshakov, I. A., & Galicia-Haro, S. N. (2003). Can we correctly estimate the total number of pages in Google for a specific language? CICLing 2003, pp. 415-419.
- (2003) CICLing , vol.2003 , pp. 415-419
- Bolshakov, I.A.¹ Galicia-Haro, S.N.²

8
- 34547629212
- Estimating corpus size via queries
- Broder, A., Fontoura, M., Josifovski, V., Kumar, R., Motwani, R., Nabar, S., et al. (2006). Estimating corpus size via queries. In CIKM'06, pp. 594-603.
- (2006) In CIKM'06 , pp. 594-603
- Broder, A.¹ Fontoura, M.² Josifovski, V.³ Kumar, R.⁴ Motwani, R.⁵ Nabar, S.⁶ Et al.⁷

9
- 0002104204
- Query-based sampling of text databases
- Callan, J., & Connell, M. (2001). Query-based sampling of text databases. ACM Transactions on Information Systems, 19(2), 97-130.
- (2001) ACM Transactions On Information Systems , vol.19 , Issue.2 , pp. 97-130
- Callan, J.¹ Connell, M.²

10
- 2442546444
- Probe, cluster, and discover: Focused extraction of QA-pagelets from the deep web
- Caverlee, J., Liu, L., & Buttler, D. (2004). Probe, cluster, and discover: Focused extraction of QA-pagelets from the deep web. In Proceedings of ICDE 2004, pp. 103-114.
- (2004) Proceedings of ICDE 2004 , pp. 103-114
- Caverlee, J.¹ Liu, L.² Buttler, D.³

11
- 84945495487
- Estimating the number of classes via sample coverage
- Chao, A., & Lee, S.-M. (1992). Estimating the number of classes via sample coverage. Journal of American Statistical Association, 87, 210-217.
- (1992) Journal of American Statistical Association , vol.87 , pp. 210-217
- Chao, A.¹ Lee, S.-M.²

12
- 84944327150
- RoadRunner: Towards automatic data extraction from large web sites
- Crescenzi, V., Mecca, G., & Merialdo, P. (2001). RoadRunner: Towards automatic data extraction from large web sites. In Proceedings of VLDB 2001, pp. 109-118.
- (2001) Proceedings of VLDB 2001 , pp. 109-118
- Crescenzi, V.¹ Mecca, G.² Merialdo, P.³

13
- 0000774305
- The multiple-recapture census: I. Estimation of a closed population
- Darroch, J. N. (1958). The multiple-recapture census: I. Estimation of a closed population. Biometrika, 45(3/4), 343-359.
- (1958) Biometrika , vol.45 , Issue.3-4 , pp. 343-359
- Darroch, J.N.¹

14
- 12244289051
- How large is the World Wide Web?
- Springer
- Dobra, A., & Fienberg, S. (2004). How large is the World Wide Web? Web Dynamics, Springer, pp. 23-44.
- (2004) Web Dynamics , pp. 23-44
- Dobra, A.¹ Fienberg, S.²

15
- 77953071782
- The indexable web is more than 11.5 billion pages
- Gulli, A., & Signorini, A. (2005). The indexable web is more than 11.5 billion pages. In Proceedings of WWW 2005, pp. 902-903.
- (2005) Proceedings of WWW, 2005 , pp. 902-903
- Gulli, A.¹ Signorini, A.²

16
- 0002812751
- Sampling-based estimation of the number of distinct values of an attribute
- Haas, P. J., Naughton, J. F., Seshadri, S., & Stokes, L. (1995). Sampling-based estimation of the number of distinct values of an attribute. In Proceedings of VLDB 1995, pp. 311-322.
- (1995) Proceedings of VLDB 1995 , pp. 311-322
- Haas, P.J.¹ Naughton, J.F.² Seshadri, S.³ Stokes, L.⁴

17
- 25144453630
- Manning Publications
- Hatcher, E., & Gospodnetic, O. (2004). Lucene in action. Manning Publications.
- (2004) Lucene In Action
- Hatcher, E.¹ Gospodnetic, O.²

18
- 0018442360
- A unified approach to limit theorems for urn models
- Holst, L. (1979). A unified approach to limit theorems for urn models. Journal of Applied Probability, 16(1), 154-162.
- (1979) Journal of Applied Probability , vol.16 , Issue.1 , pp. 154-162
- Holst, L.¹

19
- 0034818750
- Probe, count, and classify: Categorizing hidden web databases
- Ipeirotis, P. G., Gravano, L., & Sahami, M. (2001). Probe, count, and classify: Categorizing hidden web databases. In Proceedings of SIGMOD'01.
- (2001) Proceedings of SIGMOD'01
- Ipeirotis, P.G.¹ Gravano, L.² Sahami, M.³

20
- 0002781191
- Accurately and reliably extracting data from the web: A machine learning approach
- Knoblock, C. A., Lerman, K., Minton, S., & Muslea, I. (2000). Accurately and reliably extracting data from the web: A machine learning approach. IEEE Data Engineering Bulletin, 23(4), 33-41.
- (2000) IEEE Data Engineering Bulletin , vol.23 , Issue.4 , pp. 33-41
- Knoblock, C.A.¹ Lerman, K.² Minton, S.³ Muslea, I.⁴

21
- 85142688646
- Newsweeder: Learning to filter netnews
- Lang, K. (1995). Newsweeder: Learning to filter netnews. In Twelfth international conference on machine learning, pp. 331-339.
- (1995) Twelfth International Conference On Machine Learning , pp. 331-339
- Lang, K.¹

22
- 62949086234
- Liddle, S. W., Embley, D. W., Scott, D. T., & Yau, S. H. (2002). Extracting data behind web forms. In Advanced conceptual modeling techniques, pp. 402-413.
- (2002) Advanced Conceptual Modeling Techniques , pp. 402-413
- Liddle, S.W.¹ Embley, D.W.² Scott, D.T.³ Yau, S.H.⁴

23
- 0037818401
- Discovering the representative of a search engine
- Liu, K., Yu, C., & Meng, W. (2002). Discovering the representative of a search engine. In Proceedings of CIKM'02, pp. 652-654.
- (2002) Proceedings of CIKM'02 , pp. 652-654
- Liu, K.¹ Yu, C.² Meng, W.³

24
- 70349243672
- Efficient estimation of the size of text deep web data source
- Lu, J. (2008). Efficient estimation of the size of text deep web data source. In Proceedings of CIKM 2008, pp. 1485-1486.
- (2008) Proceedings of CIKM 2008 , pp. 1485-1486
- Lu, J.¹

25
- 62949239222
- An approach to deep web crawling by sampling
- Lu, J., Wang, Y., Liang, J., Chen, J., & Liu, J. (2008). An approach to deep web crawling by sampling. In Proceedings of Web Intelligence, pp. 718-724.
- (2008) Proceedings of Web Intelligence , pp. 718-724
- Lu, J.¹ Wang, Y.² Liang, J.³ Chen, J.⁴ Liu, J.⁵

26
- 34547614446
- Efficient, automatic web resource harvesting
- Nelson, M. L., Smith, J. A., & del Campo, I. G. (2006). Efficient, automatic web resource harvesting. In Proceedings of WIDM'06, pp. 43-50.
- (2006) Proceedings of WIDM'06 , pp. 43-50
- Nelson, M.L.¹ Smith, J.A.² del Campo, I.G.³

27
- 27544458897
- Downloading textual hidden web content through keyword queries
- Ntoulas, A., Zerfos, P., & Cho, J. (2005). Downloading textual hidden web content through keyword queries. In Proceedings of JCDL, 2005, pp. 100-109.
- (2005) Proceedings of JCDL, 2005 , pp. 100-109
- Ntoulas, A.¹ Zerfos, P.² Cho, J.³

28
- 0025659654
- Statistical inference for capture-recapture experiments. The Wildlife Society
- Pollock, K. H., Nichols, J. D., Brownie, C., & Hines, J. E. (1990). Statistical inference for capture-recapture experiments. The Wildlife Society. Wildlife Monographs, 107, 3-97.
- (1990) Wildlife Monographs , vol.107 , pp. 3-97
- Pollock, K.H.¹ Nichols, J.D.² Brownie, C.³ Hines, J.E.⁴

29
- 84944325093
- Crawling the hidden web
- Raghavan, S., & Garcia-Molina, H. (2001). Crawling the hidden web. Proceedings of VLDB 2001.
- (2001) Proceedings of VLDB 2001
- Raghavan, S.¹ Garcia-Molina, H.²

30
- 0001281671
- The estimation of fish populations in lakes or ponds
- Schumacher, F. X., & Eschmeyer, R. W. (1943). The estimation of fish populations in lakes or ponds. Journal of the Tennessee Academy of Science, 18, 228-249.
- (1943) Journal of the Tennessee Academy of Science , vol.18 , pp. 228-249
- Schumacher, F.X.¹ Eschmeyer, R.W.²

31
- 10444223864
- DEQUE: Querying the deep web
- Shestakov, D., Bhowmick, S. S., & Lim, E.-P. (2005). DEQUE: Querying the deep web. Journal of Data & Knowledge Engineering, 52(3), 273-311.
- (2005) Journal of Data & Knowledge Engineering , vol.52 , Issue.3 , pp. 273-311
- Shestakov, D.¹ Bhowmick, S.S.² Lim, E.-P.³

32
- 33750285514
- Capturing collection size for distributed non-cooperative retrieval
- Shokouhi, M., Zobel, J., Scholer, F., & Tahaghoghi, S. M. M. (2006). Capturing collection size for distributed non-cooperative retrieval. In Proceedings of SIGIR'06, pp. 316-323.
- (2006) Proceedings of SIGIR'06 , pp. 316-323
- Shokouhi, M.¹ Zobel, J.² Scholer, F.³ Tahaghoghi, S.M.M.⁴

33
- 1542347745
- Relevant document distribution estimation method for resource selection
- Si, L., & Callan, J. (2003). Relevant document distribution estimation method for resource selection. In Proceedings of SIGIR'03.
- (2003) Proceedings of SIGIR'03
- Si, L.¹ Callan, J.²

34
- 36448993278
- Evaluating sampling methods for uncooperative collections
- Thomas, P., & Hawking, D. (2007). Evaluating sampling methods for uncooperative collections. In Proceedings of SIGIR, 2007.
- (2007) Proceedings of SIGIR, 2007
- Thomas, P.¹ Hawking, D.²

35
- 35248897586
- Experiments with document archive size detection
- Wu, S., Gibb, F., & Crestani, F. (2003). Experiments with document archive size detection. In 25th European conference on IR research, pp. 294-304.
- (2003) 25th European Conference On IR Research , pp. 294-304
- Wu, S.¹ Gibb, F.² Crestani, F.³

36
- 33749617417
- Query selection techniques for efficient crawling of structured web sources
- Wu, P., Wen, J.-R., Liu, H., & Ma, W.-Y. (2006). Query selection techniques for efficient crawling of structured web sources. In Proceedings of ICDE, 2006, pp. 47-56.
- (2006) Proceedings of ICDE, 2006 , pp. 47-56
- Wu, P.¹ Wen, J.-R.² Liu, H.³ Ma, W.-Y.⁴

37
- 36448957566
- Estimating collection size with logistic regression
- Xu, J., Wu, S., & Li, X. (2007). Estimating collection size with logistic regression. In Proceedings of SIGIR'07, pp. 789-790.
- (2007) Proceedings of SIGIR'07 , pp. 789-790
- Xu, J.¹ Wu, S.² Li, X.³

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.