2013, Pages 371-380

Bottom-k and priority sampling, set similarity and subset sums with minimal independence

Author keywords

Estimation; Independence; Sampling

Indexed keywords

CONCENTRATION BOUNDS; CONSTANT FACTORS; INDEPENDENCE; PRIORITY SAMPLINGS; PROBABILITY ERRORS; RANDOM HASHING; RELATIVE ERRORS; SET SIMILARITY

EID: 84879835309     PISSN: 07378017     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1145/2488608.2488655     Document Type: Conference Paper
Times cited: 27

References (36)
  • 2
    • 70350625168
    • Sketching algorithms for approximating rank correlations in collaborative filtering systems
    • Y. Bachrach, R. Herbrich, and E. Porat. Sketching algorithms for approximating rank correlations in collaborative filtering systems. In Proc. 16th SPIRE, pages 344-352, 2009.
    • (2009) Proc. 16th SPIRE , pp. 344-352
    • Bachrach, Y.1    Herbrich, R.2    Porat, E.3
  • 4
    • 77957061059
    • Sketching techniques for collaborative filtering
    • Y. Bachrach, E. Porat, and J. S. Rosenschein. Sketching techniques for collaborative filtering. In Proc. 21st IJCAI, pages 2016-2021, 2009.
    • (2009) Proc. 21st IJCAI , pp. 2016-2021
    • Bachrach, Y.1    Porat, E.2    Rosenschein, J.S.3
  • 7
    • 79956075292
    • Identifying and filtering near-duplicate documents
    • A. Z. Broder. Identifying and filtering near-duplicate documents. In Proc. 11th CPM, pages 1-10, 2000.
    • (2000) Proc. 11th CPM , pp. 1-10
    • Broder, A.Z.1
  • 11
    • 81255191687
    • Efficient stream sampling for variance-optimal estimation of subset sums
    • Announced at SODA'09
    • E. Cohen, N. Duffield, H. Kaplan, C. Lund, and M. Thorup. Efficient stream sampling for variance-optimal estimation of subset sums. SIAM Journal on Computing, 40(5):1402-1431, 2011. Announced at SODA'09.
    • (2011) SIAM Journal on Computing , vol.40 , Issue.5 , pp. 1402-1431
    • Cohen, E.1    Duffield, N.2    Kaplan, H.3    Lund, C.4    Thorup, M.5
  • 12
    • 36849001315
    • Summarizing data using bottom-k sketches
    • E. Cohen and H. Kaplan. Summarizing data using bottom-k sketches. In Proc. 26th PODC, pages 225-234, 2007.
    • (2007) Proc. 26th PODC , pp. 225-234
    • Cohen, E.1    Kaplan, H.2
  • 13
    • 84938057127
    • Estimating rarity and similarity over data stream windows
    • M. Datar and S. Muthukrishnan. Estimating rarity and similarity over data stream windows. In Proc. 10th ESA, pages 323-334, 2002.
    • (2002) Proc. 10th ESA , pp. 323-334
    • Datar, M.1    Muthukrishnan, S.2
  • 14
    • 0000467036
    • Universal hashing and k-wise independent random variables via integer arithmetic without primes
    • M. Dietzfelbinger. Universal hashing and k-wise independent random variables via integer arithmetic without primes. In Proc. 13th STACS, pages 569-580, 1996.
    • (1996) Proc. 13th STACS , pp. 569-580
    • Dietzfelbinger, M.1
  • 16
    • 18544383616
    • Learn more, sample less: Control of volume and variance in network measurements
    • N. Duffield, C. Lund, and M. Thorup. Learn more, sample less: control of volume and variance in network measurements. IEEE Transactions on Information Theory, 51(5):1756-1775, 2005.
    • (2005) IEEE Transactions on Information Theory , vol.51 , Issue.5 , pp. 1756-1775
    • Duffield, N.1    Lund, C.2    Thorup, M.3
  • 17
    • 37049036831
    • Priority sampling for estimation of arbitrary subset sums
    • Article 32, Announced at SIGMETRICS'04
    • N. Duffield, C. Lund, and M. Thorup. Priority sampling for estimation of arbitrary subset sums. J. ACM, 54(6):Article 32, 2007. Announced at SIGMETRICS'04.
    • (2007) J. ACM , vol.54 , Issue.6
    • Duffield, N.1    Lund, C.2    Thorup, M.3
  • 20
    • 33750296887
    • Finding near-duplicate web pages: A large-scale evaluation of algorithms
    • M. R. Henzinger. Finding near-duplicate web pages: a large-scale evaluation of algorithms. In Proc. 29th SIGIR, pages 284-291, 2006.
    • (2006) Proc. 29th SIGIR , pp. 284-291
    • Henzinger, M.R.1
  • 21
    • 35348911985
    • Detecting near-duplicates for web crawling
    • G. S. Manku, A. Jain, and A. D. Sarma. Detecting near-duplicates for web crawling. In Proc. 16th WWW, pages 141-150, 2007.
    • (2007) Proc. 16th WWW , pp. 141-150
    • Manku, G.S.1    Jain, A.2    Sarma, A.D.3
  • 25
    • 77955323479
    • Linear probing with constant independence
    • See also STOC'07
    • A. Pagh, R. Pagh, and M. Ružić. Linear probing with constant independence. SIAM Journal on Computing, 39(3):1107-1120, 2009. See also STOC'07.
    • (2009) SIAM Journal on Computing , vol.39 , Issue.3 , pp. 1107-1120
    • Pagh, A.1    Pagh, R.2    Ružić, M.3
  • 26
    • 77955322325
    • On the k-independence required by linear probing and minwise independence
    • Part I, LNCS 6198
    • M. Pǎtraşcu and M. Thorup. On the k-independence required by linear probing and minwise independence. In Proc. 36th ICALP, Part I, LNCS 6198, pages 715-726, 2010.
    • (2010) Proc. 36th ICALP , pp. 715-726
    • Pǎtraşcu, M.1    Thorup, M.2
  • 29
    • 1142267351
    • Winnowing: Local algorithms for document fingerprinting
    • S. Schleimer, D. S. Wilkerson, and A. Aiken. Winnowing: Local algorithms for document fingerprinting. In Proc. SIGMOD, pages 76-85, 2003.
    • (2003) Proc. SIGMOD , pp. 76-85
    • Schleimer, S.1    Wilkerson, D.S.2    Aiken, A.3
  • 30
    • 0001595705
    • Chernoff-Hoeffding bounds for applications with limited independence
    • See also SODA'93
    • J. P. Schmidt, A. Siegel, and A. Srinivasan. Chernoff-Hoeffding bounds for applications with limited independence. SIAM Journal on Discrete Mathematics, 8(2):223-250, 1995. See also SODA'93.
    • (1995) SIAM Journal on Discrete Mathematics , vol.8 , Issue.2 , pp. 223-250
    • Schmidt, J.P.1    Siegel, A.2    Srinivasan, A.3
  • 31
    • 33748098958
    • The DLT priority sampling is essentially optimal
    • M. Szegedy. The DLT priority sampling is essentially optimal. In Proc. 38th STOC, pages 150-158, 2006.
    • (2006) Proc. 38th STOC , pp. 150-158
    • Szegedy, M.1
  • 32
    • 33750300127
    • Confidence intervals for priority sampling
    • M. Thorup. Confidence intervals for priority sampling. In Proc. SIGMETRICS, pages 252-263, 2006.
    • (2006) Proc. SIGMETRICS , pp. 252-263
    • Thorup, M.1
  • 33
    • 84879835253
    • Bottom-k and priority sampling, set similarity and subset sums with minimal independence
    • M. Thorup. Bottom-k and priority sampling, set similarity and subset sums with minimal independence. CoRR, 2013.
    • (2013) CoRR
    • Thorup, M.1
  • 34
    • 84861633258
    • Tabulation-based 5-independent hashing with applications to linear probing and second moment estimation
    • Announced at SODA'04 and ALENEX'10
    • M. Thorup and Y. Zhang. Tabulation-based 5-independent hashing with applications to linear probing and second moment estimation. SIAM Journal on Computing, 41(2):293-331, 2012. Announced at SODA'04 and ALENEX'10.
    • (2012) SIAM Journal on Computing , vol.41 , Issue.2 , pp. 293-331
    • Thorup, M.1    Zhang, Y.2
  • 35
    • 0019572642
    • New classes and applications of hash functions
    • See also FOCS'79
    • M. N. Wegman and L. Carter. New classes and applications of hash functions. Journal of Computer and System Sciences, 22(3):265-279, 1981. See also FOCS'79.
    • (1981) Journal of Computer and System Sciences , vol.22 , Issue.3 , pp. 265-279
    • Wegman, M.N.1    Carter, L.2
  • 36
    • 33750311279
    • Near-duplicate detection by instance-level constrained clustering
    • H. Yang and J. P. Callan. Near-duplicate detection by instance-level constrained clustering. In Proc. 29th SIGIR, pages 421-428, 2006.
    • (2006) Proc. 29th SIGIR , pp. 421-428
    • Yang, H.1    Callan, J.P.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.