SCOPUS Information Retrieval Platform

WWW'09 - Proceedings of the 18th International World Wide Web Conference

Volume , Issue , 2009, Pages 81-90

Efficient overlap and content reuse detection in blogs and online news articles.

(3) Kim, Jong Wook a; Candan, K. Selçuk a; Tatemura, Junichi b

a Arizona State University (United States)

b NEC LABORATORIES AMERICA (United States)

Author keywords

Reuse detection; Weblogs

Indexed keywords

BLOGOSPHERES; CONTENT RE-USE; DETECTION RATES; DYNAMIC NATURE; INCREMENTAL PROCESSING; INFORMATION SOURCES; MEDIA OUTLETS; MULTIPLE ORDERS; ONLINE NEWS; PROCESSING TIME; WEBLOGS;

INFORMATION DISSEMINATION; WORLD WIDE WEB;

BLOGS;

EID: 77955914175 PISSN: None EISSN: None Source Type: Conference Proceeding
DOI: 10.1145/1526709.1526721 Document Type: Conference Paper

Times cited : (30)

References (40)

1
- 84865633903
- David Sifry's Blog. http://www.sifry.com/alerts/.

2
- 84865633905
- Google Blog Search. http://blogsearch.google.com/blogsearch.

3
- 84865646867
- Google News. http://news.google.com.

4
- 84865660488
- Google Book Search. http://books.google.com/.

5
- 84865660484
- Yahoo News. http://news.yahoo.com.

6
- 0003602325
- The MD5 Message-Digest Algorithm. http://tools.ietf.org/html/rfc1321.
- The MD5 Message-Digest Algorithm

7
- 84865647520
- WWW'06 Workshop on the Weblogging Ecosystem: Aggregation, Analysis and Dynamics. 2006.
- (2006) WWW'06 Workshop on the Weblogging Ecosystem: Aggregation, Analysis and Dynamics

8
- 33748871742
- Tracking information epidemics in blogspace
- E. Adar and L.A. Adamic. Tracking Information Epidemics in Blogspace. In Proceedings of the 2005 IEEE/WIC/ACM International Conference on Web Intelligence, 2005.
- (2005) Proceedings of the 2005 IEEE/WIC/ACM International Conference on Web Intelligence
- Adar, E.¹ Adamic, L.A.²

9
- 85104914015
- Efficient exact set-similarity joins
- A. Arasu, V. Ganti, and R. Kaushik. Efficient exact set-similarity joins. In VLDB, 2006.
- (2006) VLDB
- Arasu, A.¹ Ganti, V.² Kaushik, R.³

10
- 35348849154
- Scaling up all Pairs similarity search
- R.J. Bayardo, Y. Ma, and R. Srikant. Scaling Up All Pairs Similarity Search. In WWW, 2007.
- (2007) WWW
- Bayardo, R.J.¹ Ma, Y.² Srikant, R.³

11
- 0037870443
- The X-tree: An index structure for high-dimensional data
- S. Berchtold, D.A. Keim, and H. Kriegel. The X-tree: An Index Structure for High-Dimensional Data. In VLDB, 1996.
- (1996) VLDB
- Berchtold, S.¹ Keim, D.A.² Kriegel, H.³

12
- 33646126481
- A scalable system for identifying co-derivative documents
- Y. Bernstein and J. Zobel. A Scalable System for Identifying Co-derivative Documents. In Proceedings of String Processing and Information Retrieval Symp, 2004.
- (2004) Proceedings of String Processing and Information Retrieval Symp
- Bernstein, Y.¹ Zobel, J.²

13
- 1842431844
- Perseus Books Group
- R. Blood. The Weblog Handbook: Practical Advice on Creating and Maintaining Your Blog. Perseus Books Group, 2002.
- (2002) The Weblog Handbook: Practical Advice on Creating and Maintaining Your Blog
- Blood, R.¹

14
- 0031346696
- On the resemblance and containment of documents
- A.Z. Broder. On the resemblance and containment of documents. In Proceedings of Compression and Complexity of Sequences, 1997.
- (1997) Proceedings of Compression and Complexity of Sequences
- Broder, A.Z.¹

15
- 84976810280
- Copy detection mechanisms for digital documents
- S. Brin, J. Davis, and H. Garcia-Molina. Copy detection mechanisms for digital documents. In SIGMOD, 1995.
- (1995) SIGMOD
- Brin, S.¹ Davis, J.² Garcia-Molina, H.³

16
- 0032664793
- The hybrid tree: An index structure for high dimensional feature spaces
- K. Chakrabarti, and S. Mehrotra. The Hybrid Tree: An Index Structure for High Dimensional Feature Spaces. In ICDE, 1999.
- (1999) ICDE
- Chakrabarti, K.¹ Mehrotra, S.²

17
- 33749597967
- A primitive operator for similarity joins in data cleaning
- S. Chaudhuri, V. Ganti, and R. Kaushik. A Primitive Operator for Similarity Joins in Data Cleaning. In ICDE, 2006.
- (2006) ICDE
- Chaudhuri, S.¹ Ganti, V.² Kaushik, R.³

18
- 3042606353
- Shared information and program plagiarism detection
- X. Chen, B. Francia, M. Li, and B. McKinnon. Shared Information and Program Plagiarism Detection. IEEE Transactions on Information Theory, 50 (7), 1545-1551, 2004.
- (2004) IEEE Transactions on Information Theory , vol.50 , Issue.7 , pp. 1545-1551
- Chen, X.¹ Francia, B.² Li, M.³ McKinnon, B.⁴

19
- 36849049806
- Structural and temporal analysis of the blogosphere through community factorization
- Y. Chi, S. Zhu, X. Song, J. Tatemura, and B.L. Tseng. Structural and temporal analysis of the blogosphere through community factorization. In SIGKDD, 2007.
- (2007) SIGKDD
- Chi, Y.¹ Zhu, S.² Song, X.³ Tatemura, J.⁴ Tseng, B.L.⁵

20
- 17744364120
- Clustering by compression
- R. Cilibrasi, and P. Vitanyi. Clustering by compression. IEEE Transactions on Information Theory, 51(4), 1523-1545, 2005.
- (2005) IEEE Transactions on Information Theory , vol.51 , Issue.4 , pp. 1523-1545
- Cilibrasi, R.¹ Vitanyi, P.²

21
- 0013206133
- Collection statistics for fast duplicate document detection
- A. Chowdhury, O. Frieder, D. Grossman, M.C. McCabe. Collection statistics for fast duplicate document detection. ACM TOIS, v.20 n.2, p.171-191, 2002.
- (2002) ACM TOIS , vol.20 , Issue.2 , pp. 171-191
- Chowdhury, A.¹ Frieder, O.² Grossman, D.³ McCabe, M.C.⁴

22
- 15044355327
- Similarity search in high dimensions via hashing
- A. Gionis, P. Indyk, and R. Motwani. Similarity Search in High Dimensions via Hashing. In VLDB, 1999.
- (1999) VLDB
- Gionis, A.¹ Indyk, P.² Motwani, R.³

23
- 84944318804
- Approximate string joins in a database (almost) for free
- L. Gravano, P.G. Ipeirotis, H.V. Jagadish, N. Koudas, S. Muthukrishnan, and D. Srivastava. Approximate String Joins in a Database (Almost) for Free. In VLDB, 2001.
- (2001) VLDB
- Gravano, L.¹ Ipeirotis, P.G.² Jagadish, H.V.³ Koudas, N.⁴ Muthukrishnan, S.⁵ Srivastava, D.⁶

24
- 0013207911
- Scalable document fingerprinting
- N. Heintze. Scalable document fingerprinting. In USENIX Workshop on Electronic Commerce, 1996.
- (1996) USENIX Workshop on Electronic Commerce
- Heintze, N.¹

25
- 85026972772
- Probabilistic latent semantic analysis
- T. Hofmann. Probabilistic latent semantic analysis. In Proceedings of Uncertainty in Artificial Intelligence, 1999.
- (1999) Proceedings of Uncertainty in Artificial Intelligence
- Hofmann, T.¹

26
- 0030646261
- Locality-preserving hashing in multidimensional spaces
- P. Indyk, R. Motwani, P. Raghavan, and S. Vempala. Locality-preserving hashing in multidimensional spaces. In STOC, 1997.
- (1997) STOC
- Indyk, P.¹ Motwani, R.² Raghavan, P.³ Vempala, S.⁴

27
- 0031162081
- The SR-tree: An index structure for high-dimensional nearest neighbor queries
- N. Katayama and S. Satoh. The SR-tree: an index structure for high-dimensional nearest neighbor queries. In SIGMOD, 1997.
- (1997) SIGMOD
- Katayama, N.¹ Satoh, S.²

28
- 47749095961
- CDIP: Collection-driven, yet individuality-preserving automated blog tagging
- J.W. Kim, K.S. Candan, and J. Tatemura. CDIP: Collection-Driven, yet Individuality-Preserving Automated Blog Tagging. In ICSC, 2007.
- (2007) ICSC
- Kim, J.W.¹ Candan, K.S.² Tatemura, J.³

29
- 84865646869
- submitted
- J.W. Kim, K.S. Candan, and J. Tatemura. Organization and Tagging of Blog Entries based on Content Reuse. Submitted.
- Organization and Tagging of Blog Entries Based on Content Reuse
- Kim, J.W.¹ Candan, K.S.² Tatemura, J.³

30
- 57349180452
- Generating links by mining quotations
- O. Kolak, and B.N. Schilit. Generating links by mining quotations. In HT, 2008.
- (2008) HT
- Kolak, O.¹ Schilit, B.N.²

31
- 85043988965
- Finding similar files in a large file system
- U. Manber. Finding Similar Files in a Large File System. In Proceedings of the USENIX Winter 1994 Technical Conference, 1994.
- (1994) Proceedings of the USENIX Winter 1994 Technical Conference
- Manber, U.¹

32
- 35348911985
- Detecting near duplicates for web crawling
- G.S. Manku, A. Jain, and A.D. Sarma. Detecting Near Duplicates for Web Crawling. In WWW, 2007.
- (2007) WWW
- Manku, G.S.¹ Jain, A.² Sarma, A.D.³

33
- 33745797351
- Similarity measures for tracking information flow
- D. Metzler, Y. Bernstein, W.B. Croft, A. Moffat, and J. Zobel. Similarity Measures for Tracking Information Flow. In CIKM, 2005.
- (2005) CIKM
- Metzler, D.¹ Bernstein, Y.² Croft, W.B.³ Moffat, A.⁴ Zobel, J.⁵

34
- 1142267351
- Winnowing: Local algorithms for document fingerprinting
- S. Schleimer, D.S. Wilkerson, and A. Aiken. Winnowing: Local Algorithms for Document Fingerprinting. In SIGMOD, 2003.
- (2003) SIGMOD
- Schleimer, S.¹ Wilkerson, D.S.² Aiken, A.³

35
- 85088005959
- Efficient set joins on similarity predicates
- S. Sarawagi and A. Kirpal. Efficient set joins on similarity predicates. In SIGMOD, 2004.
- (2004) SIGMOD
- Sarawagi, S.¹ Kirpal, A.²

36
- 0013273370
- SCAM: A copy detection mechanism for digital documents
- N. Shivakumar and H. Garcia-Molina. SCAM: A Copy Detection Mechanism for Digital Documents. Second Annual Conference on the Theory and Practice of Digital Libraries, 1995.
- (1995) Second Annual Conference on the Theory and Practice of Digital Libraries
- Shivakumar, N.¹ Garcia-Molina, H.²

37
- 0013454721
- Finding near-replicas of documents on the web
- N. Shivakumar and H. Garcia-Molina. Finding near-replicas of documents on the Web. In International Workshop on the World Wide Web and Databases, 1998.
- (1998) International Workshop on the World Wide Web and Databases
- Shivakumar, N.¹ Garcia-Molina, H.²

38
- 33750311279
- Near-duplicate detection by instance-level constrained clustering
- H. Yang and J. Callan. Near-duplicate detection by instance-level constrained clustering. In SIGIR, 2006.
- (2006) SIGIR
- Yang, H.¹ Callan, J.²

39
- 0032268976
- Inverted files versus signature files for text indexing
- Dec.
- J. Zobel, A. Moffat, and K. Ramamohanarao. Inverted files versus signature files for text indexing. ACM Transactions on Database Systems (TODS), 23(4), 453-490, Dec. 1998.
- (1998) ACM Transactions on Database Systems (TODS) , vol.23 , Issue.4 , pp. 453-490
- Zobel, J.¹ Moffat, A.² Ramamohanarao, K.³

40
- 66249113620
- Efficient similarity joins for near duplicate detection
- C. Xiao, W. Wang, X. Lin, and J.X. Yu. Efficient Similarity Joins for Near Duplicate Detection. In WWW, 2008.
- (2008) WWW
- Xiao, C.¹ Wang, W.² Lin, X.³ Yu, J.X.⁴

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.