SCOPUS Information Search Platform

Proceedings of the 2nd ACM International Conference on Web Search and Data Mining, WSDM'09

2009, Pages 262-271

Finding text reuse on the web

Bendersky, Michael (a); Croft, W. Bruce (a)

(a) Biologically Inspired Neural and Dynamical Systems Laboratory (United States)

Author keywords

Information flow; Text reuse; Web search

Indexed keywords

DETECTION TECHNIQUE; INFORMATION FLOW; INFORMATION FLOWS; LINK ANALYSIS; NOVEL TECHNIQUES; TEXT REUSE; TREC COLLECTION; WEB SEARCH; WEB SEARCHES

INFORMATION RETRIEVAL; INFORMATION USE

WORLD WIDE WEB

EID: 70349155038
PISSN: None
EISSN: None
Source Type: Conference Proceeding
DOI: 10.1145/1498759.1498835
Document Type: Conference Paper

Times cited: 72

References (30)

1
- 33748871742
- Tracking Information Epidemics in Blogspace
- E. Adar and L. Adamic. Tracking Information Epidemics in Blogspace. In Proceedings of WI, pages 207-214, 2005.
- (2005) Proceedings of WI , pp. 207-214
- Adar, E.¹ Adamic, L.²

2
- 42949138243
- Finding high-quality content in social media
- E. Agichtein, C. Castillo, D. Donato, A. Gionis, and G. Mishne. Finding high-quality content in social media. In Proceedings of WSDM, pages 183-194, 2008.
- (2008) Proceedings of WSDM , pp. 183-194
- Agichtein, E.¹ Castillo, C.² Donato, D.³ Gionis, A.⁴ Mishne, G.⁵

3
- 0034790170
- Temporal summaries of news topics
- J. Allan, R. Gupta, and V. Khandelwal. Temporal summaries of news topics. In Proceedings of SIGIR, pages 10-18, 2001.
- (2001) Proceedings of SIGIR , pp. 10-18
- Allan, J.¹ Gupta, R.² Khandelwal, V.³

4
- 57349156145
- Genealogical trees on the web: A search engine user perspective
- R. Baeza-Yates, Á. Pereira, and N. Ziviani. Genealogical trees on the web: a search engine user perspective. In Proceedings of WWW, 2008.
- (2008) Proceedings of WWW
- Baeza-Yates, R.¹ Pereira, Á.² Ziviani, N.³

5
- 36448985203
- A comparison of sentence retrieval techniques
- N. Balasubramanian, J. Allan, and W. B. Croft. A comparison of sentence retrieval techniques. In Proceedings of SIGIR, pages 813-814, 2007.
- (2007) Proceedings of SIGIR , pp. 813-814
- Balasubramanian, N.¹ Allan, J.² Croft, W.B.³

6
- 41849143005
- Utilizing passage-based language models for document retrieval
- M. Bendersky and O. Kurland. Utilizing passage-based language models for document retrieval. In Proceedings of ECIR, pages 162-174, 2008.
- (2008) Proceedings of ECIR , pp. 162-174
- Bendersky, M.¹ Kurland, O.²

7
- 33646126481
- A Scalable System for Identifying Co-derivative Documents
- Y. Bernstein and J. Zobel. A Scalable System for Identifying Co-derivative Documents. In Proceedings of SPIRE, 2004.
- (2004) Proceedings of SPIRE
- Bernstein, Y.¹ Zobel, J.²

8
- 0038589165
- The anatomy of a large-scale hypertextual Web search engine
- S. Brin and L. Page. The anatomy of a large-scale hypertextual Web search engine. Computer Networks and ISDN Systems, 30(1-7):107-117, 1998.
- (1998) Computer Networks and ISDN Systems , vol.30 , Issue.1-7 , pp. 107-117
- Brin, S.¹ Page, L.²

9
- 79956075292
- Identifying and Filtering Near-Duplicate Documents
- A. Broder. Identifying and Filtering Near-Duplicate Documents. In Proceedings of CPM, pages 1-10, 2000.
- (2000) Proceedings of CPM , pp. 1-10
- Broder, A.¹

10
- 0036040277
- Similarity estimation techniques from rounding algorithms
- M. S. Charikar. Similarity estimation techniques from rounding algorithms. In Proceedings of STOC, pages 380-388, 2002.
- (2002) Proceedings of STOC , pp. 380-388
- Charikar, M.S.¹

11
- 8644252773
- Using temporal profiles of queries for precision prediction
- F. Diaz and R. Jones. Using temporal profiles of queries for precision prediction. In Proceedings of SIGIR, pages 18-24, 2004.
- (2004) Proceedings of SIGIR , pp. 18-24
- Diaz, F.¹ Jones, R.²

12
- 84885639910
- Detecting phrase-level duplication on the world wide web
- D. Fetterly, M. Manasse, and M. Najork. Detecting phrase-level duplication on the world wide web. In Proceedings of SIGIR, pages 170-177, 2005.
- (2005) Proceedings of SIGIR , pp. 170-177
- Fetterly, D.¹ Manasse, M.² Najork, M.³

13
- 84880498138
- DOM-based content extraction of HTML documents
- S. Gupta, G. Kaiser, D. Neistadt, and P. Grimm. DOM-based content extraction of HTML documents. In Proceedings of WWW, pages 207-214, 2003.
- (2003) Proceedings of WWW , pp. 207-214
- Gupta, S.¹ Kaiser, G.² Neistadt, D.³ Grimm, P.⁴

14
- 77953061730
- Topic-sensitive PageRank
- T. Haveliwala. Topic-sensitive PageRank. In Proceedings of WWW, 2002.
- (2002) Proceedings of WWW
- Haveliwala, T.¹

15
- 33750296887
- Finding near-duplicate web pages: A large-scale evaluation of algorithms
- M. Henzinger. Finding near-duplicate web pages: a large-scale evaluation of algorithms. In Proceedings of SIGIR, pages 284-291, 2006.
- (2006) Proceedings of SIGIR , pp. 284-291
- Henzinger, M.¹

16
- 4243148480
- Authoritative sources in a hyperlinked environment
- J. Kleinberg. Authoritative sources in a hyperlinked environment. Journal of the ACM (JACM), 46(5):604-632, 1999.
- (1999) Journal of the ACM (JACM) , vol.46 , Issue.5 , pp. 604-632
- Kleinberg, J.¹

17
- 0034785304
- Relevance based language models
- V. Lavrenko and W. Croft. Relevance based language models. In Proceedings of SIGIR, pages 120-127, 2001.
- (2001) Proceedings of SIGIR , pp. 120-127
- Lavrenko, V.¹ Croft, W.²

18
- 35348861901
- Web projections: Learning from contextual subgraphs of the web
- J. Leskovec, S. Dumais, and E. Horvitz. Web projections: learning from contextual subgraphs of the web. In Proceedings of WWW, pages 471-480, 2007.
- (2007) Proceedings of WWW , pp. 471-480
- Leskovec, J.¹ Dumais, S.² Horvitz, E.³

19
- 8644258207
- Time-based language models
- X. Li and B. W. Croft. Time-based language models. In Proceedings of CIKM, pages 469-475, 2003.
- (2003) Proceedings of CIKM , pp. 469-475
- Li, X.¹ Croft, B.W.²

20
- 84996678707
- Information extraction: Distilling structured data from unstructured text
- A. McCallum. Information extraction: distilling structured data from unstructured text. Queue, 3(9):48-57, 2005.
- (2005) Queue , vol.3 , Issue.9 , pp. 48-57
- McCallum, A.¹

21
- 29244457315
- Discovering evolutionary theme patterns from text: An exploration of temporal text mining
- Q. Mei and C. Zhai. Discovering evolutionary theme patterns from text: an exploration of temporal text mining. In Proceedings of KDD, pages 198-207, 2005.
- (2005) Proceedings of KDD , pp. 198-207
- Mei, Q.¹ Zhai, C.²

22
- 33745797351
- Similarity measures for tracking information flow
- D. Metzler, Y. Bernstein, W. B. Croft, A. Moffat, and J. Zobel. Similarity measures for tracking information flow. In Proceedings of CIKM, 2005.
- (2005) Proceedings of CIKM
- Metzler, D.¹ Bernstein, Y.² Croft, W.B.³ Moffat, A.⁴ Zobel, J.⁵

23
- 84885662673
- A Markov random field model for term dependencies
- D. Metzler and W. B. Croft. A Markov random field model for term dependencies. In Proceedings of SIGIR, pages 472-479, 2005.
- (2005) Proceedings of SIGIR , pp. 472-479
- Metzler, D.¹ Croft, W.B.²

24
- 37149036775
- A translation model for sentence retrieval
- V. Murdock and W. B. Croft. A translation model for sentence retrieval. In Proceedings of HLT/EMNLP, pages 684-691, 2005.
- (2005) Proceedings of HLT/EMNLP , pp. 684-691
- Murdock, V.¹ Croft, W.B.²

25
- 0032268440
- A language modeling approach to information retrieval
- J. M. Ponte and B. W. Croft. A language modeling approach to information retrieval. In Proceedings of SIGIR, pages 275-281, 1998.
- (1998) Proceedings of SIGIR , pp. 275-281
- Ponte, J.M.¹ Croft, B.W.²

26
- 84881219500
- A maximum entropy approach to identifying sentence boundaries
- J. C. Reynar and A. Ratnaparkhi. A maximum entropy approach to identifying sentence boundaries. In Proceedings of ANLP, pages 16-19, 1997.
- (1997) Proceedings of ANLP , pp. 16-19
- Reynar, J.C.¹ Ratnaparkhi, A.²

27
- 33745199116
- Milestones in Time: The Value of Landmarks in Retrieving Information from Personal Stores
- M. Ringel, E. Cutrell, S. Dumais, and E. Horvitz. Milestones in Time: The Value of Landmarks in Retrieving Information from Personal Stores. In Proceedings of INTERACT, pages 184-191, 2003.
- (2003) Proceedings of INTERACT , pp. 184-191
- Ringel, M.¹ Cutrell, E.² Dumais, S.³ Horvitz, E.⁴

28
- 57349177560
- Local text reuse detection
- J. Seo and W. B. Croft. Local text reuse detection. In Proceedings of SIGIR, 2008.
- (2008) Proceedings of SIGIR
- Seo, J.¹ Croft, W.B.²

29
- 70349132329
- SCAM: Copy detection mechanisms for digital documents
- N. Shivakumar and H. Garcia-Molina. SCAM: Copy detection mechanisms for digital documents. In Proceedings of Digital Libraries, 1995.
- (1995) Proceedings of Digital Libraries
- Shivakumar, N.¹ Garcia-Molina, H.²

30
- 8644264918
- Timemines: Constructing timelines with statistical models of word usage
- R. Swan and D. Jensen. Timemines: Constructing timelines with statistical models of word usage. In Proceedings of KDD, pages 73-80, 2000.
- (2000) Proceedings of KDD , pp. 73-80
- Swan, R.¹ Jensen, D.²

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.