SCOPUS Information Retrieval Platform

Proceedings - IEEE International Conference on Data Mining, ICDM

2008, Pages 550-559

xCrawl: A high-recall crawling method for web mining

Authors (3): Shchekotykhin, Kostyantyn (a); Jannach, Dietmar (b); Friedrich, Gerhard (a)

(a) University of Klagenfurt (Austria)

(b) TU Dortmund University (Germany)

Author keywords

[No Author keywords available]

Indexed keywords

APPLICATION SCENARIO; FACT EXTRACTION; FOCUSED CRAWLING; INFORMATION EXTRACTION; NAVIGATIONAL STRUCTURES; PRODUCT AND SERVICES; QUERY GENERATION; REDUNDANT DATA; TECHNIQUES USED; WEB DOCUMENT; WEB MINING; WEB PAGE; WEB SOURCES;

INFORMATION MANAGEMENT; MINING; MINING MACHINERY; WEBSITES;

DATA MINING;

EID: 67049169361 PISSN: 15504786 EISSN: None Source Type: Conference Proceeding
DOI: 10.1109/ICDM.2008.121 Document Type: Conference Paper

Times cited : (3)

References (19)

1
- 84874371227
- Intelligent crawling on the world wide web with arbitrary predicates
- New York, NY, USA
- C. C. Aggarwal, F. Al-Garawi, and P. S. Yu. Intelligent crawling on the world wide web with arbitrary predicates. In Proceedings of the 10th International World Wide Web Conference, pages 96-105, New York, NY, USA, 2001.
- (2001) Proceedings of the 10th International World Wide Web Conference , pp. 96-105
- Aggarwal, C.C.¹ Al-Garawi, F.² Yu, P.S.³

2
- 0344065593
- Querying text databases for efficient information extraction
- Bangalore, India
- E. Agichtein and L. Gravano. Querying text databases for efficient information extraction. In Proceedings of the 19th IEEE International Conference on Data Engineering, pages 113-124, Bangalore, India, 2003.
- (2003) Proceedings of the 19th IEEE International Conference on Data Engineering , pp. 113-124
- Agichtein, E.¹ Gravano, L.²

3
- 34548764453
- Crawling for domain-specific hidden web resources
- Washington, DC, USA
- A. Bergholz and B. Chidlovskii. Crawling for domain-specific hidden web resources. In Proceedings of the Fourth International Conference on Web Information Systems Engineering, pages 125-133, Washington, DC, USA, 2003.
- (2003) Proceedings of the Fourth International Conference on Web Information Systems Engineering , pp. 125-133
- Bergholz, A.¹ Chidlovskii, B.²

4
- 0038589165
- The anatomy of a large-scale hypertextual web search engine
- S. Brin and L. Page. The anatomy of a large-scale hypertextual web search engine. Computer Networks and ISDN Systems, 30(1-7):107-117, 1998.
- (1998) Computer Networks and ISDN Systems , vol.30 , Issue.1-7 , pp. 107-117
- Brin, S.¹ Page, L.²

5
- 77953064623
- Accelerated focused crawling through online relevance feedback
- Honolulu, Hawaii, USA
- S. Chakrabarti, K. Punera, and M. Subramanyam. Accelerated focused crawling through online relevance feedback. In Proceedings of the 11th International World Wide Web Conference, pages 148-159, Honolulu, Hawaii, USA, 2002.
- (2002) Proceedings of the 11th International World Wide Web Conference , pp. 148-159
- Chakrabarti, S.¹ Punera, K.² Subramanyam, M.³

6
- 0033294474
- Focused crawling: A new approach to topic-specific web resource discovery
- S. Chakrabarti, M. van den Berg, and B. Dom. Focused crawling: a new approach to topic-specific web resource discovery. Computer Networks, 31(11-16):1623-1640, 1999.
- (1999) Computer Networks , vol.31 , Issue.11-16 , pp. 1623-1640
- Chakrabarti, S.¹ Van Den Berg, M.² Dom, B.³

7
- 0034172483
- Learning to construct knowledge bases from the World Wide Web
- M. Craven, D. DiPasquo, D. Freitag, A. McCallum, T. Mitchell, K. Nigam, and S. Slattery. Learning to construct knowledge bases from the World Wide Web. Artificial Intelligence, 118(1):69-113, 2000.
- (2000) Artificial Intelligence , vol.118 , Issue.1 , pp. 69-113
- Craven, M.¹ DiPasquo, D.² Freitag, D.³ McCallum, A.⁴ Mitchell, T.⁵ Nigam, K.⁶ Slattery, S.⁷

8
- 35348902039
- The discoverability of the web
- DOI 10.1145/1242572.1242630, 16th International World Wide Web Conference, WWW2007
- A. Dasgupta, A. Ghosh, R. Kumar, C. Olston, S. Pandey, and A. Tomkins. The discoverability of the web. In Proceedings of the 16th international conference on World Wide Web, pages 421-430, Banff, Alberta, Canada, 2007. (Pubitemid 47582271)
- (2007) 16th International World Wide Web Conference, WWW2007 , pp. 421-430
- Dasgupta, A.¹ Ghosh, A.² Kumar, R.³ Olston, C.⁴ Pandey, S.⁵ Tomkins, A.⁶

9
- 70350672544
- Focused crawling using context graphs
- Cairo, Egypt
- M. Diligenti, F. Coetzee, S. Lawrence, C. L. Giles, and M. Gori. Focused crawling using context graphs. In Proceedings of 26th International Conference on Very Large Data Bases, pages 527-534, Cairo, Egypt, 2000.
- (2000) Proceedings of 26th International Conference on Very Large Data Bases , pp. 527-534
- Diligenti, M.¹ Coetzee, F.² Lawrence, S.³ Giles, C.L.⁴ Gori, M.⁵

10
- 36248993305
- Focused Web crawling: A generic framework for specifying the user interest and for adaptive crawling strategies
- Roma, Italy
- M. Ester, M. Groß, and H. Kriegel. Focused Web crawling: A generic framework for specifying the user interest and for adaptive crawling strategies. In Proceedings of 27th International Conference on Very Large Data Bases, pages 321-329, Roma, Italy, 2001.
- (2001) Proceedings of 27th International Conference on Very Large Data Bases , pp. 321-329
- Ester, M.¹ Groß, M.² Kriegel, H.³

11
- 35348900845
- Towards domain-independent information extraction from web tables
- DOI 10.1145/1242572.1242583, 16th International World Wide Web Conference, WWW2007
- W. Gatterbauer, P. Bohunsky, M. Herzog, B. Krüpl, and B. Pollak. Towards domain-independent information extraction from web tables. In Proceedings of the 16th International World Wide Web conference, pages 71-80, Banff, Alberta, Canada, 2007. (Pubitemid 47582240)
- (2007) 16th International World Wide Web Conference, WWW2007 , pp. 71-80
- Gatterbauer, W.¹ Bohunsky, P.² Herzog, M.³ Krüpl, B.⁴ Pollak, B.⁵

12
- 34250654176
- To search or to crawl?: Towards a query optimizer for text-centric tasks
- DOI 10.1145/1142473.1142504, SIGMOD 2006 - Proceedings of the ACM SIGMOD International Conference on Management of Data
- P. G. Ipeirotis, E. Agichtein, P. Jain, and L. Gravano. To search or to crawl? Towards a query optimizer for text-centric tasks. In Proceedings of the 2006 ACM SIGMOD International Conference on Management of Data, pages 265-276, New York, NY, USA, 2006. (Pubitemid 46946519)
- (2006) Proceedings of the ACM SIGMOD International Conference on Management of Data , pp. 265-276
- Ipeirotis, P.G.¹ Agichtein, E.² Jain, P.³ Gravano, L.⁴

13
- 4243148480
- Authoritative sources in a hyperlinked environment
- J. Kleinberg. Authoritative sources in a hyperlinked environment. Journal of the ACM (JACM), 46(5):604-632, 1999.
- (1999) Journal of the ACM (JACM) , vol.46 , Issue.5 , pp. 604-632
- Kleinberg, J.¹

14
- 0033297068
- Trawling the Web for emerging cyber-communities
- R. Kumar, P. Raghavan, S. Rajagopalan, and A. Tomkins. Trawling the Web for emerging cyber-communities. Computer Networks, 31(11-16):1481-1493, 1999.
- (1999) Computer Networks , vol.31 , Issue.11-16 , pp. 1481-1493
- Kumar, R.¹ Raghavan, P.² Rajagopalan, S.³ Tomkins, A.⁴

15
- 0000133751
- Using reinforcement learning to spider the web efficiently
- San Francisco, CA, USA
- J. Rennie and A. McCallum. Using reinforcement learning to spider the web efficiently. In Proceedings of the Sixteenth International Conference on Machine Learning, pages 335-343, San Francisco, CA, USA, 1999.
- (1999) Proceedings of the Sixteenth International Conference on Machine Learning , pp. 335-343
- Rennie, J.¹ McCallum, A.²

16
- 34250618783
- Do not crawl in the DUST: Different URLs with similar text
- DOI 10.1145/1135777.1135992, Proceedings of the 15th International Conference on World Wide Web
- U. Schonfeld, Z. Bar-Yossef, and I. Keidar. Do not crawl in the DUST: different URLs with similar text. In Proceedings of the 15th International World Wide Web Conference, pages 1015-1016, New York, NY, USA, 2006. (Pubitemid 46946760)
- (2006) Proceedings of the 15th International Conference on World Wide Web , pp. 1015-1016
- Schonfeld, U.¹ Bar-Yossef, Z.² Keidar, I.³

17
- 49949112951
- AllRight: Automatic ontology instantiation from tabular web documents
- Busan, Korea
- K. Shchekotykhin, D. Jannach, G. Friedrich, and O. Kozeruk. AllRight: Automatic ontology instantiation from tabular web documents. In Proceedings of 6th International Semantic Web Conference, pages 466-479, Busan, Korea, 2007.
- (2007) Proceedings of 6th International Semantic Web Conference , pp. 466-479
- Shchekotykhin, K.¹ Jannach, D.² Friedrich, G.³ Kozeruk, O.⁴

18
- 0003957032
- Morgan Kaufmann
- I. Witten and E. Frank. Data mining: Practical machine learning tools and techniques with Java implementations. Morgan Kaufmann, 2000.
- (2000) Data Mining: Practical Machine Learning Tools and Techniques with Java Implementations
- Witten, I.¹ Frank, E.²

19
- 0011399035
- PEBL: Positive example based learning for web page classification using SVM
- New York, NY, USA
- H. Yu, J. Han, and K. C.-C. Chang. PEBL: positive example based learning for web page classification using SVM. In KDD '02: Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining, pages 239-248, New York, NY, USA, 2002.
- (2002) KDD'02: Proceedings of the Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining , pp. 239-248
- Yu, H.¹ Han, J.² Chang, K.C.-C.³

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.