1. Altingovde, I. S., & Ulusoy, O. (2004). Exploiting interclass rules for focused crawling. IEEE Intelligent Systems, 19(6), 66-73.
2. Arasu, A., Cho, J., Garcia-Molina, H., & Raghavan, S. (2001). Searching the Web. ACM Transactions on Internet Technology, 1(1), 2-43.
3. Baeza-Yates, R., Castillo, C., Marin, M., & Rodriguez, A. (2005). Crawling a country: better strategies than breadth-first for Web page ordering. In Special interest tracks and posters of the 14th international conference on World Wide Web. Chiba, Japan.
5. Bender, M., Michel, S., Triantafillou, P., Weikum, G., & Zimmer, C. (2005). Improving collection selection with overlap awareness in P2P search engines. In Proceedings of the 28th annual international ACM SIGIR conference on research and development in information retrieval (pp. 67-74). Salvador, Brazil.
6. Boldi, P., Codenotti, B., Santini, M., & Vigna, S. (2002). Ubicrawler: a scalable fully distributed Web crawler. In Proceedings of AusWeb02, the eighth Australian World Wide Web conference.
7. Cambazoglu, B. B., & Aykanat, C. (2005). Harbinger machine learning toolkit manual. Technical Report, BU-CE-0502, Bilkent University, Department of Computer Engineering. Ankara, Turkey.
8. Cambazoglu, B. B., & Aykanat, C. (2006). Performance of query processing implementations in ranking-based text retrieval systems using inverted indices. Information Processing & Management, 42(4), 875-898.
10. Can, F., Altingovde, I. S., & Demir, E. (2004). Efficiency and effectiveness of query processing in cluster-based retrieval. Information Systems, 29(8), 697-717.
11. Chakrabarti, S., van den Berg, M., & Dom, B. (1999). Focused crawling: a new approach to topic-specific Web resource discovery. Computer Networks, 31(11-16), 1623-1640.
12. Cho, J., & Garcia-Molina, H. (2000). The evolution of the Web and implications for an incremental crawler. In Proceedings of the 26th international conference on very large data bases (pp. 200-209). Cairo, Egypt.
13. Cho, J., & Garcia-Molina, H. (2002). Parallel Crawlers. In Proceedings of the seventh World-Wide Web conference (pp. 124-135).
14. Cho, J., Garcia-Molina, H., & Page, L. (1998). Efficient crawling through URL ordering. In Proceedings of the 7th international World Wide Web conference (pp. 161-172). Brisbane, Australia.
16. Diligenti, M., Coetzee, F., Lawrence, S., Giles, C. L., & Gori, M. (2000). Focused crawling using context graphs. In Proceedings of the 26th international conference on very large data bases (pp. 527-534). Cairo, Egypt.
18. Han, E., Karypis, G., & Kumar, V. (2002). Text categorization using weight adjusted k-nearest neighbor classification. In Proceedings of the 5th Pacific-Asia conference on knowledge discovery and data mining (pp. 53-65).
19. Heydon, A., & Najork, M. (1999). Mercator: a scalable, extensible Web crawler. World Wide Web, 2(4), 219-229.
20. Kan, M.-Y. (2004). Web page categorization without the Web page. In Proceedings of the 13th international World Wide Web conference (pp. 262-263).
21. Khoussainov, R., Zuo, X., & Kushmerick, N. (2004). Grid-enabled Weka: a toolkit for machine learning on the grid. ERCIM News, 59.
23. Lee, D. L., Chuang, H., & Seamons, K. (1997). Document ranking and the vector-space model. IEEE Software, 14(2), 67-75.
24. Lewis, D. D. (1992). Feature selection and feature extraction for text categorization. In Proceedings of speech and natural language workshop (pp. 212-217).
25. Lewis, D. D., & Ringuette, M. (1994). A comparison of two learning algorithms for text categorization. In Proceedings of the third annual symposium on document analysis and information retrieval (pp. 81-93).
26. Long, X., & Suel, T. (2003). Optimized query execution in large search engines. In Proceedings of the 29th international conference on very large databases. Berlin, Germany.
27. McCallum, A., & Nigam, K. (1998). A comparison of event models for naive bayes text classification. In AAAI-98 Workshop on learning for text categorization.
28. Melnik, S., Raghavan, S., Yang, B., & Garcia-Molina, H. (2001). Building a distributed full-text index for the Web. ACM Transactions on Information Systems, 19(3), 217-241.
30. Najork, M., & Wiener, J. L. (2001). Breadth-first crawling yields high-quality pages. In Proceedings of the 10th international conference on World Wide Web (pp. 114-118). Hong Kong, Hong Kong.
31. Ng, H. T., Goh, W. B., & Low, K. L. (1997). Feature selection, perceptron learning, and a usability case study for text categorization. In Proceedings of the 20th international conference on research and development in information retrieval (pp. 67-73).
32. Page, L., & Brin, S. (1998). The anatomy of a large-scale hypertextual Web search engine. In Proceedings of the seventh World-Wide Web conference (pp. 107-117).
33. Ribeiro-Neto, B. A., & Barbosa, R. A. (1998). Query performance for tightly coupled distributed digital libraries. In Proceedings of the third ACM conference on digital libraries (pp. 182-190).
34. Scholze, F., Haya, G., Vigen, J., & Prazak, P. (2004). Project GRACE: a grid based search tool for the global digital library. In 7th international conference on electronic theses and dissertations. Lexington, KY.
35. Sebastiani, F. (2002). Machine learning in automated text categorization. ACM Computing Surveys, 34(1), 1-47.
36. Shkapenyuk, V., & Suel, T. (2002). Design and implementation of a high-performance distributed Web crawler. In International conference on data engineering (pp. 357-368).
37. Sun, A., Lim, E. P., & Ng, W. K. (2002). Web classification using support vector machine. In Proceedings of the 4th international workshop on Web information and data management (pp. 96-99).
38. Teng, S., Lu, Q., Eichstaedt, M., Ford, D., & Lehman, T. (1999). Collaborative Web crawling: information gathering/processing over Internet. In 32nd Hawaii international conference on system sciences.
39. Tomasic, A., Garcia-Molina, H., & Shoens, K. (1994). Incremental updates of inverted lists for text document retrieval. In Proceedings of the 1994 ACM SIGMOD international conference on management of data (pp. 289-300). Minneapolis, Minnesota.
41. Wilkinson, R., Zobel, J., & Sacks-Davis, R. (1995). Similarity measures for short queries. In Fourth text retrieval conference (TREC-4) (pp. 277-285). Gaithersburg, Maryland.
43. Wong, W. Y. P., & Lee, D. K. (1993). Implementations of partial document ranking using inverted files. Information Processing and Management, 29(5), 647-669.
44. Yang, Y. (1999). An evaluation of statistical approaches to text categorization. Journal of Information Retrieval, 1(1-2), 67-88.
45. Zeinalipour-Yazti, D., & Dikaiakos, M. D. (2002). Design and implementation of a distributed crawler and filtering processor. In Proceedings of the next generation information technologies and systems (pp. 58-74).
46. Zobel, J., Moffat, A., & Sacks-Davis, R. (1992). An efficient indexing technique for full-text database systems. In Proceedings of the 18th international conference on very large databases (pp. 352-362). Vancouver, Canada.