Volume 2006, 2006, Pages 257-266

Hierarchical topic segmentation of websites

Author keywords

Classification; Facility Location; Gain Ratio; KL distance; Tree Partitioning; Website Hierarchy; Website Segmentation

Indexed keywords

ALGORITHMS; DATA REDUCTION; HIERARCHICAL SYSTEMS; IMAGE SEGMENTATION; TREES (MATHEMATICS)

EID: 33749551229     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1145/1150402.1150433     Document Type: Conference Paper
Times cited: 17

References (34)
  • 1
    • 84880468771
    • Mining newsgroups using networks arising from social behavior
    • R. Agrawal, S. Rajagopalan, R. Srikant, and Y. Xu. Mining newsgroups using networks arising from social behavior. In 12th WWW, pages 529-535, 2003.
    • (2003) 12th WWW , pp. 529-535
    • Agrawal, R.1    Rajagopalan, S.2    Srikant, R.3    Xu, Y.4
  • 3
    • 24944472983
    • Clustering on the unit hypersphere using von Mises-Fisher distributions
    • A. Banerjee, I. S. Dhillon, J. Ghosh, and S. Sra. Clustering on the unit hypersphere using von Mises-Fisher distributions. JMLR, 6:1345-1382, 2005.
    • (2005) JMLR , vol.6 , pp. 1345-1382
    • Banerjee, A.1    Dhillon, I.S.2    Ghosh, J.3    Sra, S.4
  • 4
    • 0034288398
    • A comparison of techniques to find mirrored hosts on the WWW
    • K. Bharat, A. Broder, J. Dean, and M. R. Henzinger. A comparison of techniques to find mirrored hosts on the WWW. JASIS, 51(12):1114-1122, 2000.
    • (2000) JASIS , vol.51 , Issue.12 , pp. 1114-1122
    • Bharat, K.1    Broder, A.2    Dean, J.3    Henzinger, M.R.4
  • 5
    • 1542287501
    • Modeling annotated data
    • D. Blei and M. Jordan. Modeling annotated data. In 26th SIGIR, pages 127-134, 2003.
    • (2003) 26th SIGIR , pp. 127-134
    • Blei, D.1    Jordan, M.2
  • 6
    • 0032090684
    • Enhanced hypertext classification using hyperlinks
    • S. Chakrabarti, B. Dom, and P. Indyk. Enhanced hypertext classification using hyperlinks. In SIGMOD, pages 307-318, 1998.
    • (1998) SIGMOD , pp. 307-318
    • Chakrabarti, S.1    Dom, B.2    Indyk, P.3
  • 7
    • 84885969802
    • Omega: A general formulation of the rand index of cluster recovery suitable for non-disjoint solutions
    • L. M. Collins and C. W. Dent. Omega: A general formulation of the rand index of cluster recovery suitable for non-disjoint solutions. Multivariate Behavioral Research, 23(2):231-242, 1988.
    • (1988) Multivariate Behavioral Research , vol.23 , Issue.2 , pp. 231-242
    • Collins, L.M.1    Dent, C.W.2
  • 8
    • 0034785403
    • Effective site finding using link anchor information
    • N. Craswell, D. Hawking, and S. Robertson. Effective site finding using link anchor information. In 24th SIGIR, pages 250-257, 2001.
    • (2001) 24th SIGIR , pp. 250-257
    • Craswell, N.1    Hawking, D.2    Robertson, S.3
  • 9
    • 3042815190
    • Bayesian network model for semi-structured document classification
    • L. Denoyer and P. Gallinari. Bayesian network model for semi-structured document classification. Information Processing and Management, 40(5):807-827, 2004.
    • (2004) Information Processing and Management , vol.40 , Issue.5 , pp. 807-827
    • Denoyer, L.1    Gallinari, P.2
  • 10
    • 84951796310
    • Classification of HTML documents by hidden tree-Markov models
    • M. Diligenti, M. Gori, M. Maggini, and F. Scarselli. Classification of HTML documents by hidden tree-Markov models. In 6th ICDAR, pages 849-853, 2001.
    • (2001) 6th ICDAR , pp. 849-853
    • Diligenti, M.1    Gori, M.2    Maggini, M.3    Scarselli, F.4
  • 11
    • 0242709390
    • Web site mining: A new way to spot competitors, customers and suppliers in the world wide web
    • M. Ester, H.-P. Kriegel, and M. Schubert. Web site mining: A new way to spot competitors, customers and suppliers in the world wide web. In 8th KDD, pages 249-258, 2002.
    • (2002) 8th KDD , pp. 249-258
    • Ester, M.1    Kriegel, H.-P.2    Schubert, M.3
  • 14
    • 0032119668
    • The hierarchical hidden Markov model: Analysis and applications
    • S. Fine, Y. Singer, and N. Tishby. The hierarchical hidden Markov model: Analysis and applications. Machine Learning, 32(1):41-62, 1998.
    • (1998) Machine Learning , vol.32 , Issue.1 , pp. 41-62
    • Fine, S.1    Singer, Y.2    Tishby, N.3
  • 15
    • 67649850603
    • Surfing the web by site
    • D. Gibson. Surfing the web by site. In 13th WWW, pages 496-497, 2004.
    • (2004) 13th WWW , pp. 496-497
    • Gibson, D.1
  • 16
    • 77953053369
    • The volume and evolution of web page templates
    • D. Gibson, K. Punera, and A. Tomkins. The volume and evolution of web page templates. In 14th WWW, pages 830-839, 2005.
    • (2005) 14th WWW , pp. 830-839
    • Gibson, D.1    Punera, K.2    Tomkins, A.3
  • 17
    • 0020155046
    • The distance-domination numbers of trees
    • W. L. Hsu. The distance-domination numbers of trees. Operations Research Letters, 1:96-100, 1982.
    • (1982) Operations Research Letters , vol.1 , pp. 96-100
    • Hsu, W.L.1
  • 18
    • 84880467894
    • The eigentrust algorithm for reputation management in P2P networks
    • S. D. Kamvar, M. T. Schlosser, and H. Garcia-Molina. The eigentrust algorithm for reputation management in P2P networks. In 12th WWW, pages 640-651, 2003.
    • (2003) 12th WWW , pp. 640-651
    • Kamvar, S.D.1    Schlosser, M.T.2    Garcia-Molina, H.3
  • 19
    • 0018678438
    • An algorithmic approach to network location problems, part II: p-medians
    • O. Kariv and S. L. Hakimi. An algorithmic approach to network location problems, part II: p-medians. SIAM J. on Applied Mathematics, 37:539-560, 1979.
    • (1979) SIAM J. on Applied Mathematics , vol.37 , pp. 539-560
    • Kariv, O.1    Hakimi, S.L.2
  • 20
    • 0002346866
    • Hierarchically classifying documents using very few words
    • D. Koller and M. Sahami. Hierarchically classifying documents using very few words. In 14th ICML, pages 170-178, 1997.
    • (1997) 14th ICML , pp. 170-178
    • Koller, D.1    Sahami, M.2
  • 22
    • 0003241643
    • Practical issues for automated categorization of web sites
    • J. Pierre. Practical issues for automated categorization of web sites. In ECDL 2000 Workshop on Semantic Web, 2000.
    • (2000) ECDL 2000 Workshop on Semantic Web
    • Pierre, J.1
  • 24
    • 0037653041
    • Induction of decision trees
    • J. R. Quinlan. Induction of decision trees. In J. W. Shavlik and T. G. Dietterich, editors, Readings in Machine Learning. Morgan Kaufmann, 1990.
    • (1990) Readings in Machine Learning
    • Quinlan, J.R.1
  • 25
    • 33744584654
    • Induction of decision trees
    • J. R. Quinlan. Induction of decision trees. Originally in Machine Learning, 1:81-106, 1986.
    • (1986) Machine Learning , vol.1 , pp. 81-106
    • Quinlan, J.R.1
  • 26
    • 0034511271
    • Web site analysis: Structure and evolution
    • F. Ricca and P. Tonella. Web site analysis: Structure and evolution. In 16th ICSM, pages 76-86, 2000.
    • (2000) 16th ICSM , pp. 76-86
    • Ricca, F.1    Tonella, P.2
  • 27
    • 33749562077
    • Do not crawl in the dust: Different urls with similar text
    • U. Schonfeld, Z. Bar-Yossef, and I. Keidar. Do not crawl in the dust: Different urls with similar text. In 15th WWW, 2006.
    • (2006) 15th WWW
    • Schonfeld, U.1    Bar-Yossef, Z.2    Keidar, I.3
  • 28
    • 33749570311
    • Undiscretized dynamic programming: Faster algorithms for facility location and related problems on trees
    • R. Shah and M. Farach-Colton. Undiscretized dynamic programming: Faster algorithms for facility location and related problems on trees. In 13th SODA, pages 108-115, 2002.
    • (2002) 13th SODA , pp. 108-115
    • Shah, R.1    Farach-Colton, M.2
  • 29
    • 18744385246
    • Web unit mining: Finding and classifying subgraphs of web pages
    • A. Sun and E.-P. Lim. Web unit mining: finding and classifying subgraphs of web pages. In 12th CIKM, pages 108-115, 2003.
    • (2003) 12th CIKM , pp. 108-115
    • Sun, A.1    Lim, E.-P.2
  • 30
    • 0030216223
    • An O(pn^2) algorithm for the p-median and related problems on tree graphs
    • A. Tamir. An O(pn^2) algorithm for the p-median and related problems on tree graphs. Operations Research Letters, 19:59-64, 1996.
    • (1996) Operations Research Letters , vol.19 , pp. 59-64
    • Tamir, A.1
  • 31
    • 0002502515
    • Constructing, organizing, and visualizing collections of topically related web resources
    • L. Terveen, W. Hill, and B. Amento. Constructing, organizing, and visualizing collections of topically related web resources. ACM Transactions on Computer-Human Interaction, 6(1):67-94, 1999.
    • (1999) ACM Transactions on Computer-Human Interaction , vol.6 , Issue.1 , pp. 67-94
    • Terveen, L.1    Hill, W.2    Amento, B.3
  • 32
    • 0345921608
    • Finding similar academic web sites with links, bibliometric couplings and colinks
    • M. Thelwall and D. Wilkinson. Finding similar academic web sites with links, bibliometric couplings and colinks. Information Processing and Management, 40(3):515-526, 2004.
    • (2004) Information Processing and Management , vol.40 , Issue.3 , pp. 515-526
    • Thelwall, M.1    Wilkinson, D.2
  • 33
    • 33646432158
    • Exploiting structure, annotation, and ontological knowledge for automatic classification of XML data
    • M. Theobald, R. Schenkel, and G. Weikum. Exploiting structure, annotation, and ontological knowledge for automatic classification of XML data. In 6th WebDB, pages 1-6, 2003.
    • (2003) 6th WebDB , pp. 1-6
    • Theobald, M.1    Schenkel, R.2    Weikum, G.3


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.