Volume 45, Issue 5, 2009, Pages 499-512

Learning to recognize webpage genres

Author keywords

Character n-grams; Web genre classification; Webpage representation

Indexed keywords

AUTOMATED EXTRACTION; AUTOMATIC DETECTION; BINARY REPRESENTATIONS; CHARACTER N-GRAMS; CHECK EXPERIMENT; E-SHOPS; FEATURE SETS; FEATURE TYPES; GENRE DETECTION; HOME PAGE; HTML TAGS; INFORMATION NEED; N-GRAMS; NOISY ENVIRONMENT; VARIABLE LENGTH; WEB GENRE CLASSIFICATION; WEB-PAGE; WEBPAGE REPRESENTATION

EID: 67650140659     PISSN: 03064573     EISSN: None     Source Type: Journal    
DOI: 10.1016/j.ipm.2009.05.003     Document Type: Article
Times cited: 37

References (35)
  • 8
    • 2942731012
    • An extensive empirical study of feature selection metrics for text classification
    • Forman G. An extensive empirical study of feature selection metrics for text classification. Journal of Machine Learning Research 3 (2003) 1289-1305
    • (2003) Journal of Machine Learning Research , vol.3 , pp. 1289-1305
    • Forman, G.1
  • 9
    • 33750271571
    • N-gram feature selection for authorship identification
    • Proceedings of the 12th international conference on artificial intelligence: methodology, systems, applications, Springer
    • Houvardas, J., & Stamatatos, E. (2006). N-gram feature selection for authorship identification. In Proceedings of the 12th international conference on artificial intelligence: methodology, systems, applications, LNCS, 4183 (pp. 77-86). Springer.
    • (2006) LNCS , vol.4183 , pp. 77-86
    • Houvardas, J.1    Stamatatos, E.2
  • 10
    • 84957069814
    • Text categorization with support vector machines: Learning with many relevant features
    • Joachims, T. (1998). Text categorization with support vector machines: Learning with many relevant features. In Proceedings of the 10th European conference on machine learning (pp. 137-142).
    • (1998) Proceedings of the 10th European conference on machine learning , pp. 137-142
    • Joachims, T.1
  • 12
    • 33745868242
    • N-gram-based author profiles for authorship attribution
    • association for computational linguistics
    • Keselj, V., Peng, F., Cercone, N., & Thomas, C. (2003). N-gram-based author profiles for authorship attribution. In Proceedings of the conference of pacific association for computational linguistics.
    • (2003) Proceedings of the conference of pacific association for computational linguistics
    • Keselj, V.1    Peng, F.2    Cercone, N.3    Thomas, C.4
  • 18
    • 16244366129
    • Multiple sets of features for automatic genre classification of web documents
    • Lim C.S., Lee K.J., and Kim G.C. Multiple sets of features for automatic genre classification of web documents. Information Processing and Management 41 5 (2005) 1263-1276
    • (2005) Information Processing and Management , vol.41 , Issue.5 , pp. 1263-1276
    • Lim, C.S.1    Lee, K.J.2    Kim, G.C.3
  • 21
    • 77049104346
    • Genre classification of web pages: User study and feasibility analysis
    • Biundo S., Fruhwirth T., and Palm G. (Eds), Springer
    • Meyer zu Eissen S., and Stein B. Genre classification of web pages: User study and feasibility analysis. In: Biundo S., Fruhwirth T., and Palm G. (Eds). KI 2004: Advances in artificial intelligence (2004), Springer 256-269
    • (2004) KI 2004: Advances in artificial intelligence , pp. 256-269
    • Meyer zu Eissen, S.1    Stein, B.2
  • 23
    • 0141990695
    • Theoretical and empirical analysis of ReliefF and RReliefF
    • Robnik-Sikonja M., and Kononenko I. Theoretical and empirical analysis of ReliefF and RReliefF. Machine Learning 53 1-2 (2003) 23-69
    • (2003) Machine Learning , vol.53 , Issue.1-2 , pp. 23-69
    • Robnik-Sikonja, M.1    Kononenko, I.2
  • 24
    • 39749091418
    • Ph.D. Thesis, University of North Carolina at Chapel Hill
    • Rosso, M. (2005). Using genre to improve web search. Ph.D. Thesis, University of North Carolina at Chapel Hill.
    • (2005) Using genre to improve web search
    • Rosso, M.1
  • 26
    • 39649112405
    • Zero, single, or multi? Genre of web pages through the users' perspective
    • Santini M. Zero, single, or multi? Genre of web pages through the users' perspective. Information Processing and Management 44 2 (2008) 702-737
    • (2008) Information Processing and Management , vol.44 , Issue.2 , pp. 702-737
    • Santini, M.1
  • 28
    • 0002442796
    • Machine learning in automated text categorization
    • Sebastiani F. Machine learning in automated text categorization. ACM Computing Surveys 34 1 (2002)
    • (2002) ACM Computing Surveys , vol.34 , Issue.1
    • Sebastiani, F.1
  • 30
    • 4944243732
    • A local maxima method and a fair dispersion normalization for extracting multiword units
    • Silva, J., & Lopes, G. (1999). A local maxima method and a fair dispersion normalization for extracting multiword units. In Proceedings of the 6th meeting on the mathematics of language (pp. 369-381).
    • (1999) Proceedings of the 6th meeting on the mathematics of language , pp. 369-381
    • Silva, J.1    Lopes, G.2
  • 31
    • 84957628231
    • Using localmaxs algorithm for the extraction of contiguous and non-contiguous multiword lexical units
    • Springer
    • Silva, J., Dias, G., Guilloré, S., & Lopes, G. (1999). Using localmaxs algorithm for the extraction of contiguous and non-contiguous multiword lexical units. Lecture notes on artificial intelligence, 1695 (pp. 113-132). Springer.
    • (1999) Lecture notes on artificial intelligence, 1695 , pp. 113-132
    • Silva, J.1    Dias, G.2    Guilloré, S.3    Lopes, G.4


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.