Volume 50, Issue 3, 2011, Pages 345-356

An efficient approach of noise removal from web page for effectual web content mining

Author keywords

Duplicate blocks; Keyword redundancy; Linkword percentage; Titleword relevancy; Web cleaning; Web content mining; Web mining

EID: 79953805803     PISSN: 1450216X     EISSN: 1450202X     Source Type: Journal    
DOI: None     Document Type: Article
Times cited : (9)

References (28)
  • 3
    • 33644531525
    • Mining Web Content Outliers using Structure Oriented Weighting Techniques and N-Grams
    • New Mexico, March
    • Malik Agyemang, Ken Barker and Rada S. Alhajj, "Mining Web Content Outliers using Structure Oriented Weighting Techniques and N-Grams", In Proceedings of the ACM Annual Symposium on Applied Computing, pp. 482-487, New Mexico, March 2005.
    • (2005) Proceedings of the ACM Annual Symposium on Applied Computing , pp. 482-487
    • Agyemang, M.; Barker, K.; Alhajj, R.S.
  • 8
    • 0001781295
    • Web mining research: A survey
    • Raymond Kosala and Hendrik Blockeel, "Web Mining Research: A Survey", ACM SIGKDD Explorations, Vol. 2, No.1, pp. 1-15, 2000.
    • (2000) ACM SIGKDD Explorations , vol.2 , Issue.1 , pp. 1-15
    • Kosala, R.; Blockeel, H.
  • 9
    • 78449236991
    • Performance modeling of a distributed web crawler using stochastic activity networks
    • Springer-Verlag, ISSN: 1865-0929
    • Mitra Nasri, Saeed Shariati and Mohammad Abdollahi Azgomi, "Performance Modeling of a Distributed Web Crawler using Stochastic Activity Networks", Communications in Computer and Information Science (CCIS), Springer-Verlag, Vol. 9, pp.535-542, ISSN: 1865-0929, 2008.
    • (2008) Communications in Computer and Information Science (CCIS) , vol.9 , pp. 535-542
    • Nasri, M.; Shariati, S.; Abdollahi Azgomi, M.
  • 10
  • 13
    • 38049108412
    • The characteristic analysis of web user clusters based on frequent browsing patterns
    • Zhiwang Zhang and Yong Shi, "The Characteristic Analysis of Web User Clusters Based on Frequent Browsing Patterns", Lecture Notes in Computer Science, Vol. 4488, pp. 490-493, 2007.
    • (2007) Lecture Notes in Computer Science , vol.4488 , pp. 490-493
    • Zhang, Z.; Shi, Y.
  • 15
    • 8844245490
    • A method of eliminating noises in web pages by style tree model and its applications
    • Zhao Cheng-li and Yi Dong-yun, "A Method of Eliminating Noises in Web Pages by Style Tree Model and Its Applications", Wuhan University Journal of Natural Sciences, Vol.9, No.5, pp. 611-616, 2004.
    • (2004) Wuhan University Journal of Natural Sciences , vol.9 , Issue.5 , pp. 611-616
    • Zhao, C.-L.; Yi, D.-Y.
  • 19
    • 26444532019
    • Learning important models for web page blocks based on layout and content analysis
    • Ruihua Song, Haifeng Liu, Ji-Rong Wen and Wei-Ying Ma, "Learning Important Models for Web Page Blocks based on Layout and Content Analysis", ACM SIGKDD Explorations Newsletter, Vol. 6, No. 2, pp. 14-23, 2004.
    • (2004) ACM SIGKDD Explorations Newsletter , vol.6 , Issue.2 , pp. 14-23
    • Song, R.; Liu, H.; Wen, J.-R.; Ma, W.-Y.
  • 20
    • 79953811061
    • Noise elimination from the web documents by using url paths and information redundancy
    • June 26-29, Las Vegas, Nevada, US
    • Byeong Ho Kang and Yang Sok Kim, "Noise Elimination from The Web Documents By Using URL Paths and Information Redundancy", In Proceedings of the International Conference on Information & Knowledge Engineering, pp. 135-141, June 26-29, Las Vegas, Nevada, US, 2006.
    • (2006) Proceedings of the International Conference on Information & Knowledge Engineering , pp. 135-141
    • Kang, B.H.; Kim, Y.S.
  • 24
    • 33846539282
    • Mining key information of web pages: A method and its application
    • DOI 10.1016/j.eswa.2006.05.017, PII S0957417406001588
    • Chao Wang, Jie Lu and Guangquan Zhang, "Mining Key Information of Web Pages: A Method and Its Application", Expert Systems with Applications, Vol.33, No.2, pp.425-433, August 2007. (Pubitemid 46157268)
    • (2007) Expert Systems with Applications , vol.33 , Issue.2 , pp. 425-433
    • Wang, C.; Lu, J.; Zhang, G.
  • 27
    • 35348911985
    • Detecting near-duplicates for web crawling
    • DOI 10.1145/1242572.1242592, 16th International World Wide Web Conference, WWW2007
    • Gurmeet Singh Manku, Arvind Jain and Anish Das Sarma, "Detecting Near-Duplicates for Web Crawling", In Proceedings of the 16th International Conference on World Wide Web, pp. 141-150, May 8-12, Banff, Alberta, Canada, 2007. (Pubitemid 47582246)
    • (2007) 16th International World Wide Web Conference, WWW2007 , pp. 141-150
    • Manku, G.S.; Jain, A.; Das Sarma, A.


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.