Article Information

  • Title: Focused Crawler Optimization Using Genetic Algorithm
  • Authors: Banu Wirawan Yohanes; Handoko Handoko; Hartanto Kusuma Wardana
  • Journal: TELKOMNIKA (Telecommunication Computing Electronics and Control)
  • Print ISSN: 2302-9293
  • Year: 2011
  • Volume: 9
  • Issue: 3
  • Pages: 403-410
  • DOI: 10.12928/telkomnika.v9i3.730
  • Language: English
  • Publisher: Universitas Ahmad Dahlan
  • Abstract: As the size of the Web continues to grow, searching it for useful information has become more difficult. A focused crawler intends to explore the Web in conformance with a specific topic. This paper discusses the problems caused by local searching algorithms: a crawler can be trapped within a limited Web community and overlook suitable Web pages outside its track. A genetic algorithm, as a global searching algorithm, is modified to address these problems. The genetic algorithm is used to optimize Web crawling and to select more suitable Web pages to be fetched by the crawler. Several evaluation experiments are conducted to examine the effectiveness of the approach. The crawler delivered collections consisting of 3396 Web pages from 5390 visited links, that is, a filtering rate of 63% under Roulette-Wheel selection and a precision level of 93% across 5 different categories. The results showed that the use of a genetic algorithm empowered the focused crawler to traverse the Web comprehensively, despite its relatively small collections. Furthermore, it showed great potential for building exemplary collections compared to traditional focused crawling methods.
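
The abstract names Roulette-Wheel selection and reports a 63% filtering rate but does not give the fitness function or the authors' modifications to the genetic algorithm. The following is only a minimal sketch of generic roulette-wheel (fitness-proportionate) selection over candidate URLs, assuming each URL already carries a non-negative relevance score. The names `roulette_wheel_select` and `filtering_rate` and the sample scores are hypothetical and are not taken from the paper.

```python
import random


def roulette_wheel_select(candidates, k, rng=None):
    """Pick up to k distinct URLs, each with probability proportional to fitness.

    candidates: dict mapping URL -> fitness (assumed: a topic-relevance score).
    This is the textbook scheme, not the paper's modified algorithm.
    """
    rng = rng or random.Random()
    # URLs with non-positive fitness get no slice of the wheel.
    pool = {url: f for url, f in candidates.items() if f > 0}
    chosen = []
    while pool and len(chosen) < k:
        total = sum(pool.values())
        spin = rng.uniform(0, total)  # where the wheel stops
        cumulative = 0.0
        for url, fitness in pool.items():
            cumulative += fitness
            if spin <= cumulative:
                break
        # If rounding left spin just above the final cumulative sum,
        # `url` is the last entry, which is the intended fallback.
        chosen.append(url)
        del pool[url]  # sample without replacement
    return chosen


def filtering_rate(pages_kept, links_visited):
    """Share of visited links whose pages ended up in the collection."""
    return pages_kept / links_visited


# Hypothetical frontier with made-up relevance scores.
frontier = {
    "http://example.org/a": 0.9,
    "http://example.org/b": 0.1,
    "http://example.org/c": 0.0,
}
selected = roulette_wheel_select(frontier, k=2, rng=random.Random(0))

# Figures reported in the abstract: 3396 pages from 5390 visited links.
reported_rate = filtering_rate(3396, 5390)
```

Because selection is probabilistic, low-scoring links still have a chance of being followed, which is one common way a global search avoids staying inside a single Web community. With the abstract's figures, `reported_rate` rounds to the stated 63%.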