
Article Information

  • Title: Performance Analysis for Mining Images of Deep Web
  • Alternative title: Performance Analysis for Mining Images of Deep Web
  • Authors: Ily Amalina Ahmad Sabri; Mustafa Man
  • Journal: International Journal of Advanced Computer Science and Applications (IJACSA)
  • Print ISSN: 2158-107X
  • Online ISSN: 2156-5570
  • Publication year: 2020
  • Volume: 11
  • Issue: 10
  • DOI: 10.14569/IJACSA.2020.0111001
  • Publisher: Science and Information Society (SAI)
  • Abstract: This paper considers advancing web-scale knowledge extraction and alignment by integrating a few sources, exploring different methods of aggregation and attention in order to focus on image information. An improved model, Wrapper Extraction of Image using DOM and JSON (WEIDJ), is proposed to extract images and their related information in the fastest way. Several models are discussed: Document Object Model (DOM), Wrapper using Hybrid DOM and JSON (WHDJ), WEIDJ, and WEIDJ (no-rules). Experimental results on real-world websites demonstrate that the proposed models outperform the others, namely DOM and WHDJ, in mining a higher volume of web data across various image formats, while taking into account web data extraction from the deep web.
  • Keywords: Data extraction; Document Object Model; web data extraction; Wrapper using Hybrid DOM and JSON; Wrapper Extraction of Image using DOM and JSON