
Basic article information

  • Title: A Novel Architecture of Agent based Crawling for OAI Resources
  • Authors: Shruti Sharma; J.P. Gupta
  • Journal: International Journal on Computer Science and Engineering
  • Print ISSN: 2229-5631
  • Electronic ISSN: 0975-3397
  • Publication year: 2010
  • Volume: 2
  • Issue: 4
  • Pages: 1190-1195
  • Publisher: Engg Journals Publications
  • Abstract: Most search engines compete to index as much of the Surface Web as possible while leaving OAI content (PDF documents) in the lurch, even though it holds far more information than the Surface Web. This paper proposes a novel framework for an OAI-PMH based crawler that uses agents to extract metadata about OAI resources and store it in a repository, which is later queried through the OAI-PMH layer to generate XML pages containing the metadata. These pages are then added to the search engine's repository for indexing, which in turn increases the relevancy of the search engine. Agents are used to parallelize the whole process so that metadata can be extracted from multiple resources simultaneously.
  • Keywords: OAI-PMH; Agents; Surface Web; Hidden Web
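
The pipeline outlined in the abstract (parallel agents extract metadata, a repository stores it, and an OAI-PMH layer renders it as XML) can be sketched roughly as below. This is an illustrative sketch only, not the authors' implementation: the `RESOURCES` table, the `extract_metadata`, `crawl`, and `list_records_xml` names, and the `oai:example.org:*` identifiers are all hypothetical, threads stand in for agents, and the XML is a simplified ListRecords-style layout without the namespaces and envelope fields a real OAI-PMH response requires.

```python
from concurrent.futures import ThreadPoolExecutor
import xml.etree.ElementTree as ET

# Hypothetical stand-in for OAI resources; a real agent would fetch and
# parse each document rather than read from an in-memory table.
RESOURCES = {
    "oai:example.org:1": {"title": "Paper A", "creator": "Author A"},
    "oai:example.org:2": {"title": "Paper B", "creator": "Author B"},
}


def extract_metadata(identifier):
    """Agent task (assumed): return (identifier, metadata) for one resource."""
    return identifier, dict(RESOURCES[identifier])


def crawl(identifiers, workers=4):
    """Run the agent tasks in parallel and collect results in a repository dict."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(extract_metadata, identifiers))


def list_records_xml(repository):
    """Render the repository as a simplified ListRecords-style XML string."""
    root = ET.Element("OAI-PMH")
    list_records = ET.SubElement(root, "ListRecords")
    for identifier in sorted(repository):
        record = ET.SubElement(list_records, "record")
        header = ET.SubElement(record, "header")
        ET.SubElement(header, "identifier").text = identifier
        metadata = ET.SubElement(record, "metadata")
        for field, value in sorted(repository[identifier].items()):
            ET.SubElement(metadata, field).text = value
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    repository = crawl(list(RESOURCES))
    print(list_records_xml(repository))
```

Calling `crawl` with a list of identifiers returns a dictionary keyed by identifier, and `list_records_xml` turns that dictionary into a page a search engine could index. A thread pool is used here only because it is the simplest way to show simultaneous extraction from multiple resources.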