Basic Article Information

Title: Web Crawler: Extracting the Web Data
Authors: Mini Singh Ahuja; Dr Jatinder Singh Bal; Varnica et al.
Journal: International Journal of Computer Trends and Technology
Electronic ISSN: 2231-2803
Year of publication: 2014
Volume: 13
Issue: 3
Pages: 132-137
DOI: 10.14445/22312803/IJCTT-V13P128
Publisher: Seventh Sense Research Group
Abstract: Internet usage has increased considerably in recent times. Users find resources by following hypertext links. This growth in Internet usage has led to the invention of web crawlers. Web crawlers are full-text search engines that assist users in navigating the web. They can also be used in further research activities; for example, crawled data can be used to find missing links and to detect communities in complex networks. In this paper we review web crawlers: their architecture, their types, and the various challenges faced when search engines use them.
Keywords: web crawler; blind traversal algorithms; best-first heuristic algorithms; etc.