Basic Article Information

Title: URL Ordering based Performance Evaluation of Web Crawler
Authors: Mohd Adil Siddiqui; Sudheer Kumar Singh
Journal: International Journal of Computer and Information Technology
Print ISSN: 2279-0764
Publication year: 2015
Volume: 4
Issue: 1
Publisher: International Journal of Computer and Information Technology
Abstract: The World Wide Web contains billions of Web pages that can be accessed via the Internet, and all of us rely on the Internet as a source of information. This information is available on the Web in various forms, such as websites, databases, images, sound, and video. The results returned by a search engine are classified using techniques such as keyword matching and link analysis. Search engines provide information gathered from their own indexed databases, which contain information downloaded from Web pages; whenever a user submits a query, the information is fetched from these indexed pages. A Web crawler is used to download and store Web pages, and the crawlers of these search engines are designed to crawl large numbers of Web pages to gather this information. A Web crawler is developed that orders URLs on the basis of their content similarity to a query and their structural similarity. Results are reported over five parameters: Top URLs, Precision, Content Similarity, Structural Similarity, and Total Similarity for a keyword.
Keywords: Web Crawler; URL Ordering; Web Pages