Journal title: International Journal of Advanced Research In Computer Science and Software Engineering
Print ISSN: 2277-6451
Electronic ISSN: 2277-128X
Publication year: 2013
Volume: 3
Issue: 6
Publisher: S.S. Mishra
Abstract: In this research paper we explore the developments that have occurred in building the crawlers that feed search engines. After a systematic literature review of algorithms related to information retrieval, we found that the results of most search engines became less relevant as the internet grew, and that developing an algorithm with high precision and recall values remains as much of a challenge as ever. Since all search engines obtain their data from crawlers, it is critical to improve how crawlers work. Because of the size of Big Data, generic crawlers are no longer practical in real-world use. There is therefore an urgent need to develop a domain-specific crawler built on the stock of existing algorithms such as LSI, so that they become relevant again. The paper proposes such a domain-specific crawler algorithm.