Publisher: University of Malaya * Faculty of Computer Science and Information Technology
Abstract: The innovation of AJAX resulted in more responsive, interactive and faster web applications through the clever amalgamation of JavaScript, HTML, and Cascading Style Sheets (CSS). However, although this is an achievement from the user’s perspective, it poses many challenges for web search engines. One major challenge is the complexity of crawling such web applications: multiple states are associated with a single uniform resource locator (URL), which causes a mismatch with the search model of web search engines, in which a web document is identified by a single unique URL with a single state. Crawling AJAX-based web applications means giving web search engines the capability to download and index the information produced in these highly interactive web applications. There is a need to investigate the technicalities of AJAX that shatter the web-page metaphor on which current web search engines rely during crawling, in order to improve the capabilities of those search engines. Although some academic tools have been developed, they produce false positives that greatly affect the performance of web search engines. We aim to investigate AJAX and AJAX-based web applications, as well as the state of the art in crawling these applications, along with some prominent issues, challenges and recommendations.
Keywords: AJAX; Crawling; Document Object Model (DOM); Information Retrieval