In this paper, we discuss the strategic issues of exploring, harvesting, and integrating the Deep Web. We then develop a novel interdisciplinary stochastic model for a Deep Web search engine that can detect and rank content optimally. Our efforts aim at opening up the Deep Web to users by building a system called the Generator. On this grand information voyage, the Generator will address the challenges of exploring, harvesting, and integrating the Deep Web. First, to make the Web systematically accessible, the Generator will focus on the discovery, modeling, and structuring of databases on the Web in order to develop a search engine that helps users find sources useful for their information needs. Second, to make the Web uniformly usable, the Generator will help users make an optimal choice of keywords. Based on these insights, we design a stochastic model and employ an interdisciplinary approach consisting of stochastic and optimization techniques. In addition, we consider three types of keywords: text-based, image-based, and hybrid. Experimental and simulation results are given for illustration.
Keywords: Deep Web, Generator, search engine, stochastic model, interdisciplinary approach