Abstract: A data warehouse is a collection of data gathered and organized so that it can be easily analyzed, extracted, and synthesized to support further understanding of the data. Peer-to-peer (P2P) networks are used to distribute and share documents. In traditional techniques, an aggregate function such as average, sum, or count is evaluated over all nodes and all tuples, which reduces the efficiency of the query processing system. Exact solutions can be time-consuming and difficult to implement, given the distributed and dynamic nature of P2P networks. This project overcomes the problem by selecting random peers and random tuples from the P2P network and performing the aggregation on that sample, which increases speed and reduces latency. Although accuracy is compromised to a small extent, efficiency is gained. This kind of approximate query processing therefore benefits applications where efficiency matters more than accuracy. An adaptive hybrid approach based on random walks is used to perform the aggregation efficiently.
Keywords: Aggregate Function; Peer-to-Peer Networks; Distributed Databases; Distributed Database Query Processing; Gossiping
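
The sampling idea summarized in the abstract can be illustrated with a minimal Python sketch. It is not the adaptive hybrid algorithm itself, whose details are not given here. It assumes an undirected, connected overlay stored as an adjacency dictionary (`graph`), numeric tuples stored per peer (`data`), and a known total degree, which a real P2P system would have to estimate. All function and parameter names are hypothetical. Because a random walk visits a peer with probability proportional to its degree, each peer's contribution is divided by that probability (a Hansen-Hurwitz style correction) before averaging.

```python
import random


def random_walk_peers(graph, start, num_peers, jump, rng):
    """Walk the overlay and keep every `jump`-th visited peer as a sample."""
    sampled = []
    current = start
    while len(sampled) < num_peers:
        for _ in range(jump):
            current = rng.choice(graph[current])  # move to a random neighbor
        sampled.append(current)
    return sampled


def estimate_aggregates(graph, data, start, num_peers=50,
                        tuples_per_peer=20, jump=5, seed=0):
    """Approximate SUM, COUNT and AVG from sampled peers and sampled tuples."""
    rng = random.Random(seed)
    # Assumption: total degree is known. In a live network it must be estimated.
    total_degree = sum(len(neighbors) for neighbors in graph.values())
    sum_terms, count_terms = [], []
    for peer in random_walk_peers(graph, start, num_peers, jump, rng):
        rows = data[peer]
        k = min(tuples_per_peer, len(rows))
        sample = rng.sample(rows, k)
        # Scale the sampled tuples up to the peer's full local table.
        local_sum = sum(sample) * len(rows) / k if k else 0.0
        # Stationary visit probability of this peer under the random walk.
        prob = len(graph[peer]) / total_degree
        sum_terms.append(local_sum / prob)
        count_terms.append(len(rows) / prob)
    est_sum = sum(sum_terms) / len(sum_terms)
    est_count = sum(count_terms) / len(count_terms)
    est_avg = est_sum / est_count if est_count else 0.0
    return {"sum": est_sum, "count": est_count, "avg": est_avg}
```

Only `num_peers` peers and at most `tuples_per_peer` tuples per peer are read, so the cost no longer grows with the size of the network. The price is sampling error in the returned estimates, which is the accuracy-for-efficiency trade-off the abstract describes.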