Abstract: Information access is one of the most active topics in the information society and has become even more important since the advent of the Web, yet general-purpose Web search engines still cannot reliably find correct and timely information for individual users. In this paper, we propose PeerBridge, a peer-to-peer (P2P) based decentralized focused Web crawling system that provides a user-centered, content-sensitive, and personalized information search service for the Web. PeerBridge builds on our previous work, WebBridge, a focused crawling system that crawls the Web according to several specified topics. The most important function of PeerBridge is to identify interesting information, so we also present in detail an efficient personalized information filter that combines several component neural networks to accomplish the filtering task. Experimental evaluation shows that PeerBridge is effective at crawling relevant information for specific topics and that the information filter is efficient, achieving better precision than a support vector machine, a naïve Bayes classifier, and an individual neural network.
Keywords: PeerBridge; Web crawling system; P2P-based; artificial neural network