Article Information

  • Title: Research on Deep Web Query Interface Clustering Based on Hadoop
  • Authors: Qiang, Baohua; Zhang, Rui; Wang, Yufeng
  • Journal: Journal of Software
  • Print ISSN: 1796-217X
  • Year: 2014
  • Volume: 9
  • Issue: 12
  • Pages: 3057-3062
  • DOI: 10.4304/jsw.9.12.3057-3062
  • Language: English
  • Publisher: Academy Publisher
  • Abstract: Clustering query interfaces effectively is one of the core problems in generating an integrated query interface in the Deep Web integration domain. With the rapid development of Internet technology, however, the number of Deep Web query interfaces is growing explosively. As a result, traditional stand-alone approaches to Deep Web query interface clustering run into bottlenecks in both time and space complexity. After studying the Hadoop distributed platform and the MapReduce programming model, the authors design and implement a Deep Web query interface clustering algorithm on the Hadoop platform, in which the Vector Space Model (VSM) and Latent Semantic Analysis (LSA) are used to represent the "Query Interfaces-Attributes" relationships. The experimental results show that, by using the Hadoop architecture, the proposed algorithm achieves better scalability and speedup ratio.
  • Keywords: Hadoop; Map Reduce; Deep Web; LSA; Query Interface Clustering
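
The abstract describes representing "Query Interfaces-Attributes" relationships with a Vector Space Model computed in a MapReduce style. The following is a minimal single-process sketch of that idea only, not the authors' implementation: the toy data in `INTERFACES` and the helper names `map_phase`, `reduce_phase`, `build_vectors`, and `cosine` are hypothetical, no Hadoop API is used, and the LSA step (a truncated SVD of the resulting matrix) is deliberately left out.

```python
from collections import defaultdict
from math import sqrt

# Hypothetical toy data: query interface id -> attribute labels on its form.
INTERFACES = {
    "books_a": ["title", "author", "isbn"],
    "books_b": ["title", "author", "publisher"],
    "flights_a": ["origin", "destination", "date"],
}


def map_phase(interface_id, attributes):
    """Emit ((interface, attribute), 1) pairs, as a mapper would."""
    for attr in attributes:
        yield (interface_id, attr), 1


def reduce_phase(pairs):
    """Sum the counts for each key, as a reducer would."""
    counts = defaultdict(int)
    for key, value in pairs:
        counts[key] += value
    return counts


def build_vectors(interfaces):
    """Build one attribute-count vector per interface (the VSM rows)."""
    pairs = [
        pair
        for interface_id, attrs in interfaces.items()
        for pair in map_phase(interface_id, attrs)
    ]
    counts = reduce_phase(pairs)
    vocab = sorted({attr for _, attr in counts})
    vectors = {
        interface_id: [counts.get((interface_id, attr), 0) for attr in vocab]
        for interface_id in interfaces
    }
    return vectors, vocab


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


if __name__ == "__main__":
    vectors, vocab = build_vectors(INTERFACES)
    print(vocab)
    print(cosine(vectors["books_a"], vectors["books_b"]))
    print(cosine(vectors["books_a"], vectors["flights_a"]))
```

In this toy data the two book interfaces share two of their three attributes and so score higher than a book interface paired with the flight interface, which shares none. A clustering step would group interfaces by such similarities; on a real cluster the map and reduce functions would run as distributed tasks rather than in one process.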