
Article Information

  • Title: Measuring Semantic Similarity between Words Using Page Counts and Snippets
  • Authors: Manasa.Ch; V.Ramana; S.P. Ananda Raj
  • Journal: International Journal of Computer Science and Communication Networks
  • Electronic ISSN: 2249-5789
  • Publication year: 2012
  • Volume: 2
  • Issue: 4
  • Pages: 553-558
  • Publisher: Technopark Publications
  • Abstract: Web mining involves tasks such as document clustering and community mining performed on the web. Such tasks require measuring the semantic similarity between words, which makes web mining easier in many applications. However, accurately measuring the semantic similarity between any two words is a difficult task. This paper proposes a new approach to measuring similarity between words, based on text snippets and page counts, both taken from the results of a search engine such as Google. Lexical patterns are extracted from the text snippets, and word co-occurrence measures are defined using the page counts; the results of the two are then combined. In addition, pattern extraction and pattern clustering algorithms are proposed to identify the various relationships between any two given words. Support Vector Machines, a data mining technique, are used to optimize the results. The empirical results indicate that the proposed techniques yield results that compare well with human ratings and improve accuracy in web mining activities.
  • Keywords: Text snippets; word count; semantic similarity; web mining; lexical patterns
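
The abstract describes two signals: co-occurrence measures defined from search-engine page counts, and lexical patterns extracted from text snippets. The record does not give the paper's exact formulas or algorithms, so the following is only a minimal sketch under stated assumptions. It uses the standard Jaccard, Dice, and pointwise mutual information definitions as stand-ins for the co-occurrence measures, and a simple "words between the two query terms" rule as a stand-in for pattern extraction. All function names, the page counts, and the sample snippet are hypothetical; no real search-engine API is called.

```python
import math
import re


def web_jaccard(count_p, count_q, count_pq):
    """Jaccard coefficient from page counts (assumed measure, not taken from the paper)."""
    union = count_p + count_q - count_pq
    if count_pq == 0 or union <= 0:
        return 0.0
    return count_pq / union


def web_dice(count_p, count_q, count_pq):
    """Dice coefficient from page counts (assumed measure)."""
    total = count_p + count_q
    if total == 0:
        return 0.0
    return 2 * count_pq / total


def web_pmi(count_p, count_q, count_pq, total_pages):
    """Pointwise mutual information from page counts (assumed measure).

    total_pages is a hypothetical estimate of the number of indexed pages.
    """
    if min(count_p, count_q, count_pq) == 0:
        return 0.0
    joint = count_pq / total_pages
    independent = (count_p / total_pages) * (count_q / total_pages)
    return math.log2(joint / independent)


def extract_pattern(snippet, word_a, word_b):
    """Return the token span linking the two words, with the words replaced by X and Y.

    Illustrative rule only: lowercase the snippet, tokenize on word characters,
    and keep everything from the first X or Y to the first occurrence of the other.
    Returns None if either word is missing from the snippet.
    """
    tokens = re.findall(r"\w+", snippet.lower())
    marked = [
        "X" if t == word_a.lower() else "Y" if t == word_b.lower() else t
        for t in tokens
    ]
    if "X" not in marked or "Y" not in marked:
        return None
    start, end = sorted((marked.index("X"), marked.index("Y")))
    return " ".join(marked[start:end + 1])


# Hypothetical page counts for two words P and Q and for the query "P AND Q".
jaccard = web_jaccard(100, 50, 25)
dice = web_dice(100, 50, 25)
pmi = web_pmi(100, 50, 25, 1000)
pattern = extract_pattern("An ostrich is a large bird", "ostrich", "bird")
```

In a pipeline of the kind the abstract outlines, scores such as these and the frequencies of extracted patterns would be assembled into a feature vector for each word pair and passed to a classifier such as a Support Vector Machine; that step is omitted here because the record gives no details about it.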