Article Information

  • Title: Persian Text Classification Enhancement by Latent Semantic Space
  • Authors: Mohammad Bagher Dastgheib ; Sara Koleini
  • Journal: INTERNATIONAL JOURNAL OF INFORMATION SCIENCE AND MANAGEMENT
  • Print ISSN: 2008-8302
  • Electronic ISSN: 2008-8310
  • Publication year: 2019
  • Volume: 17
  • Issue: 1
  • Language: English
  • Publisher: REGIONAL INFORMATION CENTER FOR SCIENCE AND TECHNOLOGY
  • Abstract: Heterogeneous data of all kinds are growing on the web. Because web search results contain many data types, the results are commonly classified so that users can find the data they want. Many machine learning methods are used to classify textual data. The main challenges in data classification are the cost of the classifier and the performance of the classification. A traditional model for text representation in information retrieval (IR) is the vector space model (VSM). In this representation, the cost of computation depends on the dimension of the vectors. Another problem is selecting effective features and pruning unwanted terms. Latent semantic indexing (LSI) is used to transform the VSM into an orthogonal semantic space that takes relations between terms into account. Experimental results showed that the LSI semantic space achieves better computation time and classification accuracy. This result indicates that the semantic topic space contains less noise, which increases accuracy. The lower vector dimension also reduces computational complexity.
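
The abstract gives no implementation details, so the following is only a minimal sketch of the general idea it describes: documents are represented in a vector space model and then mapped into a lower-dimensional latent semantic space. It assumes that LSI is realized as a truncated singular value decomposition of a raw term-document count matrix. The toy English corpus, the value of k, and the helper names to_latent and cosine are hypothetical and do not come from the paper.

```python
import numpy as np

# Toy corpus (hypothetical; not the data set used in the paper).
docs = [
    "football match goal team",        # sports
    "team goal league football",       # sports
    "election vote parliament party",  # politics
    "party vote election minister",    # politics
]

# Vector space model: one row per term, one column per document.
vocab = sorted({w for d in docs for w in d.split()})
index = {w: i for i, w in enumerate(vocab)}
A = np.zeros((len(vocab), len(docs)))
for j, d in enumerate(docs):
    for w in d.split():
        A[index[w], j] += 1.0

# Truncated SVD: keep only the k largest singular values (assumed k = 2).
k = 2
U, s, Vt = np.linalg.svd(A, full_matrices=False)
U_k, s_k = U[:, :k], s[:k]


def to_latent(term_vector):
    """Fold a term-space vector into the k-dimensional latent space."""
    return (U_k.T @ term_vector) / s_k


def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


# Each document is now a k-dimensional vector instead of a |vocab|-dimensional one.
doc_latent = np.array([to_latent(A[:, j]) for j in range(A.shape[1])])

# Fold in an unseen query and compare it with the documents in latent space.
query = np.zeros(len(vocab))
for w in "vote party".split():
    query[index[w]] += 1.0
query_latent = to_latent(query)

print("term-space dimension:", A.shape[0], "latent dimension:", k)
for j, d in enumerate(docs):
    print(f"{cosine(query_latent, doc_latent[j]):+.3f}  {d}")
```

In this sketch a classifier would operate on the rows of doc_latent rather than on the columns of A, so each distance computation touches k values instead of one value per vocabulary term. That is the kind of dimension reduction the abstract refers to, although the classifier and weighting scheme actually used in the paper are not stated here.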