Article Information

  • Title: A Comparative Survey on Arabic Stemming: Approaches and Challenges
  • Authors: Mohammad Mustafa; Afag Salah Eldeen; Sulieman Bani-Ahmad
  • Journal: Intelligent Information Management
  • Print ISSN: 2150-8194
  • Electronic ISSN: 2150-8208
  • Publication year: 2017
  • Volume: 09
  • Issue: 02
  • Pages: 39-67
  • DOI: 10.4236/iim.2017.92003
  • Language: English
  • Publisher: Scientific Research Publishing
  • Abstract: Arabic, as one of the Semitic languages, has a very rich and complex morphology, which is radically different from that of the European and East Asian languages. The derivational system of Arabic is therefore based on roots, which are often inflected to compose words using a spectacular and relatively large set of Arabic affix morphemes, e.g., antefixes, prefixes, and suffixes. Stemming is the process of rendering all the inflected forms of a word into a common canonical form. It is one of the early and major phases in natural language processing, machine translation, and information retrieval tasks. A number of Arabic stemmers have been proposed. Examples include light stemming, morphological analysis, statistical stemming, N-grams, and parallel corpora (collections). Motivated by the results reported in the literature, this paper attempts to exhaustively review current achievements in stemming Arabic texts. A variety of algorithms are discussed. The main contribution of the paper is to provide a better understanding of existing approaches, in the hope of building an error-free and effective Arabic stemmer in the near future.
  • Keywords: Arabic Language; Light Stemming; Root-Based Stemming; Co-Occurrence; Artificial Intelligence Stemming