Article Information

  • Title: Sentence Alignment using MR and GA
  • Author: Mohamed Abdel Fattah
  • Journal: Computer Engineering and Intelligent Systems
  • Print ISSN: 2222-1727
  • Online ISSN: 2222-2863
  • Year: 2011
  • Volume: 2
  • Issue: 2
  • Pages: 17-28
  • Language: English
  • Publisher: International Institute for Science, Technology and Education
  • Abstract: This paper presents two new approaches to aligning English-Arabic sentences in bilingual parallel corpora, based on mathematical regression (MR) and genetic algorithm (GA) classifiers. A feature vector is extracted from the text pair under consideration. This vector contains text features such as length, punctuation score, and cognate score values. One set of manually prepared data was used to train the mathematical regression and genetic algorithm models, and another set was used for testing. The MR and GA results outperform those of the length-based approach. Moreover, these new approaches are valid for any language pair and are quite flexible, since the feature vector may contain more, fewer, or different features than those used in the current research, such as a lexical matching feature or Hanzi characters in Japanese-Chinese texts.
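
The feature vector described in the abstract can be illustrated with a minimal Python sketch. The abstract names the three features (length, punctuation score, cognate score) but does not define them, so the formulas below are assumptions chosen for illustration, not the paper's definitions. The function names, the punctuation set, and the weights passed to `alignment_score` are likewise hypothetical; in an MR-style setup the weights would be fitted on the manually aligned training data rather than set by hand.

```python
import string

# Assumed punctuation set: ASCII punctuation plus a few common Arabic marks.
PUNCT_CHARS = string.punctuation + "،؛؟"


def _ratio(a, b):
    """Ratio of the smaller count to the larger one; 1.0 when both are zero."""
    if max(a, b) == 0:
        return 1.0
    return min(a, b) / max(a, b)


def length_score(src, tgt):
    """Assumed length feature: ratio of character lengths."""
    return _ratio(len(src), len(tgt))


def punctuation_score(src, tgt):
    """Assumed punctuation feature: ratio of punctuation-mark counts."""
    src_count = sum(ch in PUNCT_CHARS for ch in src)
    tgt_count = sum(ch in PUNCT_CHARS for ch in tgt)
    return _ratio(src_count, tgt_count)


def _tokens(text):
    """Lowercased whitespace tokens with surrounding punctuation stripped."""
    stripped = {t.strip(PUNCT_CHARS).lower() for t in text.split()}
    stripped.discard("")
    return stripped


def cognate_score(src, tgt):
    """Assumed cognate feature: share of tokens (numbers, untranslated
    names, and so on) that appear identically in both sentences."""
    src_tokens, tgt_tokens = _tokens(src), _tokens(tgt)
    if not src_tokens or not tgt_tokens:
        return 0.0
    shared = src_tokens & tgt_tokens
    return len(shared) / max(len(src_tokens), len(tgt_tokens))


def feature_vector(src, tgt):
    """Collect the three features for one candidate sentence pair."""
    return [
        length_score(src, tgt),
        punctuation_score(src, tgt),
        cognate_score(src, tgt),
    ]


def alignment_score(features, weights, bias=0.0):
    """Linear combination of features; the weights are placeholders for
    values that a regression fit on training data would supply."""
    return bias + sum(w * f for w, f in zip(weights, features))


if __name__ == "__main__":
    pair = ("Flight 370 left at noon.", "غادرت الرحلة 370 ظهرا.")
    features = feature_vector(*pair)
    print(features, alignment_score(features, weights=[0.5, 0.25, 0.25]))
```

Because each feature is computed independently, adding or swapping features (for example a lexical matching score) only changes `feature_vector` and the length of the weight list, which mirrors the flexibility the abstract claims.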