
Article information

  • Title: Efficient Word Alignment with Markov Chain Monte Carlo
  • Authors: Robert Östling; Jörg Tiedemann
  • Journal: The Prague Bulletin of Mathematical Linguistics
  • Print ISSN: 0032-6585
  • Online ISSN: 1804-0462
  • Publication year: 2016
  • Volume: 106
  • Issue: 1
  • Pages: 125-146
  • DOI: 10.1515/pralin-2016-0013
  • Language: English
  • Publisher: Walter de Gruyter GmbH
  • Abstract: We present EFMARAL, a new system for efficient and accurate word alignment using a Bayesian model with Markov Chain Monte Carlo (MCMC) inference. Through careful selection of data structures and model architecture we are able to surpass the fast_align system, commonly used for performance-critical word alignment, both in computational efficiency and alignment accuracy. Our evaluation shows that a phrase-based statistical machine translation (SMT) system produces translations of higher quality when using word alignments from EFMARAL than from fast_align, and that translation quality is on par with what is obtained using GIZA++, a tool requiring orders of magnitude more processing time. More generally we hope to convince the reader that Monte Carlo sampling, rather than being viewed as a slow method of last resort, should actually be the method of choice for the SMT practitioner and others interested in word alignment.