Article information

  • Title: Database-Inspired Optimizations for Statistical Analysis
  • Authors: Hannes Mühleisen; Alexander Bertram; Maarten-Jan Kallen
  • Journal: Journal of Statistical Software
  • Print ISSN: 1548-7660
  • Electronic ISSN: 1548-7660
  • Publication year: 2018
  • Volume: 87
  • Issue: 4
  • Pages: 1-20
  • DOI: 10.18637/jss.v087.i04
  • Language: English
  • Publisher: University of California, Los Angeles
  • Abstract: Computing complex statistics on large amounts of data is no longer a corner case, but a daily challenge. However, current tools such as GNU R were not built to efficiently handle large data sets. We propose to vastly improve the execution of R scripts by interpreting them as a declaration of intent rather than an imperative order set in stone. This allows us to apply optimization techniques from the columnar data management research field. We have implemented several of these optimizers in Renjin, an open-source execution environment for R scripts targeted at the Java virtual machine. The demonstration of our approach using a series of micro-benchmarks and experiments on complex survey analysis shows orders-of-magnitude improvements in analysis cost.
  • Other keywords: automatic optimization