Article Information

  • Title: Supporting Data Provenance in Data-Intensive Scalable Computing Systems
  • Authors: Matteo Interlandi; Tyson Condie
  • Journal: Bulletin of the Technical Committee on Data Engineering
  • Year: 2018
  • Volume: 41
  • Issue: 1
  • Page: 63
  • Publisher: IEEE Computer Society
  • Abstract: Debugging data processing logic in Data-Intensive Scalable Computing (DISC) systems is a difficult and time-consuming effort. Data provenance support is a key building block in libraries that aim to provide debugging support for data processing pipelines. In this paper we report our experience in building Titian: a data provenance system targeting the Apache Spark framework. Our focus here is to analyze the design choices and trade-offs that we and others made. Ultimately, we believe there is still more work to do before reaching a widespread adoption of data provenance outside the research community.