Basic article information

  • Title: A systematic data collection procedure for software defect prediction
  • Authors: Mauša, Goran; Galinac-Grbac, Tihana; Dalbelo-Bašić, Bojana
  • Journal: Computer Science and Information Systems
  • Print ISSN: 1820-0214
  • Electronic ISSN: 2406-1018
  • Publication year: 2016
  • Pages: 61-61
  • DOI: 10.2298/CSIS141228061M
  • Publisher: ComSIS Consortium
  • Abstract: Software defect prediction research relies on data that must be collected from otherwise separate repositories. To achieve greater generalization of the results, standardized protocols for data collection and validation are necessary. This paper presents an exhaustive survey of techniques and approaches used in the data collection process. It identifies some of the issues that must be addressed to minimize dataset bias and provides a number of measures that can help researchers compare their data collection approaches and evaluate their data quality. Moreover, we present a data collection procedure that uses a bug-code linking technique based on regular expressions. A detailed comparison with a number of popular data collection approaches and their publicly available datasets, together with a root cause analysis of the inconsistencies, reveals that our procedure achieves the most favorable results. Finally, we implement our data collection procedure in a data collection tool we name the Bug-Code (BuCo) Analyzer.
  • Keywords: software defect prediction; data collection issues; dataset bias; bug-code linking; open-source projects
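
The abstract mentions a bug-code linking technique based on regular expressions but does not give the patterns themselves. As a rough illustration of the general idea only, the sketch below scans commit messages for bug identifiers and links each matched bug to the files changed in that commit. The pattern `BUG_ID_PATTERN`, the function `link_commits_to_bugs`, and the `(commit_id, message, changed_files)` tuple layout are all hypothetical choices made for this example; they are not taken from the paper or from the BuCo Analyzer.

```python
import re

# Hypothetical pattern (an assumption, not the paper's): a keyword such as
# "bug", "issue", "fix", "fixes" or "fixed", optionally followed by "#",
# then a numeric identifier.
BUG_ID_PATTERN = re.compile(
    r"\b(?:bug|issue|fix(?:es|ed)?)\s*#?\s*(\d+)\b", re.IGNORECASE
)


def link_commits_to_bugs(commits, known_bug_ids):
    """Map each known bug ID to the set of files changed by commits citing it.

    commits: iterable of (commit_id, message, changed_files) tuples.
    known_bug_ids: set of integer IDs exported from the bug tracker; matches
    outside this set are ignored to reduce false links.
    """
    links = {}
    for _commit_id, message, changed_files in commits:
        for match in BUG_ID_PATTERN.finditer(message):
            bug_id = int(match.group(1))
            if bug_id in known_bug_ids:
                links.setdefault(bug_id, set()).update(changed_files)
    return links


# Made-up sample data for illustration.
sample_commits = [
    ("c1", "Fix bug 101: NPE in parser", ["Parser.java"]),
    ("c2", "Refactor for 2016 release", ["Util.java"]),
    ("c3", "Bug #202 and bug 101 follow-up", ["Lexer.java"]),
    ("c4", "fixes 999", ["Other.java"]),
]
sample_links = link_commits_to_bugs(sample_commits, known_bug_ids={101, 202})
```

Checking matches against the set of IDs known to the bug tracker is one simple way to keep numbers such as years or version strings from producing spurious links; a real procedure would likely need project-specific patterns and further validation.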