Industry can get any research it wants, just by publishing a baseline result along with the data and scripts needed to reproduce that work. For instance, the paper ``Data Mining Static Code Attributes to Learn Defect Predictors'' presented such a baseline, using static code attributes from NASA projects. Those results were enthusiastically embraced by a software engineering research community hungry for data. At its peak (2016), this paper was SE's most cited paper (per month). By 2018, twenty percent of leading TSE papers (according to Google Scholar Metrics) incorporated artifacts introduced and disseminated by this research. This brief note reflects on what we should remember, and what we should forget, from that paper.