Background: The "Technical Debt Dataset" (TDD) is a comprehensive dataset on technical debt (TD) in the main branches of more than 30 Java projects. However, some TD items produced by SonarQube are missing for many commits, for instance because those commits failed to compile. This has limited previous studies that used the dataset. Aims and Method: In this paper, we provide an extension to the dataset that includes a Teamscale analysis of 278,320 commits across all branches of a superset of 37 projects. We then demonstrate the utility of the extended dataset by replicating a prior study on the relationship between developer personality and TD. Results: The new dataset allows us to use a larger sample than prior work could: we analyze the personality of 111 developers and 5,497 of their commits. The relationships we find between developer personality and the introduction and removal of TD differ from those found in prior work. Conclusions: We offer a dataset that may enable future studies of TD, and we provide additional insights into how developer personality relates to TD.