Data protection regulations such as the General Data Protection Regulation (GDPR) in the European Union and the California Consumer Privacy Act (CCPA) in the US affect how software may handle the personal data of its users. Prior literature has focused on how data protection regulations are discussed for software in operation, or on how the topic is discussed in channels outside the software development process. What is missing is a perspective on the impact of such regulations on the software development process itself. We address this gap and explore how regulations affect discussions during software development, who reports and discusses issues related to personal data and data protection, and how developers react to those issues. To that end, we used inductive coding to analyze 652 issues from open-source GitHub projects, then used the resulting codes to quantitatively analyze the relations among roles, resolutions, and data protection issues, both to identify correlations and to predict issue resolutions. Most notably, we observed a significant increase in reporting when GDPR came into effect. The most common issue type was feature requests for privacy enhancement, which were mainly reported and discussed by frequent reporters and frequent committers, although one-time reporters also frequently reported privacy enhancement issues. Most of the requests were resolved without opposing votes. Overall, our findings indicate that data protection regulations effectively start discussions about privacy within the software development community.