Compliance with the General Data Protection Regulation (GDPR) is a crucial aspect of software development. Because compliance checking is time-consuming and requires specialized knowledge, it is often deferred or delegated to specialized code reviewers. These reviewers, particularly when external to the development organization, may lack detailed knowledge of the software under review and therefore need to prioritize where they spend their effort. To help them prioritize work related to personal data, we designed two specialized views of a codebase: one shows the types of personal data representation, and the other gives an abstract depiction of personal data processing, with optional detailed exploration of specific code snippets. Our method uses static analysis to identify code segments related to personal data, which speeds up the review process. Evaluated on four open-source GitHub applications, the approach achieved a precision of 0.87 in identifying personal data flows. We additionally fact-checked the privacy statements of 15 Android applications. The solution is intended to make GDPR-related privacy analysis tasks, such as compiling the Record of Processing Activities (ROPA), more efficient and thereby save code reviewers time.
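To make the idea of statically identifying personal data-related code concrete, the following is a minimal sketch and not the method evaluated above. It assumes a naive heuristic, matching identifier names against a hypothetical keyword list (`PERSONAL_DATA_HINTS`), and uses Python's standard `ast` module. The function names, the keyword list, and the sample snippet are illustrative assumptions. It also shows how a precision figure such as the one reported is computed from true and false positives.

```python
import ast

# Hypothetical keyword list (an assumption for illustration only);
# the actual detection rules of the approach are not specified here.
PERSONAL_DATA_HINTS = ("email", "phone", "address", "birth")


def find_personal_data_identifiers(source: str) -> list[tuple[int, str]]:
    """Return sorted (line, identifier) pairs whose names hint at personal data."""
    hits = set()
    for node in ast.walk(ast.parse(source)):
        # Collect variable names, attribute names, and function parameters.
        if isinstance(node, ast.Name):
            ident = node.id
        elif isinstance(node, ast.Attribute):
            ident = node.attr
        elif isinstance(node, ast.arg):
            ident = node.arg
        else:
            continue
        if any(hint in ident.lower() for hint in PERSONAL_DATA_HINTS):
            hits.add((node.lineno, ident))
    return sorted(hits)


def precision(true_positives: int, false_positives: int) -> float:
    """Precision = TP / (TP + FP); defined as 0.0 when nothing was reported."""
    reported = true_positives + false_positives
    return true_positives / reported if reported else 0.0


# Illustrative snippet to analyze (not taken from any evaluated application).
SAMPLE = '''
def register(user_email, age):
    record = {"contact": user_email}
    counter = 0
    return record
'''

if __name__ == "__main__":
    for line, ident in find_personal_data_identifiers(SAMPLE):
        print(f"line {line}: {ident}")
```

A reviewer-facing tool would go further than name matching, for example by following how flagged values flow through calls and storage, but even this simple pass shows how static analysis can narrow a review to the lines that touch personal data.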