Developing tools and techniques to analyze privacy policies and extract organizations' data habits from them is critical for scalable regulatory compliance audits. Unfortunately, these tools are becoming increasingly limited in their ability to identify compliance issues and fixes, because most were developed using regulation-agnostic datasets of annotated privacy policies collected before the introduction of landmark privacy regulations such as the EU's GDPR and California's CCPA. In this paper, we describe C3PA (CCPA Privacy Policy Provision Annotations), the first open regulation-aware dataset of expert-annotated privacy policies, which aims to address this challenge. C3PA contains over 48K expert-labeled privacy policy text segments associated with responses to CCPA-specific disclosure mandates from 411 unique organizations. We demonstrate that the C3PA dataset is uniquely suited for aiding automated audits of compliance with CCPA-related disclosure mandates.