Privacy policies are expected to inform data subjects about their data protection rights. They should explain the data controller's data management practices and make facts such as retention periods or data transfers to third parties transparent. Privacy policies fulfill their purpose only if they are correctly perceived, interpreted, understood, and trusted by the data subject. Among other things, this requires that a privacy policy is written in a fair way, e.g., that it does not use polarizing terms, does not require a certain level of education, and does not assume a particular social background. In this work-in-progress paper, we outline our approach to assessing fairness in privacy policies. To this end, we draw on fundamental legal sources and fairness research to identify how the dimensions of informational fairness, representational fairness, and ethics/morality relate to privacy policies. We propose options for automatically assessing policies along these fairness dimensions, based on text statistics, linguistic methods, and artificial intelligence. Finally, we conduct initial experiments with German privacy policies to provide evidence that our approach is applicable. Our experiments indicate that there are indeed issues in all three dimensions of fairness. For example, our approach detects whether a policy discriminates against individuals with impaired reading skills or from certain demographic groups, and identifies questionable ethics. This is important, as future privacy policies may be used in a corpus for legal artificial intelligence models.
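As a minimal sketch of the text-statistics route mentioned above, the following Python snippet scores the readability of a German policy excerpt using Amstad's German adaptation of the Flesch Reading Ease formula (FRE = 180 − ASL − 58.5 × ASW). The syllable counter is a crude vowel-group heuristic chosen for illustration; it is not part of the paper's method, and a production pipeline would use a proper hyphenation or NLP library.

```python
import re

def count_syllables_de(word: str) -> int:
    # Crude heuristic: count vowel groups (including umlauts).
    # German diphthongs such as "ei", "au", "eu" form one group here.
    groups = re.findall(r"[aeiouäöüy]+", word.lower())
    return max(1, len(groups))

def flesch_reading_ease_de(text: str) -> float:
    """Amstad's German Flesch Reading Ease:
    FRE = 180 - ASL - 58.5 * ASW,
    where ASL = average words per sentence and
    ASW = average syllables per word.
    Higher scores indicate easier text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-zÄÖÜäöüß]+", text)
    asl = len(words) / max(1, len(sentences))
    asw = sum(count_syllables_de(w) for w in words) / max(1, len(words))
    return 180.0 - asl - 58.5 * asw

# Hypothetical policy excerpt, for illustration only.
sample = ("Wir verarbeiten Ihre Daten. "
          "Die Speicherdauer beträgt zwei Jahre.")
score = flesch_reading_ease_de(sample)
```

A low score would flag a policy as potentially inaccessible to readers with impaired reading skills, which is one of the informational-fairness signals discussed above.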