The anonymity of both natural and legal persons in court rulings is a critical aspect of privacy protection in the European Union and Switzerland. With the advent of large language models (LLMs), concerns about large-scale re-identification of anonymized persons are growing. In accordance with the Federal Supreme Court of Switzerland, we explore the potential of LLMs to re-identify individuals in court rulings by constructing a proof of concept using actual legal data from the Swiss Federal Supreme Court. Following this initial experiment, we constructed an anonymized Wikipedia dataset as a more rigorous testing ground to investigate the findings further. Because re-identifying people in texts is a new task, we also introduce new metrics to measure performance on it. We systematically analyze the factors that influence successful re-identification and find model size, input length, and instruction tuning to be among the most critical determinants. Despite high re-identification rates on Wikipedia, even the best LLMs struggled with court decisions. We attribute this difficulty to the lack of test datasets, the need for substantial training resources, and the sparsity of the information usable for re-identification. In conclusion, this study suggests that re-identification of court decisions using LLMs may not be feasible for now, but, as the proof of concept on Wikipedia showed, it might become possible in the future. We hope that our system can help increase confidence in the security of anonymized decisions and thereby make courts more willing to publish them.