Ensuring that users understand the privacy policies that affect them is a key challenge for real-world GDPR deployment. Research studies are mostly carried out in English, but in Europe and elsewhere many users speak a language other than English. Replicating studies in different languages requires comparable cross-language corpora of privacy policies. This work provides a methodology for building comparable cross-language corpora in a national language and a reference study language. As an application example, we apply our methodology to English and Italian, extending the corpus of one of the first studies on users' understanding of technical terms in privacy policies. We also investigate other open issues that can make replication harder.