Recent data protection regulations worldwide, such as the General Data Protection Regulation (GDPR), establish the Right of Access by the Data Subject (RADS), which grants users the right to access and obtain a copy of their personal data from data controllers. This right pushes data controllers to handle personal data more cautiously and is therefore important for protecting user privacy. However, no prior research has systematically examined whether RADS is effectively implemented in mobile apps, which are the most common personal data controllers. In this study, we propose a compliance measurement framework for RADS in apps. The framework has three steps. First, we analyze an app's privacy policy text using NLP techniques, including GPT-4, to verify whether it clearly declares that it offers RADS to users and specifies how the right can be exercised. Next, we assess the authenticity and usability of the identified methods by submitting data access requests to the app. Finally, we verify the completeness of each obtained data copy by comparing it with the personal data the app actually collects at runtime, as captured with Frida hooks. We analyzed a total of 1,631 apps from the American app market G and the Chinese app market H. The results show that fewer than 54.50% of apps in G and 37.05% of apps in H explicitly state in their privacy policies that they can provide users with copies of their personal data. In both markets, fewer than 20% of apps could actually provide users with a data copy. Among the data copies obtained, only about 2.94% from G passed the completeness verification.
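The final completeness step can be pictured as a set comparison between the data types observed at runtime and the data types present in the returned copy. The sketch below is only an illustration under stated assumptions: the abstract does not describe the actual comparison procedure, and the names `check_completeness` and `CompletenessResult`, the case-insensitive matching, and the rule that a copy passes only when nothing is missing are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CompletenessResult:
    missing: frozenset   # data types collected at runtime but absent from the copy
    coverage: float      # fraction of collected data types found in the copy
    complete: bool       # True only when nothing is missing (assumed pass rule)


def check_completeness(collected, data_copy):
    """Compare data types seen at runtime with those in the returned data copy.

    Both arguments are iterables of data-type labels (hypothetical labels such
    as "email" or "device_id"); labels are normalized by trimming and lowercasing.
    """
    collected_set = frozenset(t.strip().lower() for t in collected)
    copy_set = frozenset(t.strip().lower() for t in data_copy)
    missing = collected_set - copy_set
    if collected_set:
        coverage = len(collected_set & copy_set) / len(collected_set)
    else:
        # Nothing was observed at runtime, so there is nothing to miss.
        coverage = 1.0
    return CompletenessResult(missing=missing, coverage=coverage, complete=not missing)


# Hypothetical example: the app was observed collecting three data types,
# but the data copy it returned contains only two of them.
result = check_completeness(["Email", "device_id", "location"], ["email", "location"])
```

In this example `result.missing` contains `device_id`, so the copy would not pass under the assumed rule. A real measurement would also need to map hooked API calls and exported fields onto a shared vocabulary of data types, which this sketch takes as given.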