In December 2020, Apple began requiring app developers to self-report privacy label annotations on their apps, indicating what data is collected and how it is used. To study the adoption of, and shifts in, privacy labels in the App Store, we collected nearly weekly snapshots of over 1.6 million apps for over a year (July 15, 2021 -- October 25, 2022) and used them to characterize the dynamics of the privacy label ecosystem. Nearly two years after privacy labels launched, only 70.1% of apps have privacy labels, although we observed an increase of 28% during the measurement period. Privacy label adoption is driven mostly by new apps rather than by older apps coming into compliance. Of apps with labels, 18.1% collect data used to track users, 38.1% collect data that is linked to a user identity, and 42.0% collect data that is not linked. A surprisingly large share (41.8%) of apps with labels indicate that they do not collect any data. While we do not analyze the apps directly to verify this claim, it is likely that many of these apps choose a Does Not Collect label because they are forced to select a label, rather than because it reflects the true behavior of the app. Moreover, of the apps that were assigned labels during the measurement period, nearly all never change their labels, and when they do, the new labels indicate more data collection rather than less. This suggests that privacy labels may be a ``set once'' mechanism for developers, one that may not actually provide users with the clarity needed to make informed privacy decisions.