Interest-disclosing Mechanisms for Advertising are Privacy-Exposing (not Preserving)

Today, targeted online advertising relies on unique identifiers assigned to users through third-party cookies, a practice at odds with user privacy. While the web and advertising communities have proposed interest-disclosing mechanisms, including Google's Topics API, as solutions, no independent analysis of these proposals in realistic scenarios has yet been performed. In this paper, we attempt to validate the privacy (i.e., preventing unique identification) and utility (i.e., enabling ad targeting) claims of Google's Topics proposal in the context of realistic user behavior. Using new statistical models of the distribution of user behaviors and the resulting targeting topics, we analyze the capabilities of malicious advertisers who observe users over time and collude with other third parties. Our analysis shows that even in the best case, identifying individual users across sites is possible: 0.4% of the 250k users we simulate are re-identified. These guarantees weaken further over time and when advertisers collude: 57% of users are uniquely re-identified after 15 weeks of browsing, increasing to 75% after 30 weeks. Although we measure that the Topics API provides moderate utility, we also find that advertisers and publishers can abuse it to potentially assign unique identifiers to users, defeating the desired privacy guarantees. As a result, the inherent diversity of users' interests on the web is directly at odds with the privacy objectives of interest-disclosing mechanisms; we discuss how any replacement for third-party cookies may have to seek other avenues to achieve privacy for the web.
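
To make the intuition behind re-identification over time concrete, the following is a minimal toy sketch and not the statistical model used in this paper. It assumes each simulated user has a fixed set of interests, that an observer learns one of those interests per week, and that a user counts as "unique" once no other user shares the same set of observed topics. The function name and every parameter value (`n_users`, `n_topics`, `top_k`, `weeks`) are hypothetical and chosen only for illustration.

```python
import random
from collections import Counter


def simulate_unique_fraction(n_users=2000, n_topics=350, top_k=5, weeks=10, seed=0):
    """Toy model: fraction of users with a unique observed-topic set, per week.

    All parameters are illustrative assumptions, not values from the paper.
    """
    rng = random.Random(seed)

    # Assumption: each user has a fixed set of top_k interests, drawn uniformly.
    profiles = [rng.sample(range(n_topics), top_k) for _ in range(n_users)]

    # Topics an observer has accumulated for each user so far.
    observed = [set() for _ in range(n_users)]

    fractions = []
    for _ in range(weeks):
        # Each week the observer learns one topic from each user's interests.
        for profile, seen in zip(profiles, observed):
            seen.add(rng.choice(profile))

        # A user is "unique" if nobody else has the same observed-topic set.
        counts = Counter(frozenset(seen) for seen in observed)
        unique = sum(1 for seen in observed if counts[frozenset(seen)] == 1)
        fractions.append(unique / n_users)

    return fractions


if __name__ == "__main__":
    for week, frac in enumerate(simulate_unique_fraction(), start=1):
        print(f"week {week}: {frac:.1%} of simulated users have a unique topic set")
```

Because real interests are far from uniformly distributed, the numbers this sketch prints should not be compared with the results reported above; it only illustrates why accumulating observations over time tends to shrink the set of users who look alike.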