AdFL: In-Browser Federated Learning for Online Advertisement

As most countries introduce online privacy regulations, such as the GDPR in the EU, online publishers must balance revenue from targeted advertising against user privacy. One way to keep showing targeted ads based on users' personal and behavioral information is Federated Learning (FL), which performs distributed learning across users without sharing raw user data with other stakeholders in the publishing ecosystem. This paper presents AdFL, an FL framework that runs in the browser to learn users' ad preferences. These preferences are aggregated into a global FL model, which is then used in the browser to show users more relevant ads. AdFL can work with any model that uses features available in the browser, such as ad viewability, ad click-through, user dwell time on pages, and page content. The AdFL server runs at the publisher and coordinates the learning process for users who browse pages on the publisher's website. The AdFL prototype does not require users to install any software, as it is built on standard APIs available in most modern browsers. We built a proof-of-concept model for ad viewability prediction that runs on top of AdFL. We tested AdFL and the model with two non-overlapping datasets from a website with 40K visitors per day. The experiments demonstrate that AdFL can capture the training information in the browser in a few milliseconds, show that ad viewability prediction achieves up to 92.59% AUC, and indicate that using differential privacy (DP) to protect local model parameters yields adequate performance, with only modest declines compared with the non-DP variant.

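The abstract describes local models being protected with differential privacy and then aggregated into a global model, but it does not say which aggregation rule or DP mechanism AdFL uses. The sketch below is therefore only an illustration under stated assumptions: weighted federated averaging for aggregation, and L2 clipping plus Gaussian noise for privacy. All function names and the parameters `max_norm` and `noise_std` are hypothetical, and Python is used for brevity even though the actual prototype runs in the browser.

```python
import math
import random
from typing import List, Sequence


def clip_update(update: Sequence[float], max_norm: float) -> List[float]:
    """Scale a client update so that its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(x * x for x in update))
    scale = 1.0 if norm <= max_norm else max_norm / norm
    return [x * scale for x in update]


def privatize_update(
    update: Sequence[float],
    max_norm: float,
    noise_std: float,
    rng: random.Random,
) -> List[float]:
    """Clip the update, then add zero-mean Gaussian noise to each coordinate.

    Assumption: a Gaussian mechanism applied on the client before upload.
    """
    clipped = clip_update(update, max_norm)
    return [x + rng.gauss(0.0, noise_std) for x in clipped]


def federated_average(
    updates: Sequence[Sequence[float]],
    weights: Sequence[float],
) -> List[float]:
    """Weighted average of client updates (e.g. weighted by local sample count)."""
    if not updates or len(updates) != len(weights):
        raise ValueError("updates and weights must be non-empty and equal in length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    dim = len(updates[0])
    return [
        sum(w * u[i] for u, w in zip(updates, weights)) / total
        for i in range(dim)
    ]


def apply_round(
    global_model: Sequence[float],
    client_updates: Sequence[Sequence[float]],
    sample_counts: Sequence[float],
) -> List[float]:
    """One server-side round: add the averaged update to the global model."""
    avg = federated_average(client_updates, sample_counts)
    return [g + a for g, a in zip(global_model, avg)]
```

In this sketch, each browser would call `privatize_update` on its local update before sending it, and the publisher-side server would call `apply_round` once per round. The noise scale that yields a given privacy guarantee depends on the clipping norm and the privacy budget, which the abstract does not state, so `noise_std` is left as a free parameter here.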