Data protection legislation such as the European Union's General Data Protection Regulation (GDPR) establishes the \textit{right to be forgotten}: a user (client) can request that contributions made using their data be removed from learned models. In this paper, we study how to remove the contributions made by a client participating in a Federated Online Learning to Rank (FOLTR) system. In a FOLTR system, a ranker is learned by aggregating local updates into a global ranking model. Local updates are learned online at the client level, using the queries and implicit interactions that have occurred within that specific client. As a result, each client's local data is not shared with other clients or with a centralised search service, while clients still benefit from an effective global ranking model learned from the contributions of every client in the federation. We study an effective and efficient unlearning method that can remove a client's contribution without compromising overall ranker effectiveness and without retraining the global ranker from scratch. A key challenge is how to measure whether the model has unlearned the contributions of the client $c^*$ that has requested removal. To address this, we instruct $c^*$ to perform a poisoning attack (adding noise to this client's updates) and then measure whether the impact of the attack is lessened once the unlearning process has taken place. Through experiments on four datasets, we demonstrate the effectiveness and efficiency of the unlearning strategy under different combinations of parameter settings.
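The evaluation idea described above can be illustrated with a toy sketch. Everything in it is an assumption made for illustration: the helper names (`average`, `poison`, `l2_distance`), the equally weighted averaging of client updates, the Gaussian noise model, and the use of distance to a hypothetical target weight vector as the "impact" of the attack. In particular, the removal step is naive re-aggregation without $c^*$, which stands in for the unlearning method studied in the paper and is not that method.

```python
import random


def average(updates):
    """Element-wise mean of equally weighted client updates (assumed aggregation)."""
    n = len(updates)
    return [sum(vals) / n for vals in zip(*updates)]


def poison(update, noise_std, rng):
    """Add zero-mean Gaussian noise to every coordinate of an update."""
    return [w + rng.gauss(0.0, noise_std) for w in update]


def l2_distance(a, b):
    """Euclidean distance between two weight vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


rng = random.Random(0)
dim, num_clients = 5, 4
target = [1.0] * dim  # hypothetical "ideal" ranker weights, for illustration only

# Honest clients send updates that are small perturbations of the target.
honest = [[t + rng.gauss(0.0, 0.01) for t in target] for _ in range(num_clients)]

# Client c* (index 0 here) poisons its own update before sending it.
c_star = 0
received = list(honest)
received[c_star] = poison(honest[c_star], noise_std=5.0, rng=rng)

# Global model with the poisoned contribution included.
global_with_attack = average(received)

# Stand-in for unlearning: re-aggregate without c*'s contribution.
global_after_removal = average(
    [u for i, u in enumerate(received) if i != c_star]
)

impact_before = l2_distance(global_with_attack, target)
impact_after = l2_distance(global_after_removal, target)
print("attack impact before removal:", impact_before)
print("attack impact after removal: ", impact_after)
```

If removal succeeds, the impact measured after removal should be noticeably smaller than the impact measured before it. An actual FOLTR evaluation would measure impact through ranking effectiveness rather than distance to a known weight vector.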