Federated online learning to rank (FOLTR) aims to preserve user privacy by sharing neither users' searchable data nor their search interactions, while still guaranteeing high search effectiveness, especially in contexts where individual users have scarce training data and interactions. To do so, FOLTR trains learning to rank models in an online manner, i.e. by exploiting users' interactions with the search system (queries, clicks) rather than labels, and in a federated manner, i.e. by training an instance of the model on each user's device on that user's own private data and then sharing the model updates, not the data, across the set of users that form the federation, instead of aggregating interaction data on a central server for training. Existing FOLTR methods build upon advances in federated learning. While federated learning methods have been shown to be effective at training machine learning models in a distributed way without the need for data sharing, they can be susceptible to attacks that target either the system's security or its overall effectiveness. In this paper, we consider attacks on FOLTR systems that aim to compromise their search effectiveness. Within this scope, we experiment with and analyse data and model poisoning attack methods to showcase their impact on FOLTR search effectiveness. We also explore the effectiveness of defense methods designed to counteract attacks on FOLTR systems. We contribute an understanding of the effect of attack and defense methods on FOLTR systems, and identify the key factors that influence their effectiveness.
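To make the setting concrete, the following is a minimal sketch of the federated step described above and of how a single poisoned model update can distort it. Everything in it is an illustrative assumption rather than the method studied in this paper: the function names (`federated_average`, `poison`, `coordinate_median`), the two-dimensional toy updates, the sign-flipping attack, and the coordinate-wise median, which stands in for a generic robust aggregation rule from the federated learning literature.

```python
from statistics import median
from typing import List

Vector = List[float]


def federated_average(updates: List[Vector]) -> Vector:
    """Server step: average the clients' model updates coordinate by coordinate."""
    n = len(updates)
    return [sum(column) / n for column in zip(*updates)]


def poison(update: Vector, scale: float = 10.0) -> Vector:
    """Hypothetical model poisoning: flip the sign of an update and amplify it."""
    return [-scale * w for w in update]


def coordinate_median(updates: List[Vector]) -> Vector:
    """Illustrative defense: take the median of each coordinate instead of the mean."""
    return [median(column) for column in zip(*updates)]


# Toy updates from three clients; only updates, never interaction data, reach the server.
honest = [[1.0, 2.0], [1.0, 2.0], [1.0, 2.0]]
# The third client is malicious and submits a poisoned update instead.
attacked = honest[:2] + [poison(honest[2])]

clean_avg = federated_average(honest)
poisoned_avg = federated_average(attacked)
robust_agg = coordinate_median(attacked)

print("clean average:   ", clean_avg)
print("poisoned average:", poisoned_avg)
print("median aggregate:", robust_agg)
```

In this toy case one malicious client out of three is enough to reverse the direction of the averaged update, while the median ignores the outlier. Real FOLTR updates come from click feedback and are far noisier, so this only illustrates the mechanism, not the magnitude of any effect.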