We present the first study of adversarial attacks on online learning to rank (OLTR). The adversary's goal is to mislead the OLTR algorithm into placing a target item at the top of the ranking list a number of times that is linear in the time horizon $T$, while incurring only a sublinear attack cost. We propose generalized list poisoning attacks, which perturb the ranking list presented to the user. This strategy can efficiently attack any no-regret ranker in general stochastic click models. Furthermore, we propose a click poisoning-based strategy named attack-then-quit that can efficiently attack two representative OLTR algorithms for stochastic click models. We theoretically analyze the success of the two proposed methods and upper-bound their attack cost. Experimental results on synthetic and real-world data further validate the effectiveness and cost-efficiency of the proposed attack strategies.
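The adversary's goal can be written compactly as follows. This is only a sketch under assumed notation: the symbols $\tilde{a}$ (the target item), $N_T(\tilde{a})$ (the number of rounds in which the target item is ranked first), and $C(T)$ (the cumulative attack cost up to round $T$) are introduced here for illustration and do not appear in the text above.

$$
N_T(\tilde{a}) \;=\; \sum_{t=1}^{T} \mathbb{1}\{\tilde{a} \text{ is ranked first in round } t\} \;=\; \Omega(T),
\qquad
C(T) \;=\; o(T).
$$

Read this way, an attack succeeds when the target item occupies the top position in a constant fraction of the $T$ rounds, while the average cost per round, $C(T)/T$, vanishes as $T$ grows.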