Reinforcement learning methods have been used to optimize long-term user engagement in recommendation systems. However, existing reinforcement learning-based recommendation systems do not fully exploit the correlation of an individual user's behavior across different platforms. One potential solution is to aggregate data from the various platforms in a central location and train on the aggregated data. However, this approach raises economic and legal concerns, including increased communication costs and potential threats to user privacy. To address these challenges, we propose \textbf{FedSlate}, a federated reinforcement learning recommendation algorithm that makes effective use of information that platforms are legally prohibited from sharing. FedSlate uses the SlateQ algorithm to learn users' long-term behavior and to evaluate the value of recommended content. We extend the scope of recommendation systems from the single-user, single-platform setting to the single-user, multi-platform setting, and address the resulting cross-platform learning challenges by introducing federated learning. We use RecSim to construct a simulation environment for evaluating FedSlate and compare its performance with state-of-the-art benchmark recommendation models. Experimental results show that FedSlate outperforms the baseline methods across a variety of environment settings, and that it can learn recommendation strategies in scenarios where the baseline methods are completely inapplicable. Code is available at \textit{https://github.com/TianYaDY/FedSlate}.