This work studies the problem of learning unbiased recommendation algorithms from biased feedback, and addresses it from a novel distribution shift perspective. Recent work on unbiased recommendation has advanced the state of the art with techniques such as re-weighting, multi-task learning, and meta-learning. Despite their empirical success, most of these methods lack theoretical guarantees, leaving a non-negligible gap between theory and recent algorithms. In this paper, we offer a theoretical account of why existing unbiased learning objectives work for unbiased recommendation. We establish a close connection between unbiased recommendation and distribution shift, showing that existing unbiased learning objectives implicitly align the biased training distribution with the unbiased test distribution. Building on this connection, we derive two generalization bounds for existing unbiased learning methods and analyze their learning behavior. Moreover, motivated by this distribution shift view, we propose a principled framework, Adversarial Self-Training (AST), for unbiased recommendation. Extensive experiments on real-world and semi-synthetic datasets demonstrate the effectiveness of AST.