In recommender systems, users tend to rate the items they like most, so ratings are missing not at random (MNAR), which poses a major challenge for unbiased evaluation and learning of prediction models. Doubly robust (DR) methods have been widely studied for this problem and demonstrate superior performance. However, in this paper, we show that DR methods are unstable: their bias, variance, and generalization bounds are all unbounded when propensities are extremely small. Moreover, DR's heavier reliance on extrapolation leads to suboptimal performance. To address these limitations while retaining double robustness, we propose a stabilized doubly robust (StableDR) learning approach with a weaker reliance on extrapolation. Theoretical analysis shows that StableDR simultaneously has bounded bias, variance, and generalization error bound, even under inaccurate imputed errors and arbitrarily small propensities. In addition, we propose a novel learning approach for StableDR that updates the imputation, propensity, and prediction models cyclically, achieving more stable and accurate predictions. Extensive experiments show that our approaches significantly outperform existing methods.
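To make the instability claim concrete, the following is a minimal sketch, not the paper's implementation. It compares the standard DR estimate of the average prediction error with a self-normalized variant, which is used here only as an illustrative stand-in for stabilization; the exact StableDR estimator is defined in the paper and may differ. The function names, argument names, and toy data are all assumptions introduced for this example.

```python
import numpy as np


def dr_estimate(errors, imputed, observed, propensity):
    """Standard DR estimate of the mean prediction error over all user-item pairs.

    errors:     true prediction errors (only used where observed == 1)
    imputed:    imputed errors from an imputation model
    observed:   1 if the rating was observed, else 0
    propensity: estimated probability that each rating is observed
    """
    # Residuals are only defined on observed entries; use 0 elsewhere.
    residual = np.where(observed == 1, errors - imputed, 0.0)
    return float(np.mean(imputed + residual / propensity))


def self_normalized_dr_estimate(errors, imputed, observed, propensity):
    """Illustrative stabilized variant (an assumption, not the paper's estimator).

    The inverse-propensity weights are normalized by their sum, so the
    correction term is a weighted average of observed residuals and cannot
    exceed the largest observed residual in magnitude.
    """
    residual = np.where(observed == 1, errors - imputed, 0.0)
    weights = observed / propensity
    total = weights.sum()
    if total == 0:
        # No observed entries: fall back to the imputation-only estimate.
        return float(np.mean(imputed))
    return float(np.mean(imputed) + np.sum(weights * residual) / total)


# Toy data (hypothetical): one observed entry has a very small propensity.
errors = np.array([1.0, 0.5, 0.0, 0.0])
imputed = np.array([0.2, 0.2, 0.2, 0.2])
observed = np.array([1, 1, 0, 0])
propensity = np.array([0.5, 1e-4, 0.5, 0.5])

print("DR:             ", dr_estimate(errors, imputed, observed, propensity))
print("self-normalized:", self_normalized_dr_estimate(errors, imputed, observed, propensity))
```

In this toy case the single small propensity dominates the plain DR estimate, whereas the normalized correction stays within the range of the observed residuals. This mirrors the boundedness property the abstract claims for StableDR, under the assumption that normalization is a reasonable proxy for the stabilization idea.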