Recommendation systems aim to predict users' feedback on items they have not been exposed to. Confounding bias arises when unmeasured variables (e.g., a user's socio-economic status) affect both a user's exposure and feedback. Existing methods either (1) make untenable assumptions about these unmeasured variables or (2) directly infer latent confounders from users' exposure. However, neither approach can guarantee the identification of counterfactual feedback, which can lead to biased predictions. In this work, we propose a novel method, the identifiable deconfounder (iDCF), which leverages a set of proxy variables (e.g., observed user features) to resolve this non-identification issue. iDCF is a general deconfounded recommendation framework that applies proximal causal inference to infer the unmeasured confounders and identify the counterfactual feedback with theoretical guarantees. Extensive experiments on various real-world and synthetic datasets verify the proposed method's effectiveness and robustness.