Recommendation systems aim to predict users' feedback on items they have not been exposed to. Confounding bias arises when unmeasured variables (e.g., the socio-economic status of a user) affect both a user's exposure and feedback. Existing methods either (1) make untenable assumptions about these unmeasured variables or (2) directly infer latent confounders from users' exposure. Neither approach can guarantee the identification of counterfactual feedback, which can lead to biased predictions. In this work, we propose a novel method, the identifiable deconfounder (iDCF), which leverages a set of proxy variables (e.g., observed user features) to resolve this non-identification issue. The proposed iDCF is a general deconfounded recommendation framework that applies proximal causal inference to infer the unmeasured confounders and identify the counterfactual feedback with theoretical guarantees. Extensive experiments on various real-world and synthetic datasets verify the proposed method's effectiveness and robustness.
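The confounding bias described above can be illustrated with a small simulation. This is a minimal sketch under assumed settings, not the iDCF method: the function name `simulate`, the logistic exposure model, and all coefficients are hypothetical choices made only for illustration. A single unmeasured variable `u` raises both the chance of exposure and the feedback, so the average feedback among exposed users overstates the average feedback that would be seen if every user were exposed.

```python
import numpy as np


def simulate(n_users=200_000, seed=0):
    """Return (naive, target) mean feedback under a hypothetical confounded model."""
    rng = np.random.default_rng(seed)

    # Hypothetical unmeasured confounder (e.g., socio-economic status).
    u = rng.normal(size=n_users)

    # Assumption: users with larger u are more likely to be exposed to the item.
    p_exposed = 1.0 / (1.0 + np.exp(-2.0 * u))
    exposed = rng.random(n_users) < p_exposed

    # Assumption: feedback each user would give if exposed also grows with u.
    feedback = 3.0 + 1.0 * u + rng.normal(scale=0.5, size=n_users)

    # Naive estimate: average feedback over exposed users only (observed data).
    naive = feedback[exposed].mean()

    # Counterfactual target: average feedback if every user were exposed.
    target = feedback.mean()
    return naive, target


if __name__ == "__main__":
    naive, target = simulate()
    print(f"naive estimate from exposed users: {naive:.3f}")
    print(f"counterfactual mean over all users: {target:.3f}")
```

In this toy setup the gap between the two numbers comes entirely from `u`, which the naive estimate never observes. The simulation only demonstrates the bias; it does not implement proxy-based identification.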