Causal Lifting and Link Prediction

Current state-of-the-art causal models for link prediction assume an underlying set of inherent node factors (innate characteristics defined at the node's birth) that govern the causal evolution of links in the graph. In some causal tasks, however, link formation is path-dependent, i.e., the outcome of link interventions depends on existing links. For instance, in the customer-product graph of an online retailer, the effect of an 85-inch TV ad (treatment) likely depends on whether the customer already has an 85-inch TV. Unfortunately, existing causal methods are impractical in these scenarios: the cascading functional dependencies between links (due to path dependence) are either unidentifiable or require an impractical number of control variables. To remedy this shortcoming, this work develops the first causal model capable of dealing with path dependencies in link prediction. It introduces the concept of causal lifting, an invariance in causal models that, when satisfied, allows the identification of causal link prediction queries using limited interventional data. On the estimation side, we show that structural pairwise embeddings (a type of symmetry-based joint representation of node pairs in a graph) exhibit lower bias and correctly represent the causal structure of the task, as opposed to existing node embedding methods, e.g., GNNs and matrix factorization. Finally, we validate our theoretical findings on four datasets under three different scenarios for causal link prediction tasks: knowledge base completion, covariance matrix estimation, and consumer-product recommendations.
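The contrast between node embeddings and joint pairwise representations can be illustrated with a toy example. The sketch below is not the method from this work; it is a minimal illustration under stated assumptions. It uses 1-WL color refinement as a stand-in for a structural node embedding and shortest-path distance as a stand-in for a pairwise structural feature. All function names (`cycle_graph`, `wl_colors`, `shortest_path_length`, `node_based_pair`) are hypothetical and chosen for this example only.

```python
from collections import deque


def cycle_graph(n):
    """Adjacency lists of an undirected cycle on n nodes."""
    return {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}


def wl_colors(adj, rounds=3):
    """1-WL color refinement, used here as a stand-in for a structural node embedding."""
    colors = {v: 0 for v in adj}
    for _ in range(rounds):
        # A node's signature is its own color plus the multiset of neighbor colors.
        signatures = {
            v: (colors[v], tuple(sorted(colors[u] for u in adj[v]))) for v in adj
        }
        palette = {sig: i for i, sig in enumerate(sorted(set(signatures.values())))}
        colors = {v: palette[signatures[v]] for v in adj}
    return colors


def shortest_path_length(adj, source, target):
    """BFS distance, used here as a simple joint (pairwise) structural feature."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        if v == target:
            return dist[v]
        for u in adj[v]:
            if u not in dist:
                dist[u] = dist[v] + 1
                queue.append(u)
    return None  # target not reachable from source


adj = cycle_graph(6)
colors = wl_colors(adj)


def node_based_pair(u, v):
    """Pair representation built only from the two per-node colors."""
    return tuple(sorted((colors[u], colors[v])))


# Every node of the cycle looks the same, so the node-based view cannot tell
# an adjacent pair (0, 1) from an opposite pair (0, 3).
same_node_view = node_based_pair(0, 1) == node_based_pair(0, 3)

# The pairwise feature does separate the two pairs.
d_near = shortest_path_length(adj, 0, 1)
d_far = shortest_path_length(adj, 0, 3)

print(same_node_view, d_near, d_far)
```

In this toy graph every node is structurally identical, so any predictor that scores a pair from the two node representations alone must treat the pairs (0, 1) and (0, 3) identically, while a joint representation of the pair can distinguish them.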

