Causal Lifting and Link Prediction

Existing causal models for link prediction assume an underlying set of inherent node factors (innate characteristics defined at each node's birth) that govern the causal evolution of links in the graph. In some causal tasks, however, link formation is path-dependent: the outcome of a link intervention depends on existing links. Existing causal methods are not designed for path-dependent link formation, because the cascading functional dependencies between links (arising from path dependence) are either unidentifiable or require an impractical number of control variables. To overcome this, we develop the first causal model capable of dealing with path dependencies in link prediction. We introduce the concept of causal lifting, an invariance in causal models of independent interest that, on graphs, allows the identification of causal link prediction queries using limited interventional data. Further, we show that structural pairwise embeddings exhibit lower bias and correctly represent the task's causal structure, unlike existing node embeddings such as graph neural network node embeddings and matrix factorization. Finally, we validate our theoretical findings on three causal link prediction scenarios: knowledge base completion, covariance matrix estimation, and consumer-product recommendations.
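
The contrast between node embeddings and pairwise embeddings can be illustrated with a small toy example. The sketch below is not the paper's method: the graph, the function names, and the use of Weisfeiler-Lehman colour refinement as a stand-in for a structure-only node embedding are all assumptions made for illustration. On two disjoint 4-cycles every node is structurally identical, so any score computed from the two node representations alone cannot tell a within-cycle pair from a cross-cycle pair, whereas a crude joint feature of the pair (common-neighbour count and adjacency) can.

```python
def build_graph(edges):
    """Build an undirected adjacency map from an edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def wl_colors(adj, rounds=3):
    """Structure-only node representation via Weisfeiler-Lehman colour refinement.

    Used here as a hypothetical stand-in for a structural node embedding.
    """
    colors = {v: 0 for v in adj}
    for _ in range(rounds):
        # A node's signature is its own colour plus the sorted colours of its neighbours.
        signatures = {
            v: (colors[v], tuple(sorted(colors[u] for u in adj[v]))) for v in adj
        }
        palette = {sig: i for i, sig in enumerate(sorted(set(signatures.values())))}
        colors = {v: palette[signatures[v]] for v in adj}
    return colors


def node_pair_repr(colors, u, v):
    """Pair representation assembled from the two node representations only."""
    return tuple(sorted((colors[u], colors[v])))


def pairwise_repr(adj, u, v):
    """Joint structural feature of the pair: (common-neighbour count, adjacency)."""
    return (len(adj[u] & adj[v]), v in adj[u])


# Toy graph (an assumption for illustration): two disjoint 4-cycles.
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (4, 5), (5, 6), (6, 7), (7, 4)]
adj = build_graph(edges)
colors = wl_colors(adj)

within = (0, 2)  # opposite corners of the same cycle
across = (0, 6)  # nodes in different cycles

print("node-based:", node_pair_repr(colors, *within), node_pair_repr(colors, *across))
print("pairwise:  ", pairwise_repr(adj, *within), pairwise_repr(adj, *across))
```

In this toy setting the node-based representations of the two pairs coincide while the pairwise features differ, which is the kind of distinction the abstract attributes to structural pairwise embeddings. The example says nothing about causal identification itself.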


