This paper studies causal inference with observational network data. A challenging aspect of this setting is the possibility of interference in both potential outcomes and selection into treatment, for example due to peer effects in either stage. We therefore consider a nonparametric setup in which both stages are reduced forms of simultaneous-equations models. This results in high-dimensional network confounding, where the network and covariates of all units constitute sources of selection bias. The literature predominantly assumes that confounding can be summarized by a known, low-dimensional function of these objects, and it is unclear what selection models justify common choices of functions. We show that graph neural networks (GNNs) are well suited to adjust for high-dimensional network confounding. We establish a network analog of approximate sparsity under primitive conditions on interference. This demonstrates that the model has low-dimensional structure that makes estimation feasible and justifies the use of shallow GNN architectures.
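As a rough illustration of what a shallow architecture can see: a GNN with L message-passing layers gives each unit an embedding that depends only on the covariates and links within L hops of that unit. The sketch below is a minimal, hypothetical example of this locality and is not the paper's estimator. The function name `message_pass`, the mean aggregation, and the fixed 0.5 mixing weights are assumptions made for illustration; an actual GNN would learn its weights and apply nonlinearities.

```python
# Hypothetical sketch: L rounds of mean-aggregation message passing.
# Weights are fixed at 0.5 for illustration; a real GNN learns them.

def message_pass(adj, feats, num_layers):
    """Return embeddings after `num_layers` rounds of neighbor averaging.

    adj:   dict mapping each node to a list of its neighbors
    feats: dict mapping each node to a list of floats (its covariates)
    """
    h = {i: list(x) for i, x in feats.items()}
    for _ in range(num_layers):
        new_h = {}
        for i, x in h.items():
            nbrs = adj.get(i, [])
            dim = len(x)
            if nbrs:
                # Mean of the neighbors' current embeddings.
                agg = [sum(h[j][k] for j in nbrs) / len(nbrs) for k in range(dim)]
            else:
                # Isolated nodes receive no neighbor signal.
                agg = [0.0] * dim
            # Mix the node's own state with the neighbor mean.
            new_h[i] = [0.5 * x[k] + 0.5 * agg[k] for k in range(dim)]
        h = new_h
    return h


# Path graph 0 - 1 - 2 - 3; only node 3 carries a nonzero covariate.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
feats = {0: [0.0], 1: [0.0], 2: [0.0], 3: [1.0]}

two_layers = message_pass(adj, feats, 2)
three_layers = message_pass(adj, feats, 3)
```

In this toy path graph, node 0 sits three hops from node 3, so its embedding is unaffected by node 3's covariate after two layers and picks it up only at the third. Depth therefore fixes the radius of the network neighborhood that can enter the adjustment.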