We develop a method to decompose causal effects on a social network into an indirect effect mediated by the network and a direct effect independent of it. To handle the complexity of network structures, we assume that latent social groups act as causal mediators. We introduce principal components network regression models to differentiate the social effect from the non-social effect. Fitting the regression models is as simple as principal components analysis followed by ordinary least squares estimation. We establish asymptotic theory for the regression coefficients from this procedure and show that it is widely applicable, allowing for a variety of distributions on the regression errors and network edges. We carefully characterize the counterfactual assumptions necessary to use the regression models for causal inference, and show that current approaches to causal network regression may result in over-control bias. The method is general enough to apply to many types of structured data beyond social networks, such as text, areal data, psychometrics, images, and omics.
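To make the "principal components analysis followed by ordinary least squares" recipe concrete, the following is a minimal sketch rather than the authors' implementation. It assumes a symmetric adjacency matrix, a single binary treatment, and a linear product-of-coefficients decomposition of the indirect effect. The function names (`spectral_embedding`, `ols`, `network_regression`), the simulated two-group network, and all parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np


def spectral_embedding(A, k):
    """Top-k principal components of a symmetric adjacency matrix A."""
    vals, vecs = np.linalg.eigh(A)
    # Keep the k eigenpairs with the largest eigenvalues in absolute value.
    top = np.argsort(np.abs(vals))[::-1][:k]
    return vecs[:, top] * np.sqrt(np.abs(vals[top]))


def ols(design, response):
    """Ordinary least squares coefficients."""
    coef, *_ = np.linalg.lstsq(design, response, rcond=None)
    return coef


def network_regression(A, treatment, y, k):
    """Illustrative direct/indirect split (an assumption of this sketch)."""
    n = A.shape[0]
    X_hat = spectral_embedding(A, k)  # estimated latent group positions
    ones = np.ones(n)
    # Outcome regression: y on intercept, treatment, and embedding.
    beta = ols(np.column_stack([ones, treatment, X_hat]), y)
    # Mediator regression: each embedding column on intercept and treatment.
    theta = ols(np.column_stack([ones, treatment]), X_hat)
    direct = beta[1]
    indirect = theta[1] @ beta[2:]  # product of coefficients
    return direct, indirect


# Simulated data (hypothetical): treatment shifts latent group membership,
# and group membership drives both the edges and the outcome.
rng = np.random.default_rng(0)
n, k = 300, 2
treatment = rng.integers(0, 2, size=n).astype(float)
group = rng.binomial(1, 0.3 + 0.4 * treatment)
B = np.array([[0.30, 0.05], [0.05, 0.30]])  # within/between edge probabilities
P = B[group][:, group]
upper = np.triu(rng.binomial(1, P), 1)
A = (upper + upper.T).astype(float)  # symmetric, no self-loops
y = 1.0 + 2.0 * treatment + 3.0 * group + rng.normal(size=n)

direct, indirect = network_regression(A, treatment, y, k)
total = ols(np.column_stack([np.ones(n), treatment]), y)[1]
print(f"direct={direct:.3f} indirect={indirect:.3f} total={total:.3f}")
```

In this sketch the embedding is identified only up to an invertible linear transformation, but the product `theta[1] @ beta[2:]` is unchanged by such transformations, so the split does not depend on the arbitrary sign or rotation of the eigenvectors. For in-sample OLS fits, the direct and indirect estimates also sum to the coefficient from regressing the outcome on treatment alone.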