This paper develops a framework for identification, estimation, and inference on the causal mechanisms driving endogenous social network formation. Identification is challenging because of unobserved confounders and reverse causality; inference is complicated by questions of equilibrium and sampling. We leverage repeated observations of a network over time and random variation in initial ties to address challenges to causal identification. Our design-based approach sidesteps questions of sampling and asymptotics by treating both the set of nodes (individuals) and potential outcomes as non-random. We apply our approach to data from a large professional services firm, where new hires are randomly assigned to project teams within offices. We estimate the causal effect on tie formation of indirect ties, network degree, and local network density. Indirect ties have a strong and significant positive effect on tie formation, while the effects of degree and density are smaller and less robust.