A/B testing on platforms often faces challenges from network interference, where a unit's outcome depends not only on its own treatment but also on the treatments of its network neighbors. To address this, cluster-level randomization has become standard, enabling the use of network-aware estimators. These estimators typically trim the data to retain only a subset of informative units, achieving low bias under suitable conditions but often suffering from high variance. In this paper, we first demonstrate that interior nodes (units whose neighbors all lie within the same cluster) constitute the vast majority of the post-trimming subpopulation. In light of this, we propose directly averaging over the interior nodes to construct the mean-in-interior (MII) estimator, which circumvents the delicate reweighting required by existing network-aware estimators and substantially reduces variance in classical settings. However, we show that interior nodes are often not representative of the full population, particularly in terms of network-dependent covariates, leading to notable bias. We then augment the MII estimator with a counterfactual predictor trained on the entire network, allowing us to adjust for covariate distribution shifts between the interior nodes and the full population. By rearranging the expression, we reveal that our augmented MII estimator embodies an analytical form of the point estimator in the prediction-powered inference (PPI) framework. This insight motivates a semi-supervised lens, wherein interior nodes are treated as labeled data subject to selection bias. Extensive and challenging simulation studies demonstrate the strong performance of our augmented MII estimator across various settings.
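The estimator described above can be sketched in a few lines; this is a minimal illustration, not the paper's implementation. The function names (`interior_nodes`, `augmented_mii`) and the prediction array `f_hat` are hypothetical, and the PPI-style combination (mean of the counterfactual predictions over the full population, plus a residual correction computed on the interior nodes) follows the rearranged form the abstract refers to.

```python
import numpy as np

def interior_nodes(adj, cluster):
    """Return indices of nodes whose neighbors all lie in the node's own cluster.

    adj: (n, n) 0/1 adjacency matrix; cluster: length-n cluster labels.
    """
    n = len(cluster)
    interior = []
    for i in range(n):
        nbrs = np.nonzero(adj[i])[0]
        if all(cluster[j] == cluster[i] for j in nbrs):
            interior.append(i)
    return np.array(interior, dtype=int)

def augmented_mii(y, adj, cluster, f_hat):
    """PPI-style augmented mean-in-interior point estimate (sketch).

    y:     length-n observed outcomes (only the interior values are used
           as 'labels').
    f_hat: length-n counterfactual predictions from a model trained on the
           entire network (hypothetical input).
    """
    idx = interior_nodes(adj, cluster)
    # Mean prediction over the full population, corrected by the mean
    # residual on the interior (labeled) nodes.
    return f_hat.mean() + (y[idx] - f_hat[idx]).mean()
```

When `f_hat` is identically zero, the estimator reduces to the plain MII estimator, the average of `y` over the interior nodes; a predictor that captures the network-dependent covariate shift moves the estimate back toward the full-population mean.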