Synthesizing Evidence: Data-Pooling as a Tool for Treatment Selection in Online Experiments

Randomized experiments are the gold standard for causal inference, but in business applications they face significant challenges: limited traffic allocation, the need to estimate heterogeneous treatment effects, and the complexity of managing overlapping experiments. These factors inflate the variability of treatment effect estimates, making data-driven policy roll-out difficult. To address these issues, we introduce the data-pooling treatment roll-out (DPTR) framework, which improves roll-out decisions by pooling data across experiments rather than analyzing each experiment in isolation. DPTR accommodates both overlapping and non-overlapping traffic scenarios, under either linear or nonlinear model specifications. We demonstrate the framework's robustness through a three-pronged validation: (a) theoretical analysis shows that DPTR outperforms the traditional difference-in-means and ordinary least squares methods under non-overlapping experiments, particularly when the number of experiments is large; (b) synthetic simulations confirm its adaptability in complex scenarios with overlapping traffic, rich covariates, and nonlinear specifications; and (c) empirical applications to two experimental datasets from real-world platforms demonstrate its effectiveness in guiding customized policy roll-outs for subgroups within a single experiment, as well as in coordinating policy deployments across multiple overlapping experiments. By reducing estimation variability and thereby improving decision quality, DPTR provides a scalable, practical solution for online platforms to better leverage their experimental data in increasingly complex business environments.
