Online controlled experiments face growing challenges from overlapping tests on shared traffic: interactions between concurrent experiments obscure how feature combinations behave and produce effect estimates that do not correspond to any actionable launch scenario. Traffic splitting, layering, and sequential (non-concurrent) execution mitigate some of these issues, but they add coordination overhead and can reduce experimentation velocity. We propose Multi-Experiment Analysis (MEA), a methodology for consistent joint estimation in the presence of arbitrary partial or full overlaps and multiple variants. MEA produces three types of estimates: (1) corrected individual treatment effects that account for the presence of overlapping experiments, (2) combined effects of launching any desired combination of variants across experiments, and (3) conditional effects of an experiment's variant given that specific variants of other experiments are launched or deramped. None of these requires a factorial pre-design or traffic restrictions. We validate the approach through comprehensive simulations that confirm consistency and correct coverage. We also report on production deployment at scale, illustrate the methodology through real-world use cases, and share practical lessons on system design, adoption patterns, and production use.
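To make the three estimate types concrete, the following is a minimal sketch on a toy setting, not the MEA estimator itself: it assumes two fully overlapping experiments, A and B, each with a single binary variant and independent 50/50 assignment. Under those assumptions a saturated cell-means comparison is enough. The function names, the simulated metric, and the effect sizes (including the negative interaction) are hypothetical and chosen only for illustration.

```python
import random
from collections import defaultdict
from statistics import fmean


def cell_means(rows):
    """Average the metric within each (variant_a, variant_b) cell."""
    cells = defaultdict(list)
    for a, b, y in rows:
        cells[(a, b)].append(y)
    return {cell: fmean(values) for cell, values in cells.items()}


def estimate_effects(rows):
    """Derive launch-scenario effects from a 2x2 overlap of experiments A and B.

    rows is a list of (a, b, y) tuples, with a and b in {0, 1}.
    """
    m = cell_means(rows)
    treated_a = [y for a, _, y in rows if a == 1]
    control_a = [y for a, _, y in rows if a == 0]
    return {
        # Marginal comparison that ignores B: it mixes B's control and
        # treatment traffic, so it matches no single launch scenario.
        "naive_a": fmean(treated_a) - fmean(control_a),
        # (1) Effect of launching A while B stays at control.
        "a_alone": m[(1, 0)] - m[(0, 0)],
        # (2) Combined effect of launching both A and B.
        "a_and_b": m[(1, 1)] - m[(0, 0)],
        # (3) Conditional effect of A given that B is launched.
        "a_given_b": m[(1, 1)] - m[(0, 1)],
    }


# Hypothetical data: A adds 2, B adds 3, and the two interact negatively (-1).
rng = random.Random(0)
rows = []
for _ in range(100_000):
    a, b = rng.randint(0, 1), rng.randint(0, 1)
    y = 1.0 + 2.0 * a + 3.0 * b - 1.0 * a * b + rng.gauss(0.0, 1.0)
    rows.append((a, b, y))

effects = estimate_effects(rows)
for name, value in effects.items():
    print(f"{name}: {value:.3f}")
```

In this toy setup the naive estimate for A lands between the "A alone" and "A given B" effects, which illustrates why a marginal readout under overlap may not describe any launch decision. Partial overlaps, more than two variants, and variance estimation are outside the scope of this sketch.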