In online experiments where the intervention is only exposed, or "triggered", for a small subset of the population, it is critical to use variance reduction techniques to estimate treatment effects with sufficient precision to inform business decisions. Trigger-dilute analysis is often used in these situations, and reduces the sampling variance of overall intent-to-treat (ITT) effects by a factor roughly equal to the inverse of the triggering rate; for example, a triggering rate of $5\%$ corresponds to roughly a $20\times$ reduction in variance. To apply trigger-dilute analysis, one needs to know each experimental subject's counterfactual triggering status, i.e., the subject's triggering behavior under both treatment and control conditions. In this paper, we propose an unbiased ITT estimator with reduced variance that is applicable to experiments where the counterfactual triggering status is only observed in the treatment group. Our method is based on the efficiency augmentation idea of CUPED and draws upon identification frameworks from the principal stratification and instrumental variables literature. The unbiasedness of our estimation approach relies on a testable assumption that the augmentation term used for covariate adjustment equals zero in expectation. Unlike traditional covariate adjustment or principal score modeling approaches, our estimator can incorporate both pre-experiment and in-experiment observations. We demonstrate through a real-world experiment and simulations that our estimator can remain unbiased and achieve precision improvements as large as if triggering status were fully observed, and in some cases can even outperform trigger-dilute analysis.
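The claimed variance reduction for trigger-dilute analysis can be checked with a small Monte Carlo sketch. This is an illustration under assumed conditions, not the estimator proposed in the paper: it assumes triggering status is observed in both arms, that untriggered subjects have zero treatment effect, and that outcomes are standard Gaussian. The function names (`simulate_once`, `variance_ratio`) and all parameter values are hypothetical choices made for the example.

```python
import random
import statistics


def simulate_once(rng, n_per_arm=1000, trigger_rate=0.05, effect=0.5):
    """Simulate one A/B test; return (naive ITT, trigger-dilute ITT) estimates.

    Assumed data model (illustrative only): every subject has a N(0, 1)
    outcome, and only triggered subjects in the treatment arm receive an
    additive shift of `effect`.
    """
    arms = {}
    for arm in ("control", "treatment"):
        all_y, trig_y = [], []
        for _ in range(n_per_arm):
            # Triggering status is assumed observable in BOTH arms here.
            triggered = rng.random() < trigger_rate
            y = rng.gauss(0.0, 1.0)
            if triggered and arm == "treatment":
                y += effect
            all_y.append(y)
            if triggered:
                trig_y.append(y)
        arms[arm] = (all_y, trig_y)

    (c_all, c_trig), (t_all, t_trig) = arms["control"], arms["treatment"]

    # Naive ITT: difference in means over the full population.
    naive = statistics.fmean(t_all) - statistics.fmean(c_all)

    # Trigger-dilute: effect among triggered subjects, scaled ("diluted")
    # by the estimated triggering rate.
    rate_hat = (len(c_trig) + len(t_trig)) / (2 * n_per_arm)
    diluted = rate_hat * (statistics.fmean(t_trig) - statistics.fmean(c_trig))
    return naive, diluted


def variance_ratio(n_sims=400, seed=0):
    """Ratio of sampling variances (naive / trigger-dilute) across simulations."""
    rng = random.Random(seed)
    results = [simulate_once(rng) for _ in range(n_sims)]
    naive_var = statistics.variance([r[0] for r in results])
    diluted_var = statistics.variance([r[1] for r in results])
    return naive_var / diluted_var


if __name__ == "__main__":
    print(f"variance ratio (naive / trigger-dilute): {variance_ratio():.1f}")
```

With a $5\%$ triggering rate the ratio should land near $20$, though the exact value varies with the seed and the number of simulated experiments, and the estimated triggering rate contributes a small amount of extra variance.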