Inferring causal effects on long-term outcomes using short-term surrogates is crucial to rapid innovation. However, even when treatments are randomized and surrogates fully mediate their effect on outcomes, we can get the direction of causal effects wrong due to confounding between surrogates and outcomes, a situation famously known as the surrogate paradox. The availability of many historical experiments offers the opportunity to instrument for the surrogate and bypass this confounding. However, even as the number of experiments grows, two-stage least squares has non-vanishing bias if each experiment has a bounded size, and this bias is exacerbated when most experiments barely move metrics, as occurs in practice. We show how to eliminate this bias using cross-fold procedures, JIVE being one example, and construct valid confidence intervals for the long-term effect in new experiments where the long-term outcome has not yet been observed. Our methodology further allows us to proxy for effects not perfectly mediated by the surrogates, letting us handle both confounding and effect leakage as violations of standard statistical surrogacy conditions.
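To make the bias argument concrete, the following is a minimal simulation sketch, not the paper's procedure. It assumes a deliberately simplified design: each historical experiment arm is one "cell" of fixed size, the cell indicator is the instrument, all variables are mean-zero (so no intercepts or control arms are modeled), and the leave-one-out cell mean stands in for JIVE's cross-fold first stage. The function names and the parameters `beta`, `gamma`, `tau`, `num_cells`, and `cell_size` are hypothetical choices made for illustration.

```python
import numpy as np


def simulate(num_cells=20_000, cell_size=50, beta=1.0, gamma=-3.0, tau=0.1, seed=0):
    """Draw surrogates S and outcomes Y for many small, weak experiment cells.

    Assumed (hypothetical) model, per unit i in cell c:
        S = a_c + U + e_S          a_c ~ N(0, tau^2) is the cell's surrogate shift
        Y = beta * S + gamma * U + e_Y
    U is an unobserved confounder of S and Y; beta is the true effect of S on Y.
    """
    rng = np.random.default_rng(seed)
    shape = (num_cells, cell_size)
    a = rng.normal(0.0, tau, size=(num_cells, 1))  # weak cell-level effects
    u = rng.normal(size=shape)                     # surrogate-outcome confounder
    s = a + u + rng.normal(size=shape)
    y = beta * s + gamma * u + rng.normal(size=shape)
    return s, y


def ols(s, y):
    """Naive no-intercept regression of Y on S, ignoring the instruments."""
    return (s * y).sum() / (s * s).sum()


def tsls(s, y):
    """2SLS with cell indicators as instruments: first stage is the cell mean."""
    s_hat = s.mean(axis=1, keepdims=True)  # includes each unit's own S
    return (s_hat * y).sum() / (s_hat * s).sum()


def jive(s, y):
    """Leave-one-out variant: each unit's first stage excludes its own S."""
    n = s.shape[1]
    s_hat = (s.sum(axis=1, keepdims=True) - s) / (n - 1)
    return (s_hat * y).sum() / (s_hat * s).sum()


if __name__ == "__main__":
    s, y = simulate()
    print("true beta:", 1.0)
    print("OLS      :", ols(s, y))
    print("2SLS     :", tsls(s, y))
    print("JIVE     :", jive(s, y))
```

In this design the in-sample cell mean used by 2SLS contains each unit's own confounded noise, which contributes a term of order 1/`cell_size` that does not shrink as `num_cells` grows. When `tau` is small, that term can dominate the instrument signal and even flip the sign of the estimate. The leave-one-out first stage removes the own-observation term, so the estimate should concentrate around `beta` as the number of cells increases. Confidence intervals and effects not mediated by the surrogate are outside the scope of this sketch.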