Learning the Covariance of Treatment Effects Across Many Weak Experiments

When primary objectives are insensitive or delayed, experimenters may instead focus on proxy metrics derived from secondary outcomes. For example, technology companies often infer long-term impacts of product interventions from their effects on weighted indices of short-term user engagement signals. We consider meta-analysis of many historical experiments to learn the covariance of treatment effects on different outcomes, which can support the construction of such proxies. Even when experiments are plentiful and large, if treatment effects are weak, the sample covariance of estimated treatment effects across experiments can be highly biased, and it remains inconsistent even as more experiments are considered. We overcome this problem by using techniques inspired by weak instrumental variable analysis, which we show can reliably estimate parameters of interest, even without a structural model. We show that the Limited Information Maximum Likelihood (LIML) estimator learns a parameter that is equivalent to fitting total least squares to a transformation of the scatterplot of estimated treatment effects, and that Jackknife Instrumental Variables Estimation (JIVE) learns another parameter that can be computed from the average of jackknifed covariance matrices across experiments. We also present a total-covariance-based estimator for the latter estimand under homoskedasticity, which we show is equivalent to a $k$-class estimator. We show how these parameters relate to causal quantities and can be used to construct unbiased proxy metrics under a structural model with both direct and indirect effects subject to the INstrument Strength Independent of Direct Effect (INSIDE) assumption of Mendelian randomization. Lastly, we discuss the application of our methods at Netflix.
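The bias described above can be illustrated with a small simulation. The sketch below is not the LIML, JIVE, or $k$-class estimator from this work; it is a minimal illustration, under assumptions made here only for the example: two outcomes, Gaussian true effects, and a sampling-noise covariance that is known and identical across experiments. The function names and the matrices `TRUE_COV` and `NOISE_COV` are hypothetical. Because each estimated effect is the true effect plus independent sampling noise, the sample covariance of the estimates converges to the true covariance plus the noise covariance, so adding more experiments does not remove the bias. Subtracting the noise covariance does.

```python
import numpy as np

# Hypothetical values chosen for illustration only.
# True treatment effects are weak and positively correlated across outcomes.
TRUE_COV = np.array([[1.0, 0.8], [0.8, 1.0]]) * 1e-4
# Sampling noise in each experiment's estimates is larger and negatively correlated.
NOISE_COV = np.array([[1.0, -0.5], [-0.5, 1.0]]) * 4e-4


def draw_gaussian(rng, cov, size):
    """Draw `size` zero-mean Gaussian vectors with covariance `cov`."""
    chol = np.linalg.cholesky(cov)
    return rng.standard_normal((size, cov.shape[0])) @ chol.T


def simulate_estimates(num_experiments, seed=0):
    """Return one row of estimated treatment effects per experiment."""
    rng = np.random.default_rng(seed)
    true_effects = draw_gaussian(rng, TRUE_COV, num_experiments)
    sampling_noise = draw_gaussian(rng, NOISE_COV, num_experiments)
    return true_effects + sampling_noise


def naive_covariance(estimates):
    """Sample covariance of estimated effects; targets TRUE_COV + NOISE_COV."""
    return np.cov(estimates, rowvar=False)


def debiased_covariance(estimates, noise_cov):
    """Subtract the (assumed known, homoskedastic) sampling-noise covariance."""
    return naive_covariance(estimates) - noise_cov


if __name__ == "__main__":
    estimates = simulate_estimates(num_experiments=20_000)
    print("true covariance:\n", TRUE_COV)
    print("naive covariance:\n", naive_covariance(estimates))
    print("debiased covariance:\n", debiased_covariance(estimates, NOISE_COV))
```

With these illustrative values the naive estimate gets the sign of the cross-outcome covariance wrong, while the debiased estimate recovers it. In practice the noise covariance would itself have to be estimated from unit-level data within each experiment, which this sketch does not attempt.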
