Linear regression adjustment is commonly used to analyse randomised controlled experiments due to its efficiency and robustness against model misspecification. Current testing and interval estimation procedures leverage the asymptotic distribution of such estimators to provide Type-I error and coverage guarantees that hold only at a single sample size. Here, we develop the theory for the anytime-valid analogues of such procedures, enabling linear regression adjustment in the sequential analysis of randomised experiments. We first provide sequential $F$-tests and confidence sequences for the parametric linear model, which provide time-uniform Type-I error and coverage guarantees that hold for all sample sizes. We then relax all parametric assumptions of the linear model in randomised designs and provide nonparametric, model-free sequential tests and confidence sequences for treatment effects. This formally allows experiments to be continuously monitored for significance and stopped early, and it safeguards against statistical malpractice in data collection. A particular feature of our results is their simplicity. Our test statistics and confidence sequences all admit closed-form expressions, which are functions of statistics directly available from a standard linear regression table. We illustrate our methodology with the sequential analysis of software A/B experiments at Netflix, performing regression adjustment with pre-treatment outcomes.