Linear regression adjustment is commonly used to analyse randomised controlled experiments due to its efficiency and robustness against model misspecification. Current testing and interval estimation procedures leverage the asymptotic distribution of such estimators to provide Type-I error and coverage guarantees that hold only at a single sample size. Here, we develop the theory for the anytime-valid analogues of such procedures, enabling linear regression adjustment in the sequential analysis of randomised experiments. We first provide sequential $F$-tests and confidence sequences for the parametric linear model, which provide time-uniform Type-I error and coverage guarantees that hold for all sample sizes. We then relax all parametric assumptions of the linear model in randomised designs and provide nonparametric, model-free sequential tests and confidence sequences for treatment effects. This formally allows experiments to be continuously monitored for significance and stopped early, and it safeguards against statistical malpractice in data collection. A particular feature of our results is their simplicity. Our test statistics and confidence sequences all admit closed-form expressions, which are functions of statistics directly available from a standard linear regression table. We illustrate our methodology with the sequential analysis of software A/B experiments at Netflix, performing regression adjustment with pre-treatment outcomes.