We consider after-study statistical inference for sequentially designed experiments in which multiple units are assigned treatments at multiple time points using treatment policies that adapt over time. Our goal is to provide inference guarantees for the counterfactual mean at the smallest possible scale, namely the mean outcome under different treatments for each unit and each time, with minimal assumptions on the adaptive treatment policy. Without any structural assumptions on the counterfactual means, this task is infeasible because there are more unknowns than observed data points. To make progress, we introduce a latent factor model over the counterfactual means that serves as a non-parametric generalization of the non-linear mixed effects model and the bilinear latent factor model considered in prior works. For estimation, we use a non-parametric method, namely a variant of nearest neighbors, and establish a non-asymptotic high-probability error bound for the counterfactual mean for each unit and each time. Under regularity conditions, this bound leads to asymptotically valid confidence intervals for the counterfactual mean as the number of units and the number of time points grow to $\infty$ together at suitable rates. We illustrate our theory via several simulations and a case study involving data from HeartSteps, a mobile health clinical trial.
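To make the estimation idea concrete, the following is a minimal sketch of a unit-wise nearest-neighbors estimate of a counterfactual mean. It is an illustration under stated assumptions, not the exact estimator analyzed in this work: the function name `nn_counterfactual_mean`, the distance threshold `eta`, the binary treatment coding, and the noiseless synthetic data are all hypothetical choices made for the example. Two units are compared through their outcomes at times when both received treatment `a`, and the estimate for unit `i` at time `t` averages the time-`t` outcomes of sufficiently close units that actually received `a` at time `t`.

```python
import numpy as np


def nn_counterfactual_mean(Y, A, i, t, a, eta):
    """Sketch: estimate the mean outcome of unit i at time t under treatment a.

    Y, A : (n_units, n_times) arrays of observed outcomes and treatments.
    eta  : hypothetical threshold on the mean squared distance between units.
    Returns (estimate, number of neighbors); estimate is NaN if none are found.
    """
    Y = np.asarray(Y, dtype=float)
    A = np.asarray(A)
    n_units, n_times = Y.shape

    # Times (other than the target time) at which unit i received treatment a.
    other_times = np.arange(n_times) != t
    i_treated = (A[i] == a) & other_times

    neighbors = []
    for j in range(n_units):
        # Compare units only at times when both received treatment a.
        overlap = i_treated & (A[j] == a)
        if not overlap.any():
            continue
        dist = np.mean((Y[i, overlap] - Y[j, overlap]) ** 2)
        # A usable neighbor is close to unit i and received a at time t.
        if dist <= eta and A[j, t] == a:
            neighbors.append(j)

    if not neighbors:
        return float("nan"), 0
    return float(Y[neighbors, t].mean()), len(neighbors)


# Synthetic, noiseless example (an assumption made purely for illustration):
# two latent unit types, a bilinear mean u_i * v_t, and a treatment effect of 1.
n_units, n_times = 40, 30
u = np.repeat([1.0, 3.0], n_units // 2)   # latent unit factors
v = np.linspace(0.5, 1.5, n_times)        # latent time factors
theta0 = np.outer(u, v)                   # mean outcomes under treatment 0
theta1 = theta0 + 1.0                     # mean outcomes under treatment 1

# Deterministic treatment pattern: even units are treated when t % 4 is 0 or 1,
# odd units when t % 4 is 1 or 2.
shift = np.arange(n_units)[:, None] % 2
A = ((np.arange(n_times)[None, :] - shift) % 4 < 2).astype(int)
Y = np.where(A == 1, theta1, theta0)

# Unit 0 received treatment 0 at time 6, so treatment 1 is counterfactual there.
est, k = nn_counterfactual_mean(Y, A, i=0, t=6, a=1, eta=1e-8)
truth = theta1[0, 6]
```

In this toy setting the only usable neighbors of unit 0 are the odd-indexed units of the same latent type, so `est` matches `truth`. With noisy outcomes, `eta` would trade off bias (neighbors that are too dissimilar) against variance (too few neighbors), and it would need to be tuned rather than fixed as it is here.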