Randomized controlled trials typically assume that prognostic covariates are known and available at no cost. In practice, obtaining high-dimensional pretreatment data is costly, forcing a trade-off between the precision gained from covariate-adaptive design and a limited measurement budget. We introduce Dynamic Adaptive Rerandomization via Thompson Sampling (DARTS), which treats covariate acquisition as a sequential optimization problem embedded within a design-based causal inference task. A budgeted combinatorial Thompson sampler learns across successive batches which covariates are most prognostic; the selected covariates then drive rerandomization and regression adjustment to reduce the variance of the batch-level average treatment effect estimates. Our primary theoretical contribution is a decoupling result: adaptive covariate selection based on past batches preserves batch-level randomization validity, and the cumulative inverse-variance weighted estimator achieves at least nominal asymptotic coverage. We further derive a Bayes risk bound for the acquisition layer that matches the minimax lower bound up to logarithmic factors. Empirically, DARTS concentrates the budget on informative features, substantially narrowing the efficiency gap to oracle designs while maintaining inferential validity.
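To make the per-batch design step and the pooling step concrete, the following is a minimal sketch, not the authors' implementation. It assumes a standard Mahalanobis-distance acceptance rule for rerandomization with equal-sized arms and a fixed-weight inverse-variance combination of batch estimates; the function names `rerandomize` and `ivw_combine`, the `threshold` and `max_draws` parameters, and the example numbers are all hypothetical. The Thompson sampling acquisition layer is omitted: the covariate matrix `X` stands in for whatever columns that layer selected for the batch.

```python
import numpy as np


def rerandomize(X, rng, threshold=1.0, max_draws=1000):
    """Redraw a balanced assignment until covariate imbalance is acceptable.

    Assumption: imbalance is the Mahalanobis distance between arm means of
    the selected covariates X (n units x d covariates), with half of the
    units treated. This is a common acceptance rule, used here for
    illustration only.
    """
    n, _ = X.shape
    n_treated = n // 2
    cov_inv = np.linalg.pinv(np.atleast_2d(np.cov(X, rowvar=False)))
    scale = n_treated * (n - n_treated) / n
    for _ in range(max_draws):
        z = np.zeros(n, dtype=int)
        z[rng.choice(n, size=n_treated, replace=False)] = 1
        diff = X[z == 1].mean(axis=0) - X[z == 0].mean(axis=0)
        imbalance = scale * diff @ cov_inv @ diff
        if imbalance <= threshold:
            return z
    # Returning an unaccepted draw would change the design, so fail loudly.
    raise RuntimeError("no acceptable assignment found; loosen the threshold")


def ivw_combine(estimates, variances):
    """Pool batch-level effect estimates with inverse-variance weights."""
    est = np.asarray(estimates, dtype=float)
    var = np.asarray(variances, dtype=float)
    weights = 1.0 / var
    pooled = np.sum(weights * est) / np.sum(weights)
    pooled_var = 1.0 / np.sum(weights)
    return pooled, pooled_var


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(40, 2))  # hypothetical selected covariates for one batch
    z = rerandomize(X, rng)
    print("treated units:", int(z.sum()))
    # Hypothetical batch estimates and variance estimates, for illustration only.
    print(ivw_combine([1.0, 3.0], [1.0, 1.0]))
```

In this sketch the covariates used in a batch are fixed before any assignment in that batch is drawn, which mirrors the abstract's point that selection depends only on past batches. The pooled variance formula treats the batch variances as known; in practice they would be estimated, and the coverage guarantee stated above is the paper's result rather than something this sketch demonstrates.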