We present fastrerandomize, an R package for fast, scalable rerandomization in experimental design. Rerandomization improves precision by discarding treatment assignments that fail a prespecified covariate-balance criterion, but existing implementations can become computationally prohibitive as the number of units or covariates grows. fastrerandomize introduces three complementary advances: (i) optional GPU/TPU acceleration to parallelize balance checks, (ii) memory-efficient key-only storage that avoids retaining full assignment matrices, and (iii) auto-vectorized, just-in-time compiled kernels for batched candidate generation and inference. Together, these enable exact or Monte Carlo rerandomization at previously intractable scales, making it practical to adopt the tighter balance thresholds required in modern high-dimensional experiments while quantifying the resulting gains in precision and power for a given covariate set. The package also supports randomization-based testing conditioned on acceptance, so that inference reflects the restricted set of assignments from which the realized one was drawn. In controlled benchmarks, we observe order-of-magnitude speedups over baseline workflows, with larger gains as the sample size or dimensionality grows. These speedups make stricter acceptance thresholds feasible, which in turn improves the precision of causal estimates.
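The workflow described in the abstract (generate candidates, keep those that pass a balance check, store only keys, then test conditional on acceptance) can be illustrated with a minimal sketch. This is not the fastrerandomize API: the package is written for R with accelerator backends, whereas the sketch below uses plain Python and the standard library only. All function names are hypothetical. The balance criterion (sum of squared standardized mean differences) is an assumption chosen for simplicity, since the abstract does not specify one, and integer seeds stand in for the stored keys under the assumption that a key is enough to regenerate its assignment.

```python
import random
import statistics

def draw_assignment(seed, n, n_treated):
    """Regenerate a complete-randomization assignment from its seed (the stored 'key')."""
    rng = random.Random(seed)
    treated = set(rng.sample(range(n), n_treated))
    return [1 if i in treated else 0 for i in range(n)]

def imbalance(assign, X):
    """Illustrative criterion: sum over covariates of squared standardized mean differences."""
    total = 0.0
    for col in zip(*X):  # iterate over covariate columns
        t = [x for x, a in zip(col, assign) if a == 1]
        c = [x for x, a in zip(col, assign) if a == 0]
        sd = statistics.pstdev(col)
        total += ((statistics.fmean(t) - statistics.fmean(c)) / sd) ** 2
    return total

def accepted_seeds(X, n_treated, threshold, n_candidates, base_seed=0):
    """Monte Carlo rerandomization: keep only the seeds of candidates that pass the check."""
    n = len(X)
    return [
        k
        for k in range(base_seed, base_seed + n_candidates)
        if imbalance(draw_assignment(k, n, n_treated), X) <= threshold
    ]

def diff_in_means(assign, y):
    t = [v for v, a in zip(y, assign) if a == 1]
    c = [v for v, a in zip(y, assign) if a == 0]
    return statistics.fmean(t) - statistics.fmean(c)

def conditional_p_value(y, observed_assign, keys, n_treated):
    """Randomization p-value under the sharp null, using accepted assignments only."""
    n = len(y)
    obs = abs(diff_in_means(observed_assign, y))
    hits = sum(
        abs(diff_in_means(draw_assignment(k, n, n_treated), y)) >= obs for k in keys
    )
    return hits / len(keys)

# Synthetic data for illustration only: 20 units, 2 covariates, no treatment effect.
data_rng = random.Random(42)
n, n_treated = 20, 10
X = [[data_rng.gauss(0, 1), data_rng.gauss(0, 1)] for _ in range(n)]
y = [row[0] + data_rng.gauss(0, 1) for row in X]

keys = accepted_seeds(X, n_treated, threshold=0.1, n_candidates=2000)
observed = draw_assignment(keys[0], n, n_treated)  # the assignment actually "run"
p = conditional_p_value(y, observed, keys, n_treated)
print(len(keys), "accepted candidates; conditional p-value:", p)
```

Storing seeds rather than assignment vectors keeps memory proportional to the number of accepted candidates instead of that number times the sample size, at the cost of regenerating each assignment when it is needed. The sketch evaluates candidates in a Python loop; the batched, compiled evaluation the abstract describes would replace that loop with a vectorized kernel.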