We introduce the "cram" method, a general and efficient approach to simultaneous learning and evaluation using a generic machine learning (ML) algorithm. In a single pass over batched data, the proposed method repeatedly trains an ML algorithm and tests its empirical performance. Because it uses the entire sample for both learning and evaluation, cramming is significantly more data-efficient than sample-splitting. The cram method also naturally accommodates online learning algorithms, which makes its implementation computationally efficient. To demonstrate the power of the cram method, we consider the standard policy learning setting, where cramming is applied to the same data both to develop an individualized treatment rule (ITR) and to estimate the average outcome that would result if the learned ITR were deployed. We show that, under a minimal set of assumptions, the resulting crammed evaluation estimator is consistent and asymptotically normal. While our asymptotic results require a relatively weak stabilization condition on the ML algorithm, we develop a simple, generic method that can be used with any policy learning algorithm to satisfy this condition. Our extensive simulation studies show that, compared to sample-splitting, cramming reduces the evaluation standard error by more than 40% while improving the performance of the learned policy. We also apply the cram method to a randomized clinical trial to demonstrate its applicability to real-world problems. Finally, we briefly discuss future extensions of the cram method to other learning and evaluation settings.
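The single-pass "train, then test on what has not been seen yet" idea described in the abstract can be illustrated with a minimal sketch. This is not the paper's estimator: the inverse-propensity-weighted (IPW) evaluation, the known treatment probability of 0.5, the toy linear learner, and every function name below (`simulate_trial`, `fit_policy`, `ipw_value_diff`, `cram_policy_value_diff`, `treat_nobody`) are assumptions introduced only for illustration. The sketch relearns a policy after each batch and scores each successive change in the policy on the batches that were not yet used for learning, then sums those increments.

```python
import numpy as np


def simulate_trial(n, seed=0):
    """Toy randomized trial (an assumption for this sketch): treatment helps when X[:, 0] > 0."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n, 3))
    D = rng.integers(0, 2, size=n)              # randomized treatment, P(D = 1) = 0.5
    Y = 2.0 * X[:, 0] * D + rng.normal(size=n)  # observed outcome
    return X, D, Y


def treat_nobody(X):
    """Baseline policy: assign treatment to no one."""
    return np.zeros(len(X))


def fit_policy(X, D, Y):
    """Hypothetical learner: one linear outcome model per arm; treat if the predicted effect is positive."""
    Z = np.column_stack([np.ones(len(X)), X])
    beta1, *_ = np.linalg.lstsq(Z[D == 1], Y[D == 1], rcond=None)
    beta0, *_ = np.linalg.lstsq(Z[D == 0], Y[D == 0], rcond=None)

    def policy(X_new):
        Z_new = np.column_stack([np.ones(len(X_new)), X_new])
        return (Z_new @ (beta1 - beta0) > 0).astype(float)

    return policy


def ipw_value_diff(new_policy, old_policy, X, D, Y, propensity=0.5):
    """IPW estimate of the average-outcome difference between two policies on data (X, D, Y)."""
    w = D / propensity - (1 - D) / (1 - propensity)  # signed inverse-propensity weight
    return np.mean(Y * w * (new_policy(X) - old_policy(X)))


def cram_policy_value_diff(X, D, Y, n_batches, baseline_policy, propensity=0.5):
    """Learn a policy batch by batch and evaluate each update on the not-yet-used batches."""
    batches = np.array_split(np.arange(len(Y)), n_batches)

    # policies[t] is learned from the first t batches; policies[0] is the baseline.
    policies = [baseline_policy]
    for t in range(1, n_batches + 1):
        seen = np.concatenate(batches[:t])
        policies.append(fit_policy(X[seen], D[seen], Y[seen]))

    # Score the change from policies[t-1] to policies[t] on the batches after the t-th one.
    # The last update has no remaining data, so its increment is omitted.
    estimate = 0.0
    for t in range(1, n_batches):
        rest = np.concatenate(batches[t:])
        estimate += ipw_value_diff(
            policies[t], policies[t - 1], X[rest], D[rest], Y[rest], propensity
        )
    return policies[-1], estimate


if __name__ == "__main__":
    X, D, Y = simulate_trial(n=2000, seed=0)
    final_policy, value_gain = cram_policy_value_diff(
        X, D, Y, n_batches=10, baseline_policy=treat_nobody
    )
    print("estimated gain over treating nobody:", value_gain)
```

In this sketch every observation is used for learning, and every observation except those in the first batch is also used for evaluation, which is the contrast with sample-splitting that the abstract draws. Because the last update is not scored, the summed increments target the policy learned from all but the final batch; this is close to the final policy only if the learner changes little between late batches. The sketch refits from scratch after each batch for simplicity, whereas an online learner would update incrementally.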