The bootstrap is a popular data-driven method for quantifying statistical uncertainty, but in modern high-dimensional problems it can incur heavy computational costs because resamples must be generated and models refit repeatedly. We study the use of the bootstrap in high-dimensional settings with a small number of resamples. In particular, we show that, under a recent "cheap" bootstrap perspective, as few as one resample can attain valid coverage even when the dimension grows closely with the sample size, which strongly supports the implementability of the bootstrap for large-scale problems. We validate our theoretical results and compare the performance of our approach against other benchmarks through a range of experiments.
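To make the idea of an interval built from very few resamples concrete, the following is a minimal sketch of one cheap-bootstrap-style confidence interval. The abstract does not spell out the construction, so this assumes the commonly described form: the point estimate plus or minus a Student's t critical value (with degrees of freedom equal to the number of resamples) times the root mean squared deviation of the resample estimates from the original estimate. The function name `cheap_bootstrap_ci`, its parameters, and the hard-coded 95% critical-value table are illustrative assumptions, and the scalar example does not reproduce the high-dimensional setting studied here.

```python
import math
import random
import statistics

# Standard two-sided 95% critical values of Student's t for 1..10 degrees
# of freedom (hard-coded so the sketch needs only the standard library).
T_975 = {1: 12.706, 2: 4.303, 3: 3.182, 4: 2.776, 5: 2.571,
         6: 2.447, 7: 2.365, 8: 2.306, 9: 2.262, 10: 2.228}


def cheap_bootstrap_ci(data, estimator, num_resamples=1, rng=None):
    """Return an assumed cheap-bootstrap-style 95% interval (lower, upper).

    data          -- sequence of observations
    estimator     -- function mapping a sequence to a scalar estimate
    num_resamples -- number of bootstrap resamples (1..10 in this sketch)
    rng           -- optional random.Random instance for reproducibility
    """
    if num_resamples not in T_975:
        raise ValueError("this sketch only tabulates 1 to 10 resamples")
    rng = rng or random.Random()
    n = len(data)
    point = estimator(data)

    # Sum of squared deviations of each resample estimate from the
    # original estimate (centered at the point estimate, not at the
    # resample mean, so a single resample is enough).
    squared_dev = 0.0
    for _ in range(num_resamples):
        resample = rng.choices(data, k=n)  # draw n points with replacement
        squared_dev += (estimator(resample) - point) ** 2

    spread = math.sqrt(squared_dev / num_resamples)
    half_width = T_975[num_resamples] * spread
    return point - half_width, point + half_width


if __name__ == "__main__":
    rng = random.Random(0)
    sample = [rng.gauss(0.0, 1.0) for _ in range(200)]
    print(cheap_bootstrap_ci(sample, statistics.fmean, num_resamples=1, rng=rng))
```

With a single resample the critical value is large (12.706), so the interval is typically wide; the trade-off in this sketch is one model refit instead of the many refits that a conventional quantile-based bootstrap interval would need.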