This paper introduces a novel, computationally efficient algorithm for predictive inference (PI) that requires no distributional assumptions on the data and can be computed faster than existing bootstrap-type methods for neural networks. Specifically, if there are $n$ training samples, bootstrap methods require training a model on each of the $n$ leave-one-out subsamples of size $n-1$; for large models like neural networks, this process can be computationally prohibitive. In contrast, our proposed method trains one neural network on the full dataset with $(\epsilon, \delta)$-differential privacy (DP) and then approximates each leave-one-out model efficiently using a linear approximation around the differentially private neural network estimate. With exchangeable data, we prove that our approach has a rigorous coverage guarantee that depends on the preset privacy parameters and the stability of the neural network, regardless of the data distribution. Simulations and experiments on real data demonstrate that our method satisfies the coverage guarantees with substantially reduced computation compared to bootstrap methods.
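To make the leave-one-out approximation concrete, the following is a minimal sketch under strong simplifying assumptions, not the method as implemented in the paper. A ridge regression stands in for the neural network, the DP training step is omitted entirely, and the interval is built jackknife+-style, which the abstract does not specify. All function names (`fit_ridge`, `approx_loo_params`, `jackknife_plus_interval`) are hypothetical. Each leave-one-out parameter vector is obtained from a single linearized (Newton-type) step around the full-data fit instead of retraining.

```python
import numpy as np


def fit_ridge(X, y, lam):
    """Fit ridge regression on the full data (stand-in for the single trained network).

    Minimizes 0.5 * sum_i (x_i @ theta - y_i)**2 + 0.5 * lam * ||theta||**2.
    Returns the estimate and the Hessian of that objective.
    """
    d = X.shape[1]
    H = X.T @ X + lam * np.eye(d)
    theta = np.linalg.solve(H, X.T @ y)
    return theta, H


def approx_loo_params(X, y, theta, H):
    """Approximate all n leave-one-out estimates without retraining.

    Dropping point i removes its loss gradient g_i = (x_i @ theta - y_i) * x_i,
    so one linearized step around theta gives theta_{-i} ~= theta + H^{-1} g_i.
    Using the full-data Hessian H (not H minus x_i x_i^T) is the approximation.
    """
    resid = X @ theta - y                # shape (n,)
    grads = resid[:, None] * X           # shape (n, d), row i is g_i
    return theta + np.linalg.solve(H, grads.T).T


def jackknife_plus_interval(X, y, loo_thetas, x_new, alpha):
    """Jackknife+-style interval from approximate leave-one-out models (assumed form)."""
    n = X.shape[0]
    loo_fit = np.einsum("ij,ij->i", X, loo_thetas)   # prediction at x_i from model without i
    R = np.abs(y - loo_fit)                          # leave-one-out residuals
    preds = loo_thetas @ x_new                       # each leave-one-out model at x_new
    lo_vals = np.sort(preds - R)
    hi_vals = np.sort(preds + R)
    k_lo = int(np.floor(alpha * (n + 1)))            # 1-indexed ranks
    k_hi = int(np.ceil((1 - alpha) * (n + 1)))
    lower = lo_vals[k_lo - 1] if k_lo >= 1 else -np.inf
    upper = hi_vals[k_hi - 1] if k_hi <= n else np.inf
    return lower, upper


# Usage on synthetic data (all values illustrative).
rng = np.random.default_rng(0)
n, d = 200, 3
X = rng.normal(size=(n, d))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.5 * rng.normal(size=n)

theta_hat, H = fit_ridge(X, y, lam=1.0)
loo_thetas = approx_loo_params(X, y, theta_hat, H)
lower, upper = jackknife_plus_interval(X, y, loo_thetas, rng.normal(size=d), alpha=0.1)
```

The cost is one model fit plus one linear solve, rather than $n$ separate fits. For a real neural network the Hessian would itself need to be approximated, and the coverage guarantee described above depends on the DP training step that this sketch leaves out.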