Deep ensembles are a simple yet powerful way to improve the performance of deep neural networks. Motivated by this, recent work on mode connectivity has shown that the parameters of ensemble members are connected by low-loss subspaces, and that ensemble parameters can be collected efficiently from those subspaces. Although this makes training ensembles efficient, inference still requires a separate forward pass for every set of ensemble parameters, which often becomes a serious bottleneck in real-world deployment. In this work, we propose a novel framework to reduce this inference cost. Given a low-loss subspace connecting two modes of a neural network, we build an additional neural network that predicts the output of the original network evaluated at a given point in that subspace. This additional network, which we call a "bridge", is lightweight: it takes minimal features from the original network and predicts outputs for points in the low-loss subspace without a forward pass through the original network at those points. We empirically demonstrate that such bridge networks can be trained and that they significantly reduce inference costs.
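The idea of a bridge can be illustrated with a toy sketch. The abstract does not specify the bridge architecture, the training loss, or how the low-loss subspace is parameterized, so everything below is an assumption made for illustration: a two-layer MLP stands in for the original network, a quadratic Bezier curve with a hypothetical control point `theta_c` stands in for the subspace, the bridge is a single linear layer on the hidden features plus the curve position `t`, and a squared-error loss on logits stands in for the training objective. All weights are random rather than trained.

```python
import numpy as np

rng = np.random.default_rng(0)
D_IN, D_HID, N_CLS = 8, 16, 3  # toy sizes, chosen arbitrarily


def init_params():
    """Random weights for a toy 2-layer MLP (stand-in for a trained mode)."""
    return {
        "W1": rng.normal(size=(D_IN, D_HID)) * 0.1,
        "W2": rng.normal(size=(D_HID, N_CLS)) * 0.1,
    }


def forward(params, x):
    """Full forward pass; returns hidden features and logits."""
    h = np.maximum(x @ params["W1"], 0.0)  # ReLU hidden layer
    return h, h @ params["W2"]


def bezier_point(theta_a, theta_c, theta_b, t):
    """Parameters at position t on a quadratic Bezier curve (assumed subspace)."""
    return {
        k: (1 - t) ** 2 * theta_a[k]
        + 2 * t * (1 - t) * theta_c[k]
        + t ** 2 * theta_b[k]
        for k in theta_a
    }


def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


# Two "modes" and a control point; in practice these would be trained.
theta_a, theta_b, theta_c = init_params(), init_params(), init_params()

# Hypothetical bridge: one linear layer acting on [features, t].
bridge_W = rng.normal(size=(D_HID + 1, N_CLS)) * 0.1


def bridge(h, t):
    """Predict logits at curve position t from features computed at mode A."""
    t_col = np.full((h.shape[0], 1), t)
    return np.concatenate([h, t_col], axis=1) @ bridge_W


x = rng.normal(size=(4, D_IN))  # a batch of 4 toy inputs
t = 0.5

# Expensive reference: a full forward pass with the interpolated parameters.
_, target_logits = forward(bezier_point(theta_a, theta_c, theta_b, t), x)

# Cheap path: one full pass at mode A, then the lightweight bridge.
h_a, logits_a = forward(theta_a, x)
pred_logits = bridge(h_a, t)

# Assumed training signal: regress the bridge output onto the true logits.
bridge_loss = np.mean((pred_logits - target_logits) ** 2)

# Ensemble of the real mode-A prediction and the bridge-predicted one.
ensemble_probs = 0.5 * (softmax(logits_a) + softmax(pred_logits))
```

In this sketch the ensemble prediction costs one full forward pass plus one small matrix product, instead of two full forward passes. Whether a bridge this simple would be accurate enough is not something the sketch can show.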