Traversing Between Modes in Function Space for Fast Ensembling

Deep ensembling is a simple yet powerful way to improve the performance of deep neural networks. Motivated by this, recent work on mode connectivity has shown that the parameters of ensemble members are connected by low-loss subspaces, and that ensemble parameters can be collected efficiently from those subspaces. While this makes training ensembles cheaper, inference still requires a separate forward pass for every set of ensemble parameters, which is often a serious bottleneck in real-world deployment. In this work, we propose a novel framework to reduce this cost. Given a low-loss subspace connecting two modes of a neural network, we build an additional neural network that predicts the output of the original network evaluated at a given point in the low-loss subspace. This additional network, which we call a "bridge", is lightweight: it takes minimal features from the original network and predicts outputs for the low-loss subspace without further forward passes through the original network. We empirically demonstrate that such bridge networks can be trained and that they significantly reduce inference costs.
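The inference path described above can be illustrated with a minimal sketch. Everything in it is a hypothetical stand-in rather than the paper's architecture: `BaseNet` is a toy two-layer model playing the role of the original network at one mode, `Bridge` is an assumed single linear layer, and feeding the path position `t` to the bridge as an extra input is an illustrative choice. The weights are random and untrained; training the bridge to match the original network's outputs at points in the low-loss subspace is not shown.

```python
import numpy as np

rng = np.random.default_rng(0)


def softmax(z):
    # Numerically stable softmax over the last axis.
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


class BaseNet:
    """Toy stand-in for the original network at one mode (hypothetical)."""

    def __init__(self, in_dim, feat_dim, n_classes):
        self.W_feat = rng.normal(size=(in_dim, feat_dim)) * 0.1
        self.W_out = rng.normal(size=(feat_dim, n_classes)) * 0.1

    def features(self, x):
        # The expensive part of the network; run once per input batch.
        return np.tanh(x @ self.W_feat)

    def logits_from_features(self, h):
        return h @ self.W_out


class Bridge:
    """Assumed lightweight head mapping (features, t) to predicted logits
    of the network evaluated at position t in the low-loss subspace."""

    def __init__(self, feat_dim, n_classes):
        self.W = rng.normal(size=(feat_dim + 1, n_classes)) * 0.1

    def __call__(self, h, t):
        # Append the path position t as one extra input column.
        t_col = np.full((h.shape[0], 1), t)
        return np.concatenate([h, t_col], axis=1) @ self.W


def bridged_ensemble(base, bridge, x, ts):
    # One pass through the base feature extractor, then cheap bridge calls.
    h = base.features(x)
    probs = [softmax(base.logits_from_features(h))]
    probs += [softmax(bridge(h, t)) for t in ts]
    # Average the predicted class probabilities, as in a standard ensemble.
    return np.mean(probs, axis=0)


x = rng.normal(size=(4, 8))      # batch of 4 inputs with 8 features each
base = BaseNet(8, 16, 3)
bridge = Bridge(16, 3)
p = bridged_ensemble(base, bridge, x, ts=[0.5, 1.0])
print(p.shape)
```

The point of the sketch is the cost structure: the feature extractor runs once, and each additional ensemble member costs only one small bridge evaluation instead of a full forward pass.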

