In machine learning, there is renewed interest in neural network ensembles (NNEs), in which predictions are obtained by aggregating a diverse set of smaller models rather than from a single larger model. Here, we show how to define and train an NNE using techniques from the study of rare trajectories in stochastic systems. We define an NNE in terms of the trajectory of the model parameters under simple, discrete-time diffusive dynamics, and train the NNE by biasing these trajectories towards a small time-integrated loss, with the bias controlled by appropriate counting fields that act as hyperparameters. We demonstrate the viability of this technique on a range of simple supervised learning tasks, and discuss potential advantages of our trajectory sampling approach over more conventional gradient-based methods.
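To make the idea of biasing parameter trajectories concrete, the following is a minimal sketch and not the authors' implementation. It assumes a linear model with a mean-squared-error loss, Gaussian random-walk increments of width `sigma` as the unbiased diffusive dynamics, a single counting field `s` multiplying the time-integrated loss, and single-time Metropolis updates of the trajectory. All function and parameter names are hypothetical.

```python
import numpy as np


def loss(theta, X, y):
    """Mean squared error of a linear model y ~ X @ theta (assumed model)."""
    return float(np.mean((X @ theta - y) ** 2))


def local_energy(traj, t, theta_t, X, y, s, sigma):
    """Terms of the trajectory weight exponent that depend on time step t.

    The assumed trajectory weight is
        exp(-sum_t |theta_{t+1} - theta_t|^2 / (2 sigma^2) - s * sum_t loss(theta_t)),
    i.e. a Gaussian random walk tilted by the time-integrated loss.
    """
    e = s * loss(theta_t, X, y)
    if t > 0:
        e += np.sum((theta_t - traj[t - 1]) ** 2) / (2 * sigma**2)
    if t < len(traj) - 1:
        e += np.sum((traj[t + 1] - theta_t) ** 2) / (2 * sigma**2)
    return e


def sample_trajectory(X, y, T=16, s=50.0, sigma=0.1, step=0.05,
                      sweeps=1000, seed=0):
    """Metropolis sampling of a parameter trajectory biased by the field s."""
    rng = np.random.default_rng(seed)
    traj = np.zeros((T, X.shape[1]))  # start from the all-zero trajectory
    for _ in range(sweeps):
        for t in range(T):
            # Symmetric Gaussian proposal for the parameters at one time step.
            proposal = traj[t] + step * rng.standard_normal(traj.shape[1])
            delta = (local_energy(traj, t, proposal, X, y, s, sigma)
                     - local_energy(traj, t, traj[t], X, y, s, sigma))
            # Standard Metropolis acceptance rule.
            if delta <= 0 or rng.random() < np.exp(-delta):
                traj[t] = proposal
    return traj


def ensemble_predict(traj, X):
    """Ensemble prediction: average the model output over all time steps."""
    return np.mean(X @ traj.T, axis=1)
```

In this sketch, each time step of the sampled trajectory plays the role of one ensemble member. Setting `s = 0` recovers the unbiased random walk, while larger `s` concentrates the trajectory on parameters with low loss, so `s` and `sigma` act as the hyperparameters of the ensemble.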