Most iterative neural network training methods update the parameters using estimates of the loss function computed over small random subsets (minibatches) of the data, which helps decouple the training time from the (often very large) size of the training dataset. Here, we show that a minibatch approach can also be used to train neural network ensembles (NNEs) via trajectory methods in a highly efficient manner. We illustrate this approach by training NNEs to classify images in the MNIST dataset. The method reduces training time by a factor that scales as the ratio of the dataset size to the average minibatch size, which in the case of MNIST typically amounts to a computational improvement of two orders of magnitude. We also highlight the advantages of representing NNEs with longer trajectories: improved inference accuracy and a lower update cost, measured in the number of samples needed in minibatch updates.
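The scaling argument can be illustrated with a minimal sketch. It is not the trajectory method itself, only the generic minibatch idea it builds on: a uniformly sampled minibatch gives an unbiased estimate of the mean loss at a cost proportional to the minibatch size rather than the dataset size. The per-sample losses below are synthetic, the function names are hypothetical, and the minibatch size of 600 is an assumption chosen for illustration; only the MNIST training set size of 60,000 is a known figure.

```python
import random


def full_loss(per_sample_losses):
    """Exact mean loss; touches every sample, so cost grows with N."""
    return sum(per_sample_losses) / len(per_sample_losses)


def minibatch_loss(per_sample_losses, batch_size, rng):
    """Unbiased estimate of the mean loss from a uniform random subset.

    Cost grows with batch_size instead of the dataset size.
    """
    batch = rng.sample(per_sample_losses, batch_size)
    return sum(batch) / batch_size


def cost_ratio(dataset_size, batch_size):
    """Loss evaluations saved per update: dataset size over minibatch size."""
    return dataset_size / batch_size


rng = random.Random(0)          # fixed seed so the sketch is reproducible
N = 60_000                      # MNIST training set size
B = 600                         # ASSUMPTION: illustrative minibatch size

# ASSUMPTION: synthetic stand-in for per-sample losses of some fixed model.
losses = [rng.random() for _ in range(N)]

exact = full_loss(losses)
estimates = [minibatch_loss(losses, B, rng) for _ in range(200)]
mean_estimate = sum(estimates) / len(estimates)
speedup = cost_ratio(N, B)      # 60000 / 600 = 100.0

print(f"exact mean loss:         {exact:.4f}")
print(f"mean minibatch estimate: {mean_estimate:.4f}")
print(f"per-update cost ratio:   {speedup:.0f}x")
```

With these assumed sizes the per-update cost ratio is 100, the same order as the improvement quoted for MNIST. Individual minibatch estimates are noisy, and the price of that noise is what the choice of minibatch size trades against.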