This paper presents an in-depth analysis of the scale generalisation properties of scale-covariant and scale-invariant Gaussian derivative networks, complemented with both conceptual and algorithmic extensions. For this purpose, Gaussian derivative networks are evaluated on new rescaled versions of the Fashion-MNIST and CIFAR-10 datasets, with spatial scaling variations over a factor of 4 in the testing data that are not present in the training data. Additionally, evaluations on the previously existing STIR datasets show that the Gaussian derivative networks achieve better scale generalisation on these datasets than previously reported for other types of deep networks. We first experimentally demonstrate that the Gaussian derivative networks have quite good scale generalisation properties on the new datasets, and that average pooling of feature responses over scales may sometimes lead to better results than the previously used approach of max pooling over scales. Then, we demonstrate that adding a spatial max pooling mechanism after the final layer enables localisation of non-centred objects in the image domain, with maintained scale generalisation properties. We also show that regularisation during training, by applying dropout across the scale channels, referred to as scale-channel dropout, improves both the performance and the scale generalisation. In additional ablation studies, we demonstrate that discretisations of Gaussian derivative networks based on the discrete analogue of the Gaussian kernel, in combination with central difference operators, perform best, or among the best, compared to a set of other discrete approximations of the Gaussian derivative kernels. Finally, by visualising the activation maps and the learned receptive fields, we demonstrate that the Gaussian derivative networks have very good explainability properties.
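The best-performing discretisation named above can be sketched in minimal form. For a 1-D signal, the discrete analogue of the Gaussian kernel is T(n, t) = e^{-t} I_n(t), where I_n is the modified Bessel function of integer order (computed here via SciPy's exponentially scaled `ive`), and spatial derivatives are approximated by central differences. This is an illustrative sketch, not the paper's implementation; the function names are assumptions:

```python
import numpy as np
from scipy.special import ive  # exponentially scaled Bessel: ive(n, t) = e^{-t} I_n(t)

def discrete_gaussian_kernel(t, radius):
    """Discrete analogue of the Gaussian kernel, T(n, t) = e^{-t} I_n(t),
    sampled for integer offsets n in [-radius, radius].
    (Illustrative sketch, not the paper's code.)"""
    n = np.arange(-radius, radius + 1)
    return ive(n, t)

def central_difference(signal):
    """First-order central difference, (delta_x f)(n) = (f(n+1) - f(n-1)) / 2,
    applied by discrete convolution."""
    return np.convolve(signal, [0.5, 0.0, -0.5], mode="same")
```

Smoothing a signal with T(n, t) and then applying the central difference operator yields a discrete approximation of the first-order Gaussian derivative at scale t; higher-order derivatives follow by repeated differencing.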
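The two scale-channel mechanisms compared in the abstract, max versus average pooling over scales and scale-channel dropout, can be illustrated with a minimal NumPy sketch. The array shapes and function names below are assumptions made for illustration, not the paper's implementation:

```python
import numpy as np

def pool_over_scales(scale_logits, mode="max"):
    """Combine per-scale-channel outputs of shape (num_scales, num_classes)
    into one scale-invariant prediction, by max or average pooling over
    the scale axis. (Illustrative sketch.)"""
    if mode == "max":
        return scale_logits.max(axis=0)
    return scale_logits.mean(axis=0)

def scale_channel_dropout(scale_features, p, rng, training=True):
    """Drop entire scale channels with probability p during training,
    with inverted-dropout rescaling so the expected response is unchanged.
    (Hypothetical sketch of the regularisation idea.)"""
    if not training or p == 0.0:
        return scale_features
    keep = rng.random(scale_features.shape[0]) >= p
    mask = keep.astype(scale_features.dtype) / (1.0 - p)
    return scale_features * mask[:, None]
```

At test time, dropout is disabled and the features pass through unchanged; only the pooling over scales is applied to form the final prediction.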