The widespread popularity of equivariant networks underscores the importance of parameter-efficient models and effective use of training data. At a time when robustness to unseen deformations is becoming increasingly important, we present H-NeXt, which bridges the gap between equivariance and invariance. H-NeXt is a parameter-efficient roto-translation invariant network that is trained without a single augmented image in the training set. Our network comprises three components: an equivariant backbone for learning roto-translation independent features, an invariant pooling layer for discarding roto-translation information, and a classification layer. H-NeXt outperforms the state of the art in classification when trained on unaugmented training sets and evaluated on augmented test sets of MNIST and CIFAR-10.
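The split between an equivariant backbone and an invariant pooling layer can be illustrated with a toy example. The sketch below is not the H-NeXt architecture. It assumes the simplest possible setting: only 90-degree rotations (the cyclic group C4), cyclic (wrap-around) translations, integer-valued images, and a hand-picked filter bank. All names (`equivariant_backbone`, `invariant_pool`, `IMAGE`, `FILTERS`) are hypothetical and chosen for illustration. The backbone correlates the image with every rotated copy of each filter, so rotating or shifting the input only permutes the responses. The pooling layer then takes a global maximum over positions and orientations, which discards that permutation.

```python
def rot90(m):
    """Rotate a square matrix by 90 degrees."""
    n = len(m)
    return [[m[j][n - 1 - i] for j in range(n)] for i in range(n)]


def shift(m, dy, dx):
    """Cyclically translate a square matrix by (dy, dx)."""
    n = len(m)
    return [[m[(i + dy) % n][(j + dx) % n] for j in range(n)] for i in range(n)]


def circular_correlate(x, f):
    """Cross-correlate image x with filter f using wrap-around indexing."""
    n, k = len(x), len(f)
    return [[sum(f[a][b] * x[(i + a) % n][(j + b) % n]
                 for a in range(k) for b in range(k))
             for j in range(n)]
            for i in range(n)]


def equivariant_backbone(x, filters):
    """Toy equivariant stage: one response map per filter and per rotation.

    Rotating or shifting x permutes the values inside these maps and
    cycles the orientation axis, but does not change the values themselves.
    """
    responses = []
    for f in filters:
        maps = []
        for _ in range(4):          # the four elements of C4 (an assumption)
            maps.append(circular_correlate(x, f))
            f = rot90(f)
        responses.append(maps)
    return responses


def invariant_pool(responses):
    """Toy invariant stage: global max over positions and orientations."""
    return [max(v for m in maps for row in m for v in row)
            for maps in responses]


def invariant_features(x, filters):
    """Backbone followed by pooling; a classifier would consume this vector."""
    return invariant_pool(equivariant_backbone(x, filters))


# Illustrative data only: an asymmetric 5x5 image and two 3x3 filters.
IMAGE = [
    [0, 1, 0, 0, 2],
    [3, 0, 0, 1, 0],
    [0, 0, 4, 0, 0],
    [0, 2, 0, 0, 1],
    [5, 0, 0, 3, 0],
]
FILTERS = [
    [[1, 0, -1], [2, 0, -2], [1, 0, -1]],
    [[0, 1, 0], [1, -3, 0], [0, 0, 2]],
]

if __name__ == "__main__":
    print(invariant_features(IMAGE, FILTERS))
    print(invariant_features(rot90(shift(IMAGE, 1, 2)), FILTERS))
```

Both printed feature vectors are identical, even though the second input was shifted and rotated and no transformed image was ever used to fit anything. Handling arbitrary rotation angles, as a roto-translation invariant network must, requires a richer backbone than this C4 toy; the sketch only shows why pooling over an equivariant representation yields invariance.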