In computer vision, models must be able to adapt to changes in image resolution to effectively carry out tasks such as image segmentation; this property is known as scale-equivariance. Recent works have made progress in developing scale-equivariant convolutional neural networks, e.g., through weight-sharing and kernel resizing. However, these networks are not truly scale-equivariant in practice. Specifically, because they formulate the down-scaling operation in the continuous domain, they do not account for anti-aliasing. To address this shortcoming, we formulate down-scaling directly in the discrete domain and take anti-aliasing into account. We then propose a novel architecture based on Fourier layers that achieves truly scale-equivariant deep nets, i.e., exactly zero equivariance-error. Following prior works, we test this model on the MNIST-scale and STL-10 datasets. Our proposed model achieves competitive classification performance while maintaining zero equivariance-error.
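As an illustration of the idea, the following is a minimal 1D NumPy sketch, not the architecture proposed here. It assumes that anti-aliased down-scaling is modeled as an ideal low-pass filter (cropping the discrete Fourier spectrum) and that a "Fourier layer" is a per-frequency multiplication by learned weights. The function names `downscale` and `fourier_layer`, the odd signal lengths (chosen to avoid a Nyquist bin), and the random weights are all hypothetical choices made for the example.

```python
import numpy as np


def downscale(x, m):
    """Anti-aliased down-scaling in the discrete domain (assumed form):
    keep the m // 2 + 1 lowest rFFT bins, then invert at length m."""
    n = x.shape[-1]
    spectrum = np.fft.rfft(x)
    # The factor m / n keeps signal amplitudes comparable across resolutions.
    return np.fft.irfft(spectrum[..., : m // 2 + 1], n=m) * (m / n)


def fourier_layer(x, weights):
    """Hypothetical Fourier layer: multiply each frequency bin by a weight.
    A lower-resolution input simply uses the first few weights."""
    n = x.shape[-1]
    spectrum = np.fft.rfft(x)
    k = spectrum.shape[-1]
    return np.fft.irfft(spectrum * weights[:k], n=n)


rng = np.random.default_rng(0)
n, m = 33, 17  # odd lengths, so there is no Nyquist bin to special-case
x = rng.standard_normal(n)
weights = rng.standard_normal(n // 2 + 1) + 1j * rng.standard_normal(n // 2 + 1)
weights[0] = weights[0].real  # keep the DC weight real so outputs stay real

# Equivariance check: layer-then-downscale versus downscale-then-layer.
a = downscale(fourier_layer(x, weights), m)
b = fourier_layer(downscale(x, m), weights)
equivariance_error = np.max(np.abs(a - b))

# For contrast: naive subsampling without anti-aliasing does not commute.
a_naive = fourier_layer(x, weights)[::2]
b_naive = fourier_layer(x[::2], weights)
naive_error = np.max(np.abs(a_naive - b_naive))

print(f"anti-aliased error: {equivariance_error:.2e}")
print(f"naive subsampling error: {naive_error:.2e}")
```

In this sketch the two paths agree up to floating-point round-off because cropping the spectrum and multiplying it pointwise commute, whereas naive subsampling folds high frequencies into low ones and breaks that property. The sketch covers only a linear spectral layer; nonlinearities, pooling, and classification heads are outside its scope.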