Image keypoint descriptions that are discriminative and matchable over large changes in viewpoint are vital for 3D reconstruction. However, descriptions output by learned descriptors are typically not robust to camera rotation. While they can be made more robust by, e.g., data augmentation, this degrades performance on upright images. Another approach is test-time augmentation, which incurs a significant increase in runtime. We instead learn a linear transform in description space that encodes rotations of the input image. We call this linear transform a steerer since it allows us to transform the descriptions as if the image were rotated. From representation theory, we know all possible steerers for the rotation group. Steerers can be optimized (A) given a fixed descriptor or (B) jointly with a descriptor; alternatively, (C) a descriptor can be optimized given a fixed steerer. We perform experiments in all three settings and obtain state-of-the-art results on the rotation-invariant image matching benchmarks AIMS and Roto-360. We publish code and model weights at github.com/georg-bn/rotation-steerers.
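To make the steerer idea concrete, the following is a minimal sketch, not the released implementation. It assumes rotations are restricted to multiples of 90 degrees, so the steerer S must satisfy S^4 = I, and it uses a hand-picked block-diagonal S rather than a learned one. The function names `make_c4_steerer` and `match_with_steerer` are hypothetical, and random unit vectors stand in for real keypoint descriptions. One plausible way to use a steerer at matching time is to steer the descriptions of one image through all four rotations and keep the rotation with the highest nearest-neighbour similarity, which avoids rerunning the descriptor on rotated images.

```python
import numpy as np


def make_c4_steerer(dim: int) -> np.ndarray:
    """Hand-picked steerer for 90-degree rotations (an assumption, not a learned one).

    It is block-diagonal with 2x2 quarter-turn blocks, so S @ S @ S @ S = I.
    """
    if dim % 2 != 0:
        raise ValueError("dim must be even")
    quarter_turn = np.array([[0.0, -1.0], [1.0, 0.0]])
    return np.kron(np.eye(dim // 2), quarter_turn)


def match_with_steerer(desc_a, desc_b, steerer, n_rot=4):
    """Steer desc_a through n_rot rotations and keep the best-matching one.

    desc_a, desc_b: arrays of shape (N, dim) and (M, dim) with unit-norm rows.
    Returns (k, nn): the number of steerer applications that maximises the
    summed nearest-neighbour similarity, and the nearest neighbour in desc_b
    for each row of desc_a under that rotation.
    """
    best_score, best_k, best_nn = -np.inf, 0, None
    steered = desc_a
    for k in range(n_rot):
        sim = steered @ desc_b.T              # cosine similarities
        score = sim.max(axis=1).sum()         # total nearest-neighbour similarity
        if score > best_score:
            best_score, best_k, best_nn = score, k, sim.argmax(axis=1)
        steered = steered @ steerer.T         # apply S to every row vector
    return best_k, best_nn


# Toy usage: descriptions of image A are those of image B steered three times,
# mimicking a 270-degree rotation of the input image.
rng = np.random.default_rng(0)
S = make_c4_steerer(16)
desc_b = rng.normal(size=(50, 16))
desc_b /= np.linalg.norm(desc_b, axis=1, keepdims=True)
desc_a = desc_b @ np.linalg.matrix_power(S, 3).T
k, nn = match_with_steerer(desc_a, desc_b, S)
print("steerer applications needed:", k)
print("all keypoints matched to themselves:", bool((nn == np.arange(50)).all()))
```

The loop costs only a few matrix products in description space, in contrast to test-time augmentation, which requires extracting descriptions from each rotated copy of the image.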