Rotated object detection aims to identify and locate objects with arbitrary orientations in images. In this setting, object orientations vary considerably across images, and multiple orientations coexist within a single image. This intrinsic characteristic makes it challenging for standard backbone networks to extract high-quality features of these arbitrarily oriented objects. In this paper, we present the Adaptive Rotated Convolution (ARC) module to address these challenges. In the ARC module, the convolution kernels rotate adaptively to extract object features with varying orientations in different images, and an efficient conditional computation mechanism is introduced to accommodate the large orientation variations of objects within an image. The two designs work together seamlessly for rotated object detection. Moreover, ARC can conveniently serve as a plug-and-play module in various vision backbones, boosting their representation ability to detect oriented objects accurately. Experiments on commonly used benchmarks (DOTA and HRSC2016) demonstrate that equipping the backbone network with the proposed ARC module significantly improves the performance of multiple popular oriented object detectors (e.g., +3.03% mAP on Rotated RetinaNet and +4.16% on CFA). Combined with the highly competitive Oriented R-CNN, the proposed approach achieves state-of-the-art performance on the DOTA dataset with 81.77% mAP.
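To make the idea of adaptively rotated kernels concrete, the following is a minimal sketch and not the implementation behind the reported results. The abstract does not say how rotation angles are predicted or how the conditional computation is realized, so this sketch assumes that per-image angles and combination weights are already given, that kernels are rotated about their center with bilinear resampling, and that the rotated kernels are blended into one kernel before convolution. The names `rotate_kernel` and `adaptive_rotated_kernel` are hypothetical.

```python
import math


def rotate_kernel(kernel, theta):
    """Rotate a square k x k kernel by theta radians about its center.

    Assumption: bilinear resampling, with zeros outside the kernel grid.
    """
    k = len(kernel)
    c = (k - 1) / 2.0
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    out = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(k):
            # Map each output position back into the source kernel.
            y, x = i - c, j - c
            src_x = cos_t * x + sin_t * y + c
            src_y = -sin_t * x + cos_t * y + c
            x0, y0 = math.floor(src_x), math.floor(src_y)
            fx, fy = src_x - x0, src_y - y0
            value = 0.0
            # Bilinear interpolation over the four neighboring taps.
            for dy, wy in ((0, 1 - fy), (1, fy)):
                for dx, wx in ((0, 1 - fx), (1, fx)):
                    yy, xx = y0 + dy, x0 + dx
                    if 0 <= yy < k and 0 <= xx < k:
                        value += wy * wx * kernel[yy][xx]
            out[i][j] = value
    return out


def adaptive_rotated_kernel(kernels, angles, weights):
    """Rotate each candidate kernel by its own angle and blend the results.

    Assumption: `angles` and `weights` come from some input-dependent
    predictor that is not shown here.
    """
    k = len(kernels[0])
    combined = [[0.0] * k for _ in range(k)]
    for kernel, theta, w in zip(kernels, angles, weights):
        rotated = rotate_kernel(kernel, theta)
        for i in range(k):
            for j in range(k):
                combined[i][j] += w * rotated[i][j]
    return combined


# Usage: blend a vertical-line kernel with a copy of itself rotated a quarter turn.
vertical = [[0, 1, 0], [0, 1, 0], [0, 1, 0]]
blended = adaptive_rotated_kernel(
    [vertical, vertical], [0.0, math.pi / 2], [0.5, 0.5]
)
```

Blending the rotated kernels first means only one convolution is applied per layer regardless of how many candidate orientations are used. That is one plausible reading of "efficient conditional computation", offered here as an assumption.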