Rotated object detection aims to identify and locate objects with arbitrary orientations in images. In this setting, object orientations vary considerably across images, and objects with multiple orientations often appear within a single image. This intrinsic characteristic makes it challenging for standard backbone networks to extract high-quality features of these arbitrarily oriented objects. In this paper, we present the Adaptive Rotated Convolution (ARC) module to address these challenges. In the ARC module, the convolution kernels rotate adaptively to extract features of objects whose orientations vary across images, and an efficient conditional computation mechanism is introduced to accommodate the large orientation variations of objects within an image. The two designs work together seamlessly for rotated object detection. Moreover, ARC can conveniently serve as a plug-and-play module in various vision backbones, boosting their representation ability to detect oriented objects accurately. Experiments on commonly used benchmarks (DOTA and HRSC2016) demonstrate that equipping the backbone network with the proposed ARC module significantly improves the performance of multiple popular oriented object detectors (\eg +3.03\% mAP on Rotated RetinaNet and +4.16\% on CFA). Combined with the highly competitive Oriented R-CNN, the proposed approach achieves state-of-the-art performance on the DOTA dataset with 81.77\% mAP. Code is available at \url{https://github.com/LeapLabTHU/ARC}.
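To make the idea of adaptively rotating kernels concrete, the following is a minimal NumPy sketch and not the released implementation. It rests on several assumptions that the abstract does not state: kernels are rotated by bilinear resampling about their center, the conditional computation is modeled as a weighted blend of several rotated candidate kernels, and the angles and weights are supplied by hand rather than predicted from the input features. The names `rotate_kernel` and `adaptive_kernel` are hypothetical.

```python
import numpy as np


def rotate_kernel(kernel, theta):
    """Resample a square 2-D kernel rotated by theta radians about its center.

    Assumption: bilinear interpolation with zero padding; the sign convention
    of the rotation is an arbitrary choice made for this sketch.
    """
    k = kernel.shape[0]
    c = (k - 1) / 2.0
    coords = np.arange(k) - c
    # ys varies along rows, xs along columns (centered coordinates).
    ys, xs = np.meshgrid(coords, coords, indexing="ij")
    cos_t, sin_t = np.cos(theta), np.sin(theta)
    # Inverse mapping: source location sampled for each output tap.
    src_x = cos_t * xs + sin_t * ys + c
    src_y = -sin_t * xs + cos_t * ys + c
    x0 = np.floor(src_x).astype(int)
    y0 = np.floor(src_y).astype(int)
    out = np.zeros_like(kernel, dtype=float)
    for dy in (0, 1):
        for dx in (0, 1):
            xi, yi = x0 + dx, y0 + dy
            # Bilinear weight of this neighbouring tap.
            w = (1 - np.abs(src_x - xi)) * (1 - np.abs(src_y - yi))
            valid = (xi >= 0) & (xi < k) & (yi >= 0) & (yi < k)
            vals = kernel[np.clip(yi, 0, k - 1), np.clip(xi, 0, k - 1)]
            out += np.where(valid, w * vals, 0.0)
    return out


def adaptive_kernel(kernels, thetas, weights):
    """Blend several rotated candidate kernels into one kernel.

    Assumption: in a real module the angles and weights would be predicted
    per image; here they are plain arguments.
    """
    return sum(w * rotate_kernel(k, t) for k, t, w in zip(kernels, thetas, weights))


if __name__ == "__main__":
    K = np.arange(9, dtype=float).reshape(3, 3)
    print(adaptive_kernel([K, K], [0.0, np.pi / 4], [0.7, 0.3]))
```

Under these assumptions, the blended kernel has the same shape as a single candidate kernel, so it could be passed to an ordinary convolution routine in place of a static kernel. This is one way to read the plug-and-play claim, not a description of the authors' code.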