Object instances in remote sensing images often appear in multiple orientations, at varying scales, and in dense arrangements. These characteristics pose two challenges for end-to-end oriented object detectors: aligning multi-scale features and handling a large number of queries. To address these challenges, we propose an end-to-end oriented detector equipped with an efficient decoder that incorporates two techniques, Rotated RoI attention (RRoI attention) and Selective Distinct Queries (SDQ). Specifically, RRoI attention focuses on oriented regions of interest through a cross-attention mechanism and aligns multi-scale features. SDQ collects queries from intermediate decoder layers and then filters out similar queries to obtain distinct ones. SDQ thereby facilitates the optimization of one-to-one label assignment without introducing redundant initial queries or extra auxiliary branches. Extensive experiments on five datasets demonstrate the effectiveness of our method. Notably, with a ResNet50 backbone, our method achieves state-of-the-art performance on DIOR-R (67.31% mAP), DOTA-v1.5 (67.43% mAP), and DOTA-v2.0 (53.28% mAP).
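The SDQ idea of pooling queries from intermediate decoder layers and discarding near-duplicates can be illustrated with a minimal sketch. The abstract does not specify how similarity is measured, so everything below is an assumption made for illustration rather than the paper's actual procedure: queries are treated as plain embedding vectors, similarity is cosine similarity, each query carries a hypothetical confidence score, and `select_distinct_queries`, `sim_threshold`, and the toy data are invented names and values.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_distinct_queries(
    queries: Sequence[Sequence[float]],
    scores: Sequence[float],
    sim_threshold: float = 0.9,  # assumed value, not taken from the paper
) -> List[int]:
    """Greedy filter (illustrative assumption, not the paper's exact rule).

    Visits queries from highest to lowest score and keeps a query only if
    its cosine similarity to every already-kept query is below the threshold.
    Returns the kept indices in the order they were selected.
    """
    order = sorted(range(len(queries)), key=lambda i: scores[i], reverse=True)
    kept: List[int] = []
    for i in order:
        if all(
            cosine_similarity(queries[i], queries[j]) < sim_threshold
            for j in kept
        ):
            kept.append(i)
    return kept


# Toy queries pooled from two hypothetical intermediate decoder layers.
layer_a = [[1.0, 0.0], [0.0, 1.0]]
layer_b = [[0.99, 0.05], [-1.0, 0.0]]  # the first one nearly duplicates layer_a[0]
pooled = layer_a + layer_b
pooled_scores = [0.9, 0.8, 0.7, 0.6]

distinct = select_distinct_queries(pooled, pooled_scores)
print(distinct)  # [0, 1, 3]: index 2 is dropped as a near-duplicate of index 0
```

In this sketch the filter operates only on queries that already exist in the decoder, which mirrors the abstract's point that SDQ needs no redundant initial queries or auxiliary branches. A real detector would more likely compare predicted oriented boxes or learned embeddings, and the greedy highest-score-first order is just one plausible choice.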