With the rapidly increasing demand for oriented object detection (OOD), recent research on weakly-supervised detectors that learn rotated boxes (RBox) from horizontal boxes (HBox) has attracted increasing attention. In this paper, we explore a more challenging yet label-efficient setting, namely single point-supervised OOD, and present our approach, called Point2RBox. Specifically, we propose to leverage two principles: 1) Synthetic pattern knowledge combination: By sampling around each labeled point on the image, we spread the object feature to synthetic visual patterns with known boxes to provide the knowledge for box regression. 2) Transform self-supervision: Given a transformed input image (e.g., scaled or rotated), the output RBoxes are trained to follow the same transformation so that the network can perceive the relative size and rotation between objects. The detector is further enhanced by several techniques devised to cope with peripheral issues, e.g., anchor/layer assignment, since the object size is not available in our point-supervision setting. To the best of our knowledge, Point2RBox is the first end-to-end solution for point-supervised OOD. In particular, our method uses a lightweight paradigm yet achieves competitive performance among point-supervised alternatives: 41.05%/27.62%/80.01% on the DOTA/DIOR/HRSC datasets.
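To make the transform self-supervision principle concrete, the following is a minimal sketch of one way such a consistency term could be written. It is an illustration under stated assumptions, not the loss used in Point2RBox: the RBox parameterization `(cx, cy, w, h, theta)`, the function names `transform_rbox`, `angle_diff`, and `consistency_loss`, the log-ratio size term, and the sign convention of the rotation are all hypothetical choices made for this example.

```python
import math


def transform_rbox(box, angle=0.0, scale=1.0, center=(0.0, 0.0)):
    """Map an RBox (cx, cy, w, h, theta) through a rotation by `angle`
    (radians) and an isotropic scaling by `scale`, both about `center`.

    Assumption: theta and `angle` share the same sign convention.
    """
    cx, cy, w, h, theta = box
    ox, oy = center
    dx, dy = cx - ox, cy - oy
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    new_cx = ox + scale * (cos_a * dx - sin_a * dy)
    new_cy = oy + scale * (sin_a * dx + cos_a * dy)
    # Width and height scale with the image; the angle shifts by the rotation.
    return (new_cx, new_cy, scale * w, scale * h, theta + angle)


def angle_diff(a, b):
    """Smallest absolute difference between two box angles.

    A rectangle is unchanged by a rotation of pi, so angles are compared
    with period pi.
    """
    d = (a - b) % math.pi
    return min(d, math.pi - d)


def consistency_loss(pred_orig, pred_view, angle=0.0, scale=1.0,
                     center=(0.0, 0.0)):
    """Penalize disagreement between the box predicted on the transformed
    view and the transformed box predicted on the original view.

    Only size and angle are compared here (hypothetical choice), since in a
    point-supervised setting the center is already given by the label.
    """
    target = transform_rbox(pred_orig, angle, scale, center)
    size_term = (abs(math.log(pred_view[2] / target[2]))
                 + abs(math.log(pred_view[3] / target[3])))
    angle_term = angle_diff(pred_view[4], target[4])
    return size_term + angle_term
```

In this sketch, a prediction on the transformed view that exactly matches the transformed original prediction yields zero loss, and the loss grows as the predicted size or angle drifts from what the known transformation implies. This is how a network can receive a training signal about relative size and rotation without any box annotation.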