Oriented object detection, a specialized subfield of computer vision, is used in diverse scenarios and is particularly effective for objects of arbitrary orientation. Point annotation, which represents each object as a single point, is a cost-effective alternative to rotated and horizontal bounding boxes, but it sacrifices performance because size and orientation information is lost. In this study, we introduce the P2RBox network, which uses point annotations and a mask generator to create mask proposals and then filters them with our Inspector Module and Constrainer Module. This process selects high-quality masks, which are subsequently converted into rotated box annotations for training a fully supervised detector. Specifically, we design an Inspector Module based on multi-instance learning principles to evaluate the semantic score of each mask, and we propose a more robust mask quality assessment in conjunction with the Constrainer Module. Furthermore, we introduce a Symmetry Axis Estimation (SAE) Module, inspired by the spectral theorem for symmetric matrices, to transform the best mask proposal into a rotated bounding box. P2RBox performs well with three fully supervised rotated object detectors: RetinaNet, Rotated FCOS, and Oriented R-CNN. Combined with Oriented R-CNN, P2RBox achieves 62.26% on the DOTA-v1.0 test set. To the best of our knowledge, this is the first attempt to train an oriented object detector with point supervision.
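To make the mask-to-rotated-box step more concrete, the following is a minimal sketch of one way a binary mask could be turned into a rotated box using the spectral theorem. It is not the SAE Module itself, whose details are not given here. It assumes a simpler PCA-style stand-in: the covariance matrix of the mask's pixel coordinates is symmetric, so it has orthogonal eigenvectors, and these are taken as the box axes. The function name `mask_to_rotated_box` and the `(cx, cy, long_side, short_side, angle)` output convention are hypothetical choices for illustration.

```python
import numpy as np


def mask_to_rotated_box(mask):
    """Fit a rotated box to a binary mask (illustrative sketch only).

    Returns (cx, cy, long_side, short_side, angle), where angle is in radians
    in [-pi/2, pi/2), measured in image coordinates (x right, y down).
    Side lengths are measured between pixel centers.
    """
    ys, xs = np.nonzero(mask)
    if xs.size < 2:
        raise ValueError("mask needs at least two foreground pixels")

    pts = np.stack([xs, ys], axis=1).astype(np.float64)
    mean = pts.mean(axis=0)
    centered = pts - mean

    # The 2x2 covariance matrix is symmetric, so by the spectral theorem its
    # eigenvectors are orthogonal. eigh returns eigenvalues in ascending order.
    cov = centered.T @ centered / len(pts)
    _, eigvecs = np.linalg.eigh(cov)

    # Project the pixels onto the eigenvector axes (column 0: minor, 1: major)
    # and take the extent along each axis.
    proj = centered @ eigvecs
    lo, hi = proj.min(axis=0), proj.max(axis=0)
    short_side, long_side = hi - lo

    # Box center: midpoint of the extents, mapped back to image coordinates.
    center = mean + eigvecs @ ((lo + hi) / 2.0)

    # Orientation of the major axis, folded into [-pi/2, pi/2) because an
    # axis and its negation describe the same box.
    major = eigvecs[:, 1]
    angle = np.arctan2(major[1], major[0])
    angle = (angle + np.pi / 2) % np.pi - np.pi / 2

    return float(center[0]), float(center[1]), float(long_side), float(short_side), float(angle)


if __name__ == "__main__":
    demo = np.zeros((20, 40), dtype=np.uint8)
    demo[8:12, 5:35] = 1  # an axis-aligned bar as a toy "mask proposal"
    print(mask_to_rotated_box(demo))
```

Pixel-coordinate covariance is only a proxy for an object's symmetry axis, so this sketch can pick a diagonal for near-square masks; it is meant to show the role of the eigendecomposition rather than to reproduce the reported results.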