This paper introduces the point-axis representation for oriented object detection, a flexible and geometrically intuitive representation built from two key components: points and axes. 1) Points delineate the spatial extent and contours of objects, providing detailed shape descriptions. 2) Axes define the primary directions of objects, providing the orientation cues needed for precise detection. The point-axis representation decouples location and rotation, which addresses the loss discontinuity issues commonly encountered in traditional bounding-box-based approaches. To optimize the representation effectively without introducing additional annotations, we propose the max-projection loss to supervise point set learning and the cross-axis loss for robust axis representation learning. Building on this representation, we present the Oriented DETR model, which integrates it into the DETR framework for precise point-axis prediction and end-to-end detection. Experimental results demonstrate significant performance improvements in oriented object detection tasks.
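The geometric idea behind decoupling rotation from location can be illustrated with a minimal NumPy sketch. It is not the paper's max-projection or cross-axis loss, whose definitions are not given here. All function names (`axis_from_angle`, `axis_distance`, `box_from_points_and_axis`) are hypothetical. The sketch assumes an axis is a 2D unit vector compared up to sign, and that an oriented box can be recovered from the extremal projections of a point set onto that axis and its normal.

```python
import numpy as np


def axis_from_angle(theta):
    """Unit direction vector for an angle theta in radians."""
    return np.array([np.cos(theta), np.sin(theta)])


def axis_distance(a, b):
    """Sign-invariant distance between two unit axes (illustrative only).

    Returns 0 for parallel or anti-parallel axes and 1 for perpendicular ones.
    """
    return 1.0 - abs(float(np.dot(a, b)))


def box_from_points_and_axis(points, axis):
    """Recover (center, width, height) of an oriented box.

    Assumption: the box extent along each direction is the range of the
    point projections onto the axis and onto its normal.
    """
    axis = axis / np.linalg.norm(axis)
    normal = np.array([-axis[1], axis[0]])  # axis rotated by 90 degrees
    along = points @ axis                   # projections onto the axis
    across = points @ normal                # projections onto the normal
    width = along.max() - along.min()
    height = across.max() - across.min()
    center = (0.5 * (along.max() + along.min()) * axis
              + 0.5 * (across.max() + across.min()) * normal)
    return center, width, height


# Two orientations that differ by only 2 degrees but straddle the
# boundary of a 180-degree-periodic angle parameterization.
t1, t2 = np.deg2rad(89.0), np.deg2rad(-89.0)
angle_gap = abs(t1 - t2)  # large, although the boxes nearly coincide
axis_gap = axis_distance(axis_from_angle(t1), axis_from_angle(t2))  # small

# Corners of a 4 x 2 box centered at (1, 1) and rotated by 30 degrees.
main_axis = axis_from_angle(np.deg2rad(30.0))
main_normal = np.array([-main_axis[1], main_axis[0]])
corners = np.array([
    np.array([1.0, 1.0]) + sx * 2.0 * main_axis + sy * 1.0 * main_normal
    for sx in (-1, 1) for sy in (-1, 1)
])
center, width, height = box_from_points_and_axis(corners, main_axis)
```

In this toy setting the raw angle difference is close to π while the sign-invariant axis distance is close to zero, which is the kind of discontinuity an angle-regression loss suffers from at the period boundary. The box size and center are read off the points alone, so the axis only has to carry orientation.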