In the well-developed field of horizontal object detection, computation-friendly IoU-based losses are readily adopted and align well with the detection metrics. In contrast, rotation detectors often rely on a more complicated SkewIoU-based loss that is unfriendly to gradient-based training. In this paper, we propose an effective approximate SkewIoU loss based on Gaussian modeling and the Gaussian product, which consists of two terms. The first term is a scale-insensitive center point loss, used to quickly narrow the distance between the center points of the two bounding boxes. In the distance-independent second term, the product of the Gaussian distributions is adopted to mimic the mechanism of SkewIoU by its definition, and we show that it aligns with the SkewIoU loss at the trend level within a certain distance (i.e., within 9 pixels). This is in contrast to recent Gaussian-modeling-based rotation detectors, e.g., GWD loss and KLD loss, which involve a human-specified distribution distance metric requiring additional hyperparameter tuning that varies across datasets and detectors. The resulting loss, called KFIoU loss, is easier to implement and works better than the exact SkewIoU loss, thanks to its full differentiability and its ability to handle non-overlapping cases. We further extend our technique to the 3-D case, which suffers from the same issues as the 2-D case. Extensive results on various public datasets (2-D/3-D, aerial/text/face images) with different base detectors show the effectiveness of our approach.
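The distance-independent overlap term described above can be illustrated with a minimal NumPy sketch. The abstract does not give the formulas, so everything below is an assumption: a rotated box (w, h, theta) is modeled as a Gaussian with covariance R diag(w^2/4, h^2/4) R^T, the covariance of the Gaussian product is computed in Kalman-filter form, and a box "volume" is recovered as 2^n sqrt(det(Sigma)). The function names `box_to_gaussian`, `box_volume`, and `kfiou_overlap` are hypothetical, and the center point loss is left out.

```python
import numpy as np


def box_to_gaussian(cx, cy, w, h, theta):
    """Model a rotated box as a 2-D Gaussian (assumed convention).

    Returns the mean (box center) and the covariance
    R @ diag(w^2 / 4, h^2 / 4) @ R.T, with theta in radians.
    """
    c, s = np.cos(theta), np.sin(theta)
    rot = np.array([[c, -s], [s, c]])
    scale = np.diag([w ** 2 / 4.0, h ** 2 / 4.0])
    return np.array([cx, cy], dtype=float), rot @ scale @ rot.T


def box_volume(sigma):
    """Area (2-D) or volume (n-D) of the box encoded by a covariance.

    Assumes volume = 2^n * sqrt(det(sigma)), which gives w * h in 2-D
    under the convention used in box_to_gaussian.
    """
    n = sigma.shape[0]
    return (2.0 ** n) * np.sqrt(np.linalg.det(sigma))


def kfiou_overlap(sigma1, sigma2):
    """Approximate overlap from covariances only (centers are ignored).

    The covariance of the product of two Gaussians is written in
    Kalman-filter form: sigma3 = sigma1 - K @ sigma1, where
    K = sigma1 @ inv(sigma1 + sigma2).
    """
    gain = sigma1 @ np.linalg.inv(sigma1 + sigma2)
    sigma3 = sigma1 - gain @ sigma1
    v1, v2, v3 = box_volume(sigma1), box_volume(sigma2), box_volume(sigma3)
    return v3 / (v1 + v2 - v3)


if __name__ == "__main__":
    _, cov_a = box_to_gaussian(0.0, 0.0, 4.0, 2.0, 0.0)
    _, cov_b = box_to_gaussian(0.0, 0.0, 4.0, 2.0, np.pi / 2)
    print(kfiou_overlap(cov_a, cov_a))  # identical boxes
    print(kfiou_overlap(cov_a, cov_b))  # same box rotated by 90 degrees
```

Under these assumptions the overlap of two identical 2-D boxes is 1/3 rather than 1, so the quantity tracks SkewIoU in trend rather than in value. Every step is an ordinary matrix operation, which is what makes such a term differentiable and well defined even when the boxes do not overlap.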