Rotated bounding boxes drastically reduce the output ambiguity of elongated objects, making them superior to axis-aligned bounding boxes. Despite this effectiveness, rotated detectors are not widely employed. Annotating rotated bounding boxes is so laborious that many detection datasets provide only axis-aligned annotations instead. In this paper, we propose a framework that allows the model to predict precise rotated boxes while requiring only the cheaper axis-aligned annotation of the target dataset. To achieve this, we leverage the fact that neural networks are capable of learning a richer representation of the target domain than the task itself utilizes. This under-utilized representation can be exploited to address a more detailed task. Our framework combines task knowledge from an out-of-domain source dataset with stronger annotation and domain knowledge from the target dataset with weaker annotation. A novel assignment process and a projection loss enable co-training on the source and target datasets. As a result, the model is able to solve the more detailed task in the target domain without additional computational overhead during inference. We extensively evaluate the method on various target datasets, including a fresh-produce dataset, HRSC2016, and SSDD. Results show that the proposed method consistently performs on par with the fully supervised approach.
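To illustrate how a rotated prediction could be supervised by an axis-aligned label, the following is a minimal sketch, not the loss defined in this paper. It assumes a rotated box parameterized as (cx, cy, w, h, theta) with theta in radians, projects it onto its tightest enclosing axis-aligned box, and compares that projection with the annotation using a plain L1 distance. The function names, the parameterization, and the choice of L1 are all illustrative assumptions.

```python
import math


def rotated_to_axis_aligned(cx, cy, w, h, theta):
    """Tightest axis-aligned box (x_min, y_min, x_max, y_max) enclosing a rotated box.

    Assumed parameterization: center (cx, cy), side lengths w and h,
    rotation theta in radians. This is an illustrative convention only.
    """
    cos_t = abs(math.cos(theta))
    sin_t = abs(math.sin(theta))
    # Half-extents of the rotated rectangle along the image x and y axes.
    half_w = 0.5 * (w * cos_t + h * sin_t)
    half_h = 0.5 * (w * sin_t + h * cos_t)
    return (cx - half_w, cy - half_h, cx + half_w, cy + half_h)


def projection_l1_loss(pred_rotated, target_aabb):
    """Mean absolute difference between the projected prediction and the label.

    Hypothetical stand-in for a projection loss: the rotated prediction is
    only ever compared with the axis-aligned annotation it projects onto.
    """
    projected = rotated_to_axis_aligned(*pred_rotated)
    return sum(abs(p - t) for p, t in zip(projected, target_aabb)) / 4.0
```

Many different rotated boxes project onto the same axis-aligned box, so a loss of this kind cannot determine the angle by itself. That is consistent with the framework's reliance on a source dataset with rotated annotations to supply the task knowledge.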