Rotated bounding boxes drastically reduce the output ambiguity of elongated objects, making them superior to axis-aligned bounding boxes. Despite their effectiveness, rotated detectors are not widely employed. Annotating rotated bounding boxes is so laborious that many detection datasets provide only axis-aligned annotations instead. In this paper, we propose a framework that allows the model to predict precise rotated boxes while requiring only cheaper axis-aligned annotation of the target dataset. To achieve this, we leverage the fact that neural networks are capable of learning a richer representation of the target domain than what is utilized by the task. The under-utilized representation can be exploited to address a more detailed task. Our framework combines task knowledge from an out-of-domain source dataset with stronger annotation and domain knowledge from the target dataset with weaker annotation. A novel assignment process and projection loss enable co-training on the source and target datasets. As a result, the model is able to solve the more detailed task in the target domain without additional computational overhead during inference. We extensively evaluate the method on various target datasets, including a fresh-produce dataset, HRSC2016, and SSDD. Results show that the proposed method consistently performs on par with the fully supervised approach.
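The abstract does not spell out how the projection loss is computed. As a rough illustration of the general idea of supervising a rotated prediction with an axis-aligned label, the sketch below projects a rotated box onto its tightest enclosing axis-aligned box and measures an L1 distance to the weak annotation. The function names, the `(cx, cy, w, h, theta)` parameterization (angle in radians), and the choice of an L1 distance are all assumptions made for this example, not details taken from the paper.

```python
import math


def rotated_to_axis_aligned(cx, cy, w, h, theta):
    """Tightest axis-aligned box (x_min, y_min, x_max, y_max) enclosing a
    rotated box with center (cx, cy), size (w, h), and angle theta in radians.
    Hypothetical helper, for illustration only."""
    cos_t = abs(math.cos(theta))
    sin_t = abs(math.sin(theta))
    # Half-extents of the rotated rectangle along the x and y axes.
    half_w = 0.5 * (w * cos_t + h * sin_t)
    half_h = 0.5 * (w * sin_t + h * cos_t)
    return (cx - half_w, cy - half_h, cx + half_w, cy + half_h)


def projection_l1(pred_rotated, target_aabb):
    """Mean absolute difference between the projected prediction and an
    axis-aligned label. The L1 form is an assumption; the paper's actual
    projection loss may differ."""
    projected = rotated_to_axis_aligned(*pred_rotated)
    return sum(abs(p - t) for p, t in zip(projected, target_aabb)) / 4.0


if __name__ == "__main__":
    pred = (0.0, 0.0, 4.0, 2.0, math.pi / 6)  # rotated prediction
    label = (-2.0, -1.5, 2.0, 1.5)            # axis-aligned annotation
    print(rotated_to_axis_aligned(*pred))
    print(projection_l1(pred, label))
```

Note that many rotated boxes share the same axis-aligned projection, so a loss of this kind cannot determine the angle on its own. That is consistent with the abstract's point that the task knowledge comes from the source dataset with stronger annotation.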