Oriented object detection has developed rapidly in the past few years, but most existing methods assume that training and testing images follow the same statistical distribution, an assumption that rarely holds in practice. In this paper, we propose the task of domain generalized oriented object detection, which aims to explore how well oriented object detectors generalize to arbitrary unseen target domains. Learning domain generalized oriented object detectors is particularly challenging, because cross-domain style variation not only degrades the content representation but also leads to unreliable orientation predictions. To address these challenges, we propose a generalized oriented object detector (GOOD). GOOD first performs style hallucination using the emerging contrastive language-image pre-training (CLIP), and then applies two key components: rotation-aware content consistency learning (RAC) and style consistency learning (SEC). RAC allows the oriented object detector to learn stable orientation representations from style-diversified samples. SEC further stabilizes the generalization ability of the content representation across different image styles. Extensive experiments on multiple cross-domain settings show that GOOD achieves state-of-the-art performance. Source code will be publicly available.