Unsupervised domain adaptation (UDA) algorithms can markedly improve the performance of object detectors under domain shift, reducing the need for extensive labeling and retraining. Existing domain adaptive object detection algorithms primarily target two-stage detectors and tend to yield only minimal gains when applied directly to single-stage detectors such as YOLO. To enable the YOLO detector to benefit from UDA, we build a comprehensive domain adaptive architecture based on a teacher-student cooperative system. Within this framework, we propose uncertainty learning to handle the highly uncertain pseudo-labels generated by the teacher model, and leverage dynamic data augmentation to progressively adapt the teacher-student system to the target environment. To address the inability of single-stage detectors to align features at multiple stages, we adopt a unified visual contrastive learning paradigm that aligns instances at the backbone and the head respectively, steadily improving the robustness of the detector in cross-domain tasks. In summary, we present CLDA-YOLO, an unsupervised domain adaptive YOLO detector based on visual contrastive learning, which achieves highly competitive results across multiple domain adaptive datasets without any reduction in inference speed.
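The teacher-student cooperation with uncertainty-aware pseudo-labeling described above can be sketched as follows. This is a minimal illustration, not the paper's actual formulation: the EMA momentum, the confidence threshold, and the uncertainty proxy (distance of the score from a fully confident 0 or 1 prediction) are all hypothetical choices made for the sake of the example.

```python
def ema_update(teacher_w, student_w, momentum=0.99):
    """Mean-teacher style update: the teacher's weight is an
    exponential moving average of the student's weight."""
    return momentum * teacher_w + (1.0 - momentum) * student_w

def filter_pseudo_labels(boxes, scores, conf_thresh=0.5, unc_thresh=0.3):
    """Keep only teacher predictions that are both confident and
    low-uncertainty. The uncertainty proxy 1 - |2*score - 1| is an
    illustrative assumption: it is 0 for scores near 0 or 1 and
    peaks at 1 for a maximally ambiguous score of 0.5."""
    kept = []
    for box, score in zip(boxes, scores):
        uncertainty = 1.0 - abs(2.0 * score - 1.0)
        if score >= conf_thresh and uncertainty <= unc_thresh:
            kept.append((box, score))
    return kept
```

In a training loop, the student would be updated by gradient descent on the filtered pseudo-labels while the teacher is refreshed only through `ema_update`, so the teacher drifts slowly and provides stable targets.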