Robust autonomous driving requires agents to accurately identify unexpected areas in urban scenes. Two critical issues remain open: how to design a suitable metric for measuring anomalies, and how to properly generate training samples of anomaly data. Previous efforts usually resort to uncertainty estimation and sample synthesis borrowed from classification tasks, which ignore context information and sometimes require auxiliary datasets with fine-grained annotations. In contrast, in this paper we exploit the strongly context-dependent nature of the segmentation task and design an energy-guided self-supervised framework for anomaly segmentation, which optimizes an anomaly head by maximizing the likelihood of self-generated anomaly pixels. To this end, we design two estimators of anomaly likelihood: one is a simple task-agnostic binary estimator, and the other models anomaly likelihood as the residual of a task-oriented energy model. Based on the proposed estimators, we further equip our framework with a likelihood-guided mask refinement process that extracts informative anomaly pixels for model training. We conduct extensive experiments on the challenging Fishyscapes and Road Anomaly benchmarks, demonstrating that, without any auxiliary data or synthetic models, our method still achieves performance competitive with other state-of-the-art schemes.
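The abstract does not spell out the form of the energy model. As a rough illustration only, the sketch below assumes the free-energy score that is common in energy-based out-of-distribution detection, E = -log Σ_c exp(f_c), computed per pixel from the segmentation logits f_c. The function names, the toy logits, and the threshold are hypothetical, and the code covers only the scoring step, not the two estimators or the mask refinement described above.

```python
import math


def pixel_energy(logits):
    """Free energy -log(sum(exp(logit))) for one pixel's class logits."""
    m = max(logits)  # subtract the max for numerical stability
    return -(m + math.log(sum(math.exp(v - m) for v in logits)))


def energy_map(logit_map):
    """Apply pixel_energy to an H x W grid of per-class logit lists."""
    return [[pixel_energy(px) for px in row] for row in logit_map]


def anomaly_mask(logit_map, threshold):
    """Mark pixels whose energy exceeds the (assumed) threshold."""
    return [[e > threshold for e in row] for row in energy_map(logit_map)]


# Toy 1 x 2 "image" with 3 classes: a confident pixel and a flat one.
toy_logits = [[[10.0, 0.0, 0.0], [0.0, 0.0, 0.0]]]
mask = anomaly_mask(toy_logits, threshold=-5.0)
```

Under this assumed score, a pixel with one dominant logit gets low energy, while a pixel with flat logits gets higher energy and is flagged as an anomaly candidate. A real pipeline would operate on tensor logits from the segmentation network rather than nested lists.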