Most dense recognition approaches make a separate decision at each pixel. These approaches deliver competitive performance in standard closed-set setups. However, important applications in the wild typically require strong performance in the presence of outliers. We show that this demanding setup greatly benefits from mask-level predictions, even in the case of non-finetuned baseline models. Moreover, we propose an alternative formulation of dense recognition uncertainty that effectively reduces false positive responses at semantic borders. The proposed formulation yields a further improvement over a very strong baseline and sets a new state of the art in outlier-aware semantic segmentation, both with and without training on negative data. Our contributions also improve performance in a recent panoptic setup. In-depth experiments confirm that our approach succeeds due to implicit aggregation of pixel-level cues into mask-level predictions.