Interest has recently grown in the practical problem of learning multiple dense scene understanding tasks from partially annotated data, where each training sample is labeled for only a subset of the tasks. As state-of-the-art methods show, the missing task labels lead to low-quality, noisy predictions. To tackle this issue, we reformulate partially labeled multi-task dense prediction as a pixel-level denoising problem and propose a novel multi-task denoising diffusion framework, DiffusionMTL. It uses a joint diffusion and denoising paradigm to model a potential noise distribution in the task predictions or feature maps and to generate rectified outputs for the different tasks. To exploit multi-task consistency in denoising, we further introduce a Multi-Task Conditioning strategy, which implicitly uses the complementary nature of the tasks to help learn the unlabeled tasks and thereby improves denoising performance across tasks. Extensive quantitative and qualitative experiments demonstrate that the proposed multi-task denoising diffusion model significantly improves multi-task prediction maps and outperforms state-of-the-art methods on three challenging multi-task benchmarks under two different partial-labeling evaluation settings. The code is available at https://prismformore.github.io/diffusionmtl/.
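To make the idea of diffusing and denoising a prediction map concrete, the following is a minimal sketch and not the DiffusionMTL implementation. It assumes a standard DDPM-style forward process with a linear noise schedule, and it assumes that conditioning on other tasks can be illustrated by channel-wise concatenation; the function names, map shapes, and schedule values are all hypothetical.

```python
import numpy as np

def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=2e-2):
    # Assumed linear noise schedule (a common DDPM default, not taken from the paper).
    return np.linspace(beta_start, beta_end, num_steps)

def forward_diffuse(x0, t, alpha_bar, rng):
    # Standard DDPM forward step: x_t = sqrt(a_bar_t) * x0 + sqrt(1 - a_bar_t) * eps.
    noise = rng.standard_normal(x0.shape)
    xt = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * noise
    return xt, noise

def build_denoiser_input(xt, other_task_maps):
    # Hypothetical conditioning: stack the noisy map with other tasks' maps along channels.
    return np.concatenate([xt] + list(other_task_maps), axis=0)

rng = np.random.default_rng(0)
betas = linear_beta_schedule(1000)
alpha_bar = np.cumprod(1.0 - betas)

# Hypothetical initial predictions with shape (channels, height, width).
depth_pred = rng.random((1, 8, 8))
seg_pred = rng.random((5, 8, 8))

t = 500
xt, noise = forward_diffuse(depth_pred, t, alpha_bar, rng)
denoiser_in = build_denoiser_input(xt, [seg_pred])

# If a denoiser predicted the noise exactly, the clean map could be recovered in closed form.
x0_rec = (xt - np.sqrt(1.0 - alpha_bar[t]) * noise) / np.sqrt(alpha_bar[t])
```

In a trained model, a learned network would take `denoiser_in` and the timestep and predict the noise or the clean map; the closed-form recovery above only shows what a perfect prediction would yield.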