Object detection can be viewed as a reverse diffusion process over bounding boxes: a diffusion model starts from random bounding boxes and iteratively refines them over denoising steps, conditioned on the image. We propose a stochastic accumulator function that starts each run with fresh random bounding boxes and combines the slightly different predictions of the runs. We empirically verify that this improves detection performance. We then use the improved detections on unlabelled images as weighted pseudo-labels for semi-supervised learning. We evaluate the method on a challenging out-of-domain test set. The method brings significant improvements and is on par with human-selected pseudo-labels, while requiring no human involvement.
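The abstract does not spell out how the accumulator combines runs, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes a hypothetical callable `detect(image, init_boxes)` standing in for the diffusion-based detector, groups predictions from different runs by IoU, and uses the fraction of runs that agree on a box as its pseudo-label weight. The names `accumulate`, `random_box`, and the `iou_thresh` value are illustrative assumptions.

```python
import random
from typing import Any, Callable, List, Tuple

Box = Tuple[float, float, float, float]   # (x1, y1, x2, y2), normalized to [0, 1]
Detection = Tuple[Box, float]             # (box, confidence score)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def random_box(rng: random.Random) -> Box:
    """Draw one random box as the starting point for the denoising process."""
    x1, x2 = sorted((rng.random(), rng.random()))
    y1, y2 = sorted((rng.random(), rng.random()))
    return (x1, y1, x2, y2)


def merge(members: List[Tuple[int, Box, float]]) -> Box:
    """Score-weighted mean of the boxes in one cluster."""
    total = sum(score for _, _, score in members)
    return tuple(
        sum(box[i] * score for _, box, score in members) / total for i in range(4)
    )


def accumulate(
    detect: Callable[[Any, List[Box]], List[Detection]],  # hypothetical detector
    image: Any,
    runs: int = 5,
    num_boxes: int = 100,
    iou_thresh: float = 0.5,   # assumed matching threshold
    seed: int = 0,
) -> List[Tuple[Box, float]]:
    """Run the detector several times from random boxes and combine the results.

    Returns (merged_box, weight) pairs, where weight is the fraction of runs
    that produced a box in that cluster (an assumed weighting scheme).
    """
    rng = random.Random(seed)
    clusters: List[List[Tuple[int, Box, float]]] = []
    for run in range(runs):
        init = [random_box(rng) for _ in range(num_boxes)]
        for box, score in detect(image, init):
            for cluster in clusters:
                if iou(merge(cluster), box) >= iou_thresh:
                    cluster.append((run, box, score))
                    break
            else:
                # No existing cluster overlaps enough: start a new one.
                clusters.append([(run, box, score)])
    merged = [
        (merge(c), len({run for run, _, _ in c}) / runs) for c in clusters
    ]
    return sorted(merged, key=lambda d: d[1], reverse=True)
```

Under this reading, a box found in every run gets weight 1.0, while a box that appears in only one of several runs gets a low weight and contributes little when the result is used as a weighted pseudo-label.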