We propose a method that augments a simulated dataset using diffusion models to improve pedestrian detection performance on real-world data. The high cost of collecting and annotating data in the real world has motivated the use of simulation platforms to create training datasets. While simulated data is inexpensive to collect and annotate, its distribution does not always closely match that of real-world data, a mismatch known as the sim2real gap. In this paper, we propose a novel method of synthetic data creation meant to close the sim2real gap for the challenging pedestrian detection task. Our method uses a diffusion-based architecture to learn a real-world distribution; once trained, the model is used to generate datasets. We mix this generated data with simulated data as a form of augmentation and show that training on the combination increases average precision by as much as 27.3% for pedestrian detection models on real-world data, compared with training on purely simulated data.