The rise of machine learning in recent years has benefited many research fields, including wildfire detection. Nevertheless, detecting small and rare objects remains a challenge. To address this problem, we present a dataset automaton that generates ground-truth-paired datasets using diffusion models. Specifically, we introduce a mask-guided diffusion framework that fuses wildfire into existing images while allowing the position and size of the flame to be precisely controlled. In addition, to fill the gap left by the lack of wildfire image datasets for specific scenarios, we vary the background of the synthesized images by controlling both the text prompt and the input image. Furthermore, to mitigate the color tint problem or the well-known domain shift issue, we apply the CLIP model to filter the large generated dataset and preserve quality. Our proposed framework can thus generate a large dataset of high-quality images paired with ground truth, which addresses the need for annotated datasets in specific tasks.
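As an illustration of why mask guidance yields ground-truth-paired data, the following minimal sketch shows how the same mask that tells the diffusion model where to place the flame could also be converted into a detection label, and how generated samples might be filtered by a similarity score. This is not the authors' implementation. All function names (`rect_mask`, `mask_to_box`, `keep_sample`), the normalized center/size box format, and the threshold of 0.25 are assumptions made for this example, and the embeddings are stand-ins for vectors that a CLIP image or text encoder would produce. The abstract does not state which CLIP score is used for filtering.

```python
import math


def rect_mask(height, width, top, left, box_h, box_w):
    """Binary mask (list of rows) with 1s where the flame should be synthesized."""
    return [
        [1 if top <= r < top + box_h and left <= c < left + box_w else 0
         for c in range(width)]
        for r in range(height)
    ]


def mask_to_box(mask):
    """Derive a normalized (cx, cy, w, h) box from the nonzero region of a mask.

    Returns None for an empty mask (no flame, hence no label).
    """
    height, width = len(mask), len(mask[0])
    rows = [r for r, row in enumerate(mask) if any(row)]
    cols = [c for c in range(width) if any(row[c] for row in mask)]
    if not rows:
        return None
    y0, y1 = rows[0], rows[-1] + 1
    x0, x1 = cols[0], cols[-1] + 1
    return ((x0 + x1) / 2 / width, (y0 + y1) / 2 / height,
            (x1 - x0) / width, (y1 - y0) / height)


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def keep_sample(image_emb, text_emb, threshold=0.25):
    """Keep a generated image if its embedding is close enough to the prompt's.

    The threshold is a hypothetical value, not one reported in the paper.
    """
    return cosine(image_emb, text_emb) >= threshold


# Usage: a 40x10 pixel flame region in a 200x100 image.
mask = rect_mask(height=100, width=200, top=20, left=50, box_h=10, box_w=40)
label = mask_to_box(mask)
```

Because the label is computed from the mask rather than from the generated pixels, the annotation is available before the image is synthesized, which is what makes the position and size controllable. In a real pipeline the mask would be passed to an inpainting-style diffusion model, and the embeddings would come from an actual CLIP model.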