Instance segmentation datasets play a crucial role in training accurate and robust computer vision models. However, obtaining the accurate mask annotations needed for high-quality segmentation datasets is costly and labor-intensive. In this work, we show how this issue can be mitigated by starting with a small annotated instance segmentation dataset and augmenting it to obtain a sizeable annotated dataset. We achieve this by creating variations of the available annotated object instances in a way that preserves the provided mask annotations, which yields new image-mask pairs that can be added to the set of annotated images. Specifically, we generate new images using a diffusion-based inpainting model that fills in the masked area with an object of the desired class, guiding the diffusion through the object outline. We show that the object outline provides a simple yet reliable and convenient training-free guidance signal for the underlying inpainting model. This signal is often sufficient to fill in the mask with an object of the correct class without further text guidance, and it preserves the correspondence between generated images and mask annotations with high precision. Our experimental results show that our method generates realistic variations of object instances, preserving their shape characteristics while introducing diversity within the augmented area. We also show that the proposed method can naturally be combined with text guidance and other image augmentation techniques.
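As an illustration of the augmentation loop described above, the following is a minimal sketch and not the authors' implementation. It assumes binary masks stored as NumPy arrays and derives the outline as the inner boundary of the mask under 4-connectivity, which is only one of several possible definitions. The names `mask_outline`, `augment_instance`, and `inpaint_fn` are hypothetical; `inpaint_fn` stands in for any outline-guided diffusion inpainting model and is not a real library call.

```python
import numpy as np


def mask_outline(mask: np.ndarray) -> np.ndarray:
    """Return the one-pixel-wide inner boundary of a binary mask (4-connectivity)."""
    mask = mask.astype(bool)
    # Pad with background so foreground pixels on the image border count as boundary.
    padded = np.pad(mask, 1, mode="constant", constant_values=False)
    # A pixel is interior if it and its four direct neighbours are all foreground.
    interior = (
        padded[1:-1, 1:-1]
        & padded[:-2, 1:-1]   # neighbour above
        & padded[2:, 1:-1]    # neighbour below
        & padded[1:-1, :-2]   # neighbour to the left
        & padded[1:-1, 2:]    # neighbour to the right
    )
    return mask & ~interior


def augment_instance(image: np.ndarray, mask: np.ndarray, inpaint_fn):
    """Create a new image-mask pair from one annotated instance.

    `inpaint_fn(image, mask, outline)` is a hypothetical placeholder for an
    outline-guided inpainting model; it must return an array shaped like `image`.
    """
    mask = mask.astype(bool)
    outline = mask_outline(mask)
    generated = inpaint_fn(image, mask, outline)
    # Keep the original pixels outside the mask so the annotation stays valid.
    keep = mask[..., None] if image.ndim == 3 else mask
    new_image = np.where(keep, generated, image)
    # The mask is reused unchanged as the annotation of the new image.
    return new_image, mask


if __name__ == "__main__":
    # Toy example: a 3x3 square object inside a 5x5 RGB image.
    image = np.ones((5, 5, 3), dtype=np.float32)
    mask = np.zeros((5, 5), dtype=bool)
    mask[1:4, 1:4] = True

    # Dummy stand-in for the inpainting model: paints everything black.
    def dummy_inpaint(img, m, outline):
        return np.zeros_like(img)

    new_image, new_mask = augment_instance(image, mask, dummy_inpaint)
    print(mask_outline(mask).astype(int))
```

In this sketch the original mask is returned unchanged, which mirrors the idea that the generated object should respect the annotated shape. Whether a real inpainting model actually does so depends on how strongly the outline constrains generation.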