We present YOEO, an approach for object erasure. Recent diffusion-based methods struggle to erase target objects without generating unexpected content within the masked regions, owing to a lack of sufficient paired training data and of explicit constraints on content generation. In contrast, our method produces high-quality object erasure results that are free of unwanted objects or artifacts while faithfully preserving contextual coherence with the surrounding content. We achieve this by training an object erasure diffusion model on unpaired data consisting solely of large-scale real-world images, under the supervision of a sundries detector and a context coherence loss, both built upon an entity segmentation model. To enable more efficient training and inference, we employ a diffusion distillation strategy to train a few-step erasure diffusion model. Extensive experiments show that our method outperforms state-of-the-art object erasure methods. Code will be available at https://zyxunh.github.io/YOEO-ProjectPage/.