Erase inpainting, or object removal, aims to precisely remove target objects within masked regions while keeping the result consistent with the surrounding content. Although diffusion-based methods have made significant strides in image inpainting, the emergence of unexpected objects or artifacts remains a challenge. We argue that the inexact diffusion pathways established by the standard optimization paradigm limit the effectiveness of object removal. To address this, we propose Erase Diffusion (EraDiff), which aims to unlock the potential of standard diffusion for object removal. In contrast to standard diffusion, EraDiff adapts both the optimization paradigm and the network to improve the coherence of the erasure results and the thoroughness of object elimination. We first introduce a Chain-Rectifying Optimization (CRO) paradigm, a diffusion process specifically designed to align with the objective of erasure. This paradigm establishes new diffusion transition pathways that simulate the gradual elimination of objects during optimization, allowing the model to capture the intent of object removal accurately. Furthermore, to mitigate deviations caused by artifacts along the sampling pathways, we develop a simple yet effective Self-Rectifying Attention (SRA) mechanism. SRA calibrates the sampling pathways by altering self-attention activation, allowing the model to bypass artifacts while further enhancing the coherence of the generated content. With this design, EraDiff achieves state-of-the-art performance on the OpenImages V5 dataset and demonstrates significant superiority in real-world scenarios.