Image inpainting, the process of restoring corrupted images, has advanced significantly with the advent of diffusion models (DMs). Even so, current DM adaptations for inpainting, which either modify the sampling strategy or develop inpainting-specific DMs, frequently suffer from semantic inconsistencies and reduced image quality. To address these challenges, our work introduces a novel paradigm: dividing the masked image features and the noisy latent into separate branches. This division dramatically reduces the model's learning load and facilitates a nuanced, hierarchical incorporation of essential masked image information. We present BrushNet, a novel plug-and-play dual-branch model engineered to embed pixel-level masked image features into any pre-trained DM, guaranteeing coherent and enhanced image inpainting outcomes. Additionally, we introduce BrushData and BrushBench to facilitate segmentation-based inpainting training and performance assessment. Our extensive experimental analysis demonstrates BrushNet's superior performance over existing models across seven key metrics, including image quality, mask region preservation, and textual coherence.
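The dual-branch idea in the abstract can be illustrated with a toy sketch. This is not the BrushNet implementation: the function names, the scalar "layers", the element-wise fusion of the branch input, the mask convention (1 marks the region to inpaint), and the per-layer injection scales are all assumptions made for illustration. It only shows the general pattern of a frozen base branch that sees the noisy latent, plus a separate branch that carries masked image information and adds its features to the base branch layer by layer.

```python
from typing import Callable, List

Vector = List[float]
Layer = Callable[[Vector], Vector]


def scale_layer(weight: float) -> Layer:
    """Hypothetical stand-in for one network layer: scale every feature."""
    return lambda x: [weight * v for v in x]


def dual_branch_forward(
    noisy_latent: Vector,
    masked_latent: Vector,
    mask: Vector,
    base_layers: List[Layer],
    branch_layers: List[Layer],
    inject_scales: List[float],
) -> Vector:
    """Run a toy dual-branch forward pass.

    Assumption: mask value 1.0 marks a position to inpaint, 0.0 a known position.
    """
    # Base branch: the pre-trained model, fed only the noisy latent.
    h_base = list(noisy_latent)
    # Extra branch input (toy fusion, an assumption): known positions take the
    # masked image latent, positions to inpaint take the noisy latent.
    h_branch = [
        k * n + (1.0 - k) * m
        for n, m, k in zip(noisy_latent, masked_latent, mask)
    ]
    # Hierarchical injection: after every layer, the branch features are added
    # to the base features, weighted by that layer's injection scale.
    for base, branch, scale in zip(base_layers, branch_layers, inject_scales):
        h_branch = branch(h_branch)
        h_base = [b + scale * f for b, f in zip(base(h_base), h_branch)]
    return h_base


base = [scale_layer(2.0), scale_layer(0.5)]
branch = [scale_layer(1.0), scale_layer(1.0)]
noisy, known, mask = [1.0, 2.0], [3.0, 0.0], [0.0, 1.0]

# With all injection scales at zero, the base model's output is unchanged.
unchanged = dual_branch_forward(noisy, known, mask, base, branch, [0.0, 0.0])
# With non-zero scales, masked image features steer every layer.
guided = dual_branch_forward(noisy, known, mask, base, branch, [1.0, 1.0])
```

In this sketch, setting every injection scale to zero recovers the base model exactly, which is one simple way to read "plug-and-play": the extra branch adds information without altering the pre-trained weights.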