Effective editing of personal content plays a pivotal role in enabling individuals to express their creativity, weave captivating narratives within their visual stories, and elevate the overall quality and impact of their visual content. Therefore, in this work, we introduce SwapAnything, a novel framework that can swap any object in an image with a personalized concept given by a reference, while keeping the context unchanged. Compared with existing methods for personalized subject swapping, SwapAnything has three unique advantages: (1) precise control over arbitrary objects and parts rather than only the main subject, (2) more faithful preservation of context pixels, and (3) better adaptation of the personalized concept to the image. First, we propose targeted variable swapping, which applies region control over latent feature maps and swaps masked variables for faithful context preservation and initial semantic concept swapping. Then, we introduce appearance adaptation to seamlessly adapt the semantic concept into the original image in terms of target location, shape, style, and content during the image generation process. Extensive results on both human and automatic evaluation demonstrate significant improvements of our approach over baseline methods on personalized swapping. Furthermore, SwapAnything demonstrates precise and faithful swapping across single-object, multi-object, partial-object, and cross-domain swapping tasks. SwapAnything also achieves strong performance on text-based swapping and on tasks beyond swapping, such as object insertion.
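The core idea behind targeted variable swapping can be sketched as a masked blend of latent variables: inside the object mask, latents come from the reference (the personalized concept); outside it, the source latents are kept untouched so context pixels are preserved. The following is a minimal illustrative sketch, not the paper's actual implementation; the function name and array shapes are assumptions.

```python
import numpy as np

def masked_variable_swap(latent_src, latent_ref, mask):
    """Hypothetical sketch of targeted variable swapping.

    latent_src: latent feature map of the source image, shape (C, H, W)
    latent_ref: latent feature map carrying the personalized concept, same shape
    mask:       binary region mask, shape (H, W); 1 = swap region, 0 = context

    Inside the mask, variables are taken from the reference latents;
    outside it, source latents are kept, preserving context exactly.
    """
    m = mask.astype(latent_src.dtype)[None, :, :]  # broadcast over channels
    return m * latent_ref + (1.0 - m) * latent_src

# Tiny demo: swap the left half of a 1x2x4 latent map.
src = np.zeros((1, 2, 4))
ref = np.ones((1, 2, 4))
mask = np.zeros((2, 4))
mask[:, :2] = 1  # swap region = left two columns
out = masked_variable_swap(src, ref, mask)
```

In this toy example, `out` equals 1 wherever the mask is set and 0 elsewhere, i.e. the context region is bit-identical to the source latents.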