SwapAnything: Enabling Arbitrary Object Swapping in Personalized Visual Editing - 专知论文

会员服务 ·

0

查准率/准确率 · 控制器 · Pivotal（公司） · Elevate · Better ·

2024 年 10 月 3 日

SwapAnything: Enabling Arbitrary Object Swapping in Personalized Visual Editing

翻译：SwapAnything：实现个性化视觉编辑中的任意对象替换

Jing Gu,Nanxuan Zhao,Wei Xiong,Qing Liu,Zhifei Zhang,He Zhang,Jianming Zhang,HyunJoon Jung,Yilin Wang,Xin Eric Wang

from arxiv, ECCV 2024, 23 pages, 14 figures, 3 tables

Effective editing of personal content holds a pivotal role in enabling individuals to express their creativity, weaving captivating narratives within their visual stories, and elevate the overall quality and impact of their visual content. Therefore, in this work, we introduce SwapAnything, a novel framework that can swap any objects in an image with personalized concepts given by the reference, while keeping the context unchanged. Compared with existing methods for personalized subject swapping, SwapAnything has three unique advantages: (1) precise control of arbitrary objects and parts rather than the main subject, (2) more faithful preservation of context pixels, (3) better adaptation of the personalized concept to the image. First, we propose targeted variable swapping to apply region control over latent feature maps and swap masked variables for faithful context preservation and initial semantic concept swapping. Then, we introduce appearance adaptation, to seamlessly adapt the semantic concept into the original image in terms of target location, shape, style, and content during the image generation process. Extensive results on both human and automatic evaluation demonstrate significant improvements of our approach over baseline methods on personalized swapping. Furthermore, SwapAnything shows its precise and faithful swapping abilities across single object, multiple objects, partial object, and cross-domain swapping tasks. SwapAnything also achieves great performance on text-based swapping and tasks beyond swapping such as object insertion.

翻译：有效编辑个人内容在激发个体创造力、编织视觉叙事中的引人入胜情节以及提升视觉内容整体质量与影响力方面具有关键作用。为此，本文提出SwapAnything——一种新颖的框架，能够在保持图像上下文不变的前提下，将图像中的任意对象替换为参考图像所指定的个性化概念。相较于现有的个性化主体替换方法，SwapAnything具备三大独特优势：(1) 可精确控制任意对象及局部区域而非仅限于主体目标，(2) 更忠实地保留上下文像素信息，(3) 使个性化概念更适配目标图像。首先，我们提出针对性变量交换技术，通过对隐特征图实施区域控制并交换掩码变量，实现忠实上下文保持与初始语义概念置换。随后引入外观自适应机制，在图像生成过程中依据目标位置、形状、风格与内容，将语义概念无缝适配至原始图像。大量人工评估与自动评估结果表明，本方法在个性化替换任务上较基线模型有显著提升。此外，SwapAnything在单对象替换、多对象替换、局部对象替换及跨域替换任务中均展现出精准且忠实的替换能力。该框架在基于文本的替换任务以及对象插入等非替换任务中也表现出优异性能。

0

相关内容

查准率/准确率

查准率/准确率

O’Reilly报告：知识图谱崛起——面向现代数据集成和数据结构体系，“The Rise of the Knowledge Graph——Toward Modern Data Integration and the Data Fabric Architecture”

O’Reilly报告：知识图谱崛起——面向现代数据集成和数据结构体系，“The Rise of the Knowledge Graph——Toward Modern Data Integration and the Data Fabric Architecture”

专知会员服务

49+阅读 · 2022年2月18日

Linux导论，Introduction to Linux，96页ppt

Linux导论，Introduction to Linux，96页ppt

专知会员服务

82+阅读 · 2020年7月26日

FlowQA: Grasping Flow in History for Conversational Machine Comprehension

FlowQA: Grasping Flow in History for Conversational Machine Comprehension

专知会员服务

35+阅读 · 2019年10月18日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

37+阅读 · 2019年10月17日

Stabilizing Transformers for Reinforcement Learning

Stabilizing Transformers for Reinforcement Learning

专知会员服务

61+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

60+阅读 · 2019年10月17日

《DeepGCNs: Making GCNs Go as Deep as CNNs》

《DeepGCNs: Making GCNs Go as Deep as CNNs》

专知会员服务

32+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

164+阅读 · 2019年10月12日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

44+阅读 · 2019年1月3日

meta learning 17年：MAML SNAIL

meta learning 17年：MAML SNAIL

CreateAMind

11+阅读 · 2019年1月2日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

STRCF for Visual Object Tracking

STRCF for Visual Object Tracking

统计学习与视觉计算组

15+阅读 · 2018年5月29日

Focal Loss for Dense Object Detection

Focal Loss for Dense Object Detection

统计学习与视觉计算组

12+阅读 · 2018年3月15日

IJCAI | Cascade Dynamics Modeling with Attention-based RNN

IJCAI | Cascade Dynamics Modeling with Attention-based RNN

KingsGarden

13+阅读 · 2017年7月16日

From Softmax to Sparsemax-ICML16（1）

From Softmax to Sparsemax-ICML16（1）

KingsGarden

74+阅读 · 2016年11月26日

城市“建成环境——空间行为”的多尺度影响关系与机理研究

国家自然科学基金

13+阅读 · 2017年12月31日

“Fishes-in-net” 酵母孢子微胶囊式近平滑假丝酵母SCRII酶有机相高效手性合成机制研究

国家自然科学基金

3+阅读 · 2016年12月31日

Musielak-Orlicz-Sobolev 空间中的迹嵌入及其应用

国家自然科学基金

2+阅读 · 2015年12月31日

DMB信号水汽探测方法若干问题研究

国家自然科学基金

3+阅读 · 2015年12月31日

太阳活动11年周期影响春季NAO与东亚夏季风关系的过程及机理研究

国家自然科学基金

0+阅读 · 2015年12月31日

Schr？dinger-Poisson方程守恒DDG方法研究

国家自然科学基金

2+阅读 · 2015年12月31日

关于 Finsler 流形上调和映射与 Laplacian 的若干问题研究

国家自然科学基金

1+阅读 · 2014年12月31日

动态Gr？bner 基与GVW算法

国家自然科学基金

0+阅读 · 2014年12月31日

“杰文斯”悖论、能效政策改进与“双控目标”分解

国家自然科学基金

0+阅读 · 2014年12月31日

海量Web用户生成内容物化关键技术

国家自然科学基金

2+阅读 · 2014年12月31日

GaussianAnything: Interactive Point Cloud Latent Diffusion for 3D Generation

Arxiv

0+阅读 · 2024年11月12日

SimBase: A Simple Baseline for Temporal Video Grounding

Arxiv

0+阅读 · 2024年11月12日

LEO: Generative Latent Image Animator for Human Video Synthesis

Arxiv

0+阅读 · 2024年11月12日

GraphRPM: Risk Pattern Mining on Industrial Large Attributed Graphs

Arxiv

0+阅读 · 2024年11月11日

DEIO: Deep Event Inertial Odometry

Arxiv

0+阅读 · 2024年11月9日

SiriusBI: Building End-to-End Business Intelligence Enhanced by Large Language Models

Arxiv

0+阅读 · 2024年11月9日

EUREKHA: Enhancing User Representation for Key Hackers Identification in Underground Forums

Arxiv

0+阅读 · 2024年11月8日

Video Understanding with Large Language Models: A Survey

Arxiv

13+阅读 · 2023年12月29日

Frido: Feature Pyramid Diffusion for Complex Scene Image Synthesis

Arxiv

11+阅读 · 2022年12月1日

Transformers in Medical Imaging: A Survey

Arxiv

15+阅读 · 2022年1月24日

VIP会员

文章信息

相关主题

查准率/准确率

Pivotal（公司）

最新内容

“史诗怒火”行动：现代多域作战的重要节点

“史诗怒火”行动：现代多域作战的重要节点

专知会员服务

5+阅读 · 今天5:05

《下一代无线网络中的多无人机通信资源管理》

《下一代无线网络中的多无人机通信资源管理》

专知会员服务

4+阅读 · 今天5:00

《高分辨率模拟下的聚合战斗建模：以“会战交锋”场景为例》

《高分辨率模拟下的聚合战斗建模：以“会战交锋”场景为例》

专知会员服务

5+阅读 · 今天4:52

《人机协同在安全关键型操作决策中的应用》120页

《人机协同在安全关键型操作决策中的应用》120页

专知会员服务

3+阅读 · 今天4:43

网络防御与空中力量网络防护：21世纪空中力量历史与理论的启示

网络防御与空中力量网络防护：21世纪空中力量历史与理论的启示

专知会员服务

3+阅读 · 今天1:47

综述 | Memory for Large Language Models：大模型记忆机制全景

综述 | Memory for Large Language Models：大模型记忆机制全景

专知会员服务

6+阅读 · 7月29日

博士论文 | Riemannian Deep Learning：模块、网络与几何

博士论文 | Riemannian Deep Learning：模块、网络与几何

专知会员服务

3+阅读 · 7月29日

《越野作战环境下路径规划的多准则整数规划模型》

《越野作战环境下路径规划的多准则整数规划模型》

专知会员服务

9+阅读 · 7月29日

人工智能大语言模型引擎如何重塑全球冲突信息环境最新50页

人工智能大语言模型引擎如何重塑全球冲突信息环境最新50页

专知会员服务

7+阅读 · 7月29日

《防空系统对自主武器系统辩论中“有意义的人类控制”的启示》70页报告

《防空系统对自主武器系统辩论中“有意义的人类控制”的启示》70页报告

专知会员服务

6+阅读 · 7月29日

“对标ChatGPT”：乌军研发Marichka AI系统用于战场筹划

“对标ChatGPT”：乌军研发Marichka AI系统用于战场筹划

专知会员服务

10+阅读 · 7月29日

《同步多无人机系统中的故障与通信》

《同步多无人机系统中的故障与通信》

专知会员服务

4+阅读 · 7月29日

论文解读 | 医学图像修复中的扩散模型：挑战、分类与未来方向

论文解读 | 医学图像修复中的扩散模型：挑战、分类与未来方向

专知会员服务

5+阅读 · 7月28日

博士论文 | 从算法到基础模型：强化学习的统一视角

博士论文 | 从算法到基础模型：强化学习的统一视角

专知会员服务

11+阅读 · 7月28日

面向国防作战的最佳自主与蜂群无人机技术

面向国防作战的最佳自主与蜂群无人机技术

专知会员服务

7+阅读 · 7月28日

相关VIP内容

O’Reilly报告：知识图谱崛起——面向现代数据集成和数据结构体系，“The Rise of the Knowledge Graph——Toward Modern Data Integration and the Data Fabric Architecture”

O’Reilly报告：知识图谱崛起——面向现代数据集成和数据结构体系，“The Rise of the Knowledge Graph——Toward Modern Data Integration and the Data Fabric Architecture”

专知会员服务

49+阅读 · 2022年2月18日

Linux导论，Introduction to Linux，96页ppt

Linux导论，Introduction to Linux，96页ppt

专知会员服务

82+阅读 · 2020年7月26日

FlowQA: Grasping Flow in History for Conversational Machine Comprehension

FlowQA: Grasping Flow in History for Conversational Machine Comprehension

专知会员服务

35+阅读 · 2019年10月18日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

37+阅读 · 2019年10月17日

Stabilizing Transformers for Reinforcement Learning

Stabilizing Transformers for Reinforcement Learning

专知会员服务

61+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

60+阅读 · 2019年10月17日

《DeepGCNs: Making GCNs Go as Deep as CNNs》

《DeepGCNs: Making GCNs Go as Deep as CNNs》

专知会员服务

32+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

164+阅读 · 2019年10月12日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

《下一代无线网络中的多无人机通信资源管理》

《人机协同在安全关键型操作决策中的应用》120页

“史诗怒火”行动：现代多域作战的重要节点

《高分辨率模拟下的聚合战斗建模：以“会战交锋”场景为例》

相关资讯

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

44+阅读 · 2019年1月3日

meta learning 17年：MAML SNAIL

meta learning 17年：MAML SNAIL

CreateAMind

11+阅读 · 2019年1月2日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

STRCF for Visual Object Tracking

STRCF for Visual Object Tracking

统计学习与视觉计算组

15+阅读 · 2018年5月29日

Focal Loss for Dense Object Detection

Focal Loss for Dense Object Detection

统计学习与视觉计算组

12+阅读 · 2018年3月15日

IJCAI | Cascade Dynamics Modeling with Attention-based RNN

IJCAI | Cascade Dynamics Modeling with Attention-based RNN

KingsGarden

13+阅读 · 2017年7月16日

From Softmax to Sparsemax-ICML16（1）

From Softmax to Sparsemax-ICML16（1）

KingsGarden

74+阅读 · 2016年11月26日

相关论文

GaussianAnything: Interactive Point Cloud Latent Diffusion for 3D Generation

Arxiv

0+阅读 · 2024年11月12日

SimBase: A Simple Baseline for Temporal Video Grounding

Arxiv

0+阅读 · 2024年11月12日

LEO: Generative Latent Image Animator for Human Video Synthesis

Arxiv

0+阅读 · 2024年11月12日

GraphRPM: Risk Pattern Mining on Industrial Large Attributed Graphs

Arxiv

0+阅读 · 2024年11月11日

DEIO: Deep Event Inertial Odometry

Arxiv

0+阅读 · 2024年11月9日

SiriusBI: Building End-to-End Business Intelligence Enhanced by Large Language Models

Arxiv

0+阅读 · 2024年11月9日

EUREKHA: Enhancing User Representation for Key Hackers Identification in Underground Forums

Arxiv

0+阅读 · 2024年11月8日

Video Understanding with Large Language Models: A Survey

Arxiv

13+阅读 · 2023年12月29日

Frido: Feature Pyramid Diffusion for Complex Scene Image Synthesis

Arxiv

11+阅读 · 2022年12月1日

Transformers in Medical Imaging: A Survey

Arxiv

15+阅读 · 2022年1月24日

相关基金

城市“建成环境——空间行为”的多尺度影响关系与机理研究

国家自然科学基金

13+阅读 · 2017年12月31日

“Fishes-in-net” 酵母孢子微胶囊式近平滑假丝酵母SCRII酶有机相高效手性合成机制研究

国家自然科学基金

3+阅读 · 2016年12月31日

Musielak-Orlicz-Sobolev 空间中的迹嵌入及其应用

国家自然科学基金

2+阅读 · 2015年12月31日

DMB信号水汽探测方法若干问题研究

国家自然科学基金

3+阅读 · 2015年12月31日

太阳活动11年周期影响春季NAO与东亚夏季风关系的过程及机理研究

国家自然科学基金

0+阅读 · 2015年12月31日

Schr？dinger-Poisson方程守恒DDG方法研究

国家自然科学基金

2+阅读 · 2015年12月31日

关于 Finsler 流形上调和映射与 Laplacian 的若干问题研究

国家自然科学基金

1+阅读 · 2014年12月31日

动态Gr？bner 基与GVW算法

国家自然科学基金

0+阅读 · 2014年12月31日

“杰文斯”悖论、能效政策改进与“双控目标”分解

国家自然科学基金

0+阅读 · 2014年12月31日

海量Web用户生成内容物化关键技术

国家自然科学基金

2+阅读 · 2014年12月31日

微信扫码咨询专知VIP会员