Affective Image Manipulation (AIM) aims to alter an image's emotional impact by adjusting multiple visual elements to evoke specific feelings. Effective AIM is inherently complex, necessitating a collaborative approach that involves identifying semantic cues within source images, manipulating these elements to elicit desired emotional responses, and verifying that the combined adjustments successfully evoke the target emotion. To address these challenges, we introduce EmoAgent, the first multi-agent collaboration framework for AIM. By emulating the cognitive behaviors of a human painter, EmoAgent incorporates three specialized agents responsible for planning, editing, and critical evaluation. Furthermore, we develop an emotion-factor knowledge retriever, a decision-making tree space, and a tool library to enhance EmoAgent's effectiveness in handling AIM. Experiments demonstrate that the proposed multi-agent framework outperforms existing methods, offering more reasonable and effective emotional expression.
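The plan, edit, and evaluate cycle described above can be illustrated with a toy loop. This is only a minimal sketch under strong assumptions, not EmoAgent's actual implementation: an image is reduced to three scalar attributes, the emotion-factor knowledge retriever is replaced by a hard-coded lookup table, the tool library holds simple attribute adjusters, and the decision-making tree space is not modeled at all. Every name here (`EMOTION_FACTORS`, `TOOLS`, `plan`, `edit`, `critique`, `run`) and every numeric value is hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# Toy stand-in for an image: a few scalar visual attributes in [0, 1].
Image = Dict[str, float]

# Hypothetical emotion-factor table; the values are invented for illustration.
EMOTION_FACTORS: Dict[str, Dict[str, float]] = {
    "contentment": {"brightness": 0.8, "saturation": 0.7, "warmth": 0.8},
    "sadness": {"brightness": 0.3, "saturation": 0.2, "warmth": 0.2},
}


def clamp(x: float) -> float:
    return max(0.0, min(1.0, x))


def make_tool(attr: str):
    """Build a tool that shifts one attribute and returns a new image."""
    def tool(img: Image, delta: float) -> Image:
        out = dict(img)  # copy so the input image is never mutated
        out[attr] = clamp(out[attr] + delta)
        return out
    return tool


# Hypothetical tool library: one adjuster per visual attribute.
TOOLS = {attr: make_tool(attr) for attr in ("brightness", "saturation", "warmth")}


@dataclass
class Step:
    attr: str
    delta: float


def plan(img: Image, emotion: str) -> List[Step]:
    """Planner: look up the emotion factors and propose one step per attribute."""
    targets = EMOTION_FACTORS[emotion]
    return [Step(a, t - img[a]) for a, t in targets.items()]


def edit(img: Image, steps: List[Step], strength: float = 0.5) -> Image:
    """Editor: apply each planned step at partial strength using the tool library."""
    for s in steps:
        img = TOOLS[s.attr](img, strength * s.delta)
    return img


def critique(img: Image, emotion: str) -> float:
    """Critic: score in [0, 1]; 1.0 means every attribute matches its target."""
    targets = EMOTION_FACTORS[emotion]
    err = sum(abs(t - img[a]) for a, t in targets.items()) / len(targets)
    return 1.0 - err


def run(img: Image, emotion: str, threshold: float = 0.95,
        max_rounds: int = 10) -> Tuple[Image, int]:
    """Repeat plan -> edit until the critic is satisfied or rounds run out."""
    rounds = 0
    while critique(img, emotion) < threshold and rounds < max_rounds:
        img = edit(img, plan(img, emotion))
        rounds += 1
    return img, rounds


if __name__ == "__main__":
    source = {"brightness": 0.5, "saturation": 0.5, "warmth": 0.5}
    result, rounds = run(source, "sadness")
    print(rounds, result, critique(result, "sadness"))
```

The point of the sketch is the separation of roles: the planner decides what to change, the editor only executes tools, and the critic alone decides whether another round is needed. In a real system each role would presumably be backed by a model rather than a lookup table.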