Making Images Real Again: A Comprehensive Survey on Deep Image Composition

As a common image editing operation, image composition aims to combine the foreground from one image with the background of another image, resulting in a composite image. However, many issues can make the composite image unrealistic. These issues can be summarized as inconsistency between foreground and background, which includes appearance inconsistency (e.g., incompatible illumination), geometry inconsistency (e.g., unreasonable size), and semantic inconsistency (e.g., mismatched semantic context). The image composition task can be decomposed into multiple sub-tasks, each of which targets one or more of these issues. Specifically, object placement aims to find a reasonable scale, location, and shape for the foreground. Image blending aims to address the unnatural boundary between foreground and background. Image harmonization aims to adjust the illumination statistics of the foreground. Shadow generation aims to generate a plausible shadow for the foreground. These sub-tasks can be executed sequentially or in parallel to obtain realistic composite images. To the best of our knowledge, there is no previous survey on image composition. In this paper, we conduct a comprehensive survey of the sub-tasks and the combinatorial task of image composition. For each one, we summarize the existing methods, available datasets, and common evaluation metrics. Datasets and code for image composition are summarized at https://github.com/bcmi/Awesome-Image-Composition. We have also contributed the first image composition toolbox, libcom (https://github.com/bcmi/libcom), which assembles 10+ image composition related functions (e.g., image blending, image harmonization, object placement, shadow generation, generative composition). The ultimate goal of this toolbox is to solve all problems related to image composition with a simple `import libcom`.
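
To make the decomposition into sub-tasks concrete, the following is a minimal NumPy sketch of a sequential pipeline: a fixed placement (scale and location supplied by the caller), a toy harmonization that matches per-channel mean and standard deviation, and plain alpha blending. It is not the libcom API and not any method from the survey; the function names `resize_nearest`, `match_statistics`, and `compose` are hypothetical, images are assumed to be float arrays in [0, 1], and shadow generation is omitted.

```python
import numpy as np

def resize_nearest(img, scale):
    """Nearest-neighbor resize; a stand-in for a real resampling routine."""
    h, w = img.shape[:2]
    new_h, new_w = max(1, round(h * scale)), max(1, round(w * scale))
    rows = np.minimum((np.arange(new_h) / scale).astype(int), h - 1)
    cols = np.minimum((np.arange(new_w) / scale).astype(int), w - 1)
    return img[rows][:, cols]

def match_statistics(fg, mask, bg):
    """Toy harmonization: move the masked foreground's per-channel
    mean/std to the background's global mean/std."""
    weights = mask[..., None]
    area = max(float(weights.sum()), 1e-6)
    fg_mean = (fg * weights).sum(axis=(0, 1)) / area
    fg_var = (((fg - fg_mean) ** 2) * weights).sum(axis=(0, 1)) / area
    fg_std = np.sqrt(fg_var) + 1e-6
    bg_mean = bg.mean(axis=(0, 1))
    bg_std = bg.std(axis=(0, 1)) + 1e-6
    return np.clip((fg - fg_mean) / fg_std * bg_std + bg_mean, 0.0, 1.0)

def compose(bg, fg, mask, top, left, scale=1.0):
    """Place, harmonize, and alpha-blend a foreground onto a background.

    bg: (H, W, 3) float array, fg: (h, w, 3) float array,
    mask: (h, w) float array in [0, 1] (soft edges give smoother blending).
    """
    fg = resize_nearest(fg, scale)        # placement: scale
    mask = resize_nearest(mask, scale)
    h, w = mask.shape
    if top < 0 or left < 0 or top + h > bg.shape[0] or left + w > bg.shape[1]:
        raise ValueError("foreground does not fit inside the background")
    fg = match_statistics(fg, mask, bg)   # harmonization (toy version)
    alpha = mask[..., None]
    region = bg[top:top + h, left:left + w]
    out = bg.copy()                       # placement: location
    out[top:top + h, left:left + w] = alpha * fg + (1.0 - alpha) * region
    return out
```

In this sketch the sub-tasks run strictly in sequence; learned methods for each step would replace the corresponding function while keeping the same overall flow.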
