Image composition inserts a foreground object into a background image to obtain a composite image. In this work, we focus on generating a plausible shadow for the inserted foreground object so that the composite image looks more realistic. To supplement the existing small-scale dataset, we create a large-scale dataset called RdSOBA using rendering techniques. We also design a two-stage network named DMASNet, which combines decomposed mask prediction with attentive shadow filling. In the first stage, we decompose shadow mask prediction into box prediction and shape prediction. In the second stage, we attend to reference background shadow pixels to fill the foreground shadow. Extensive experiments demonstrate that DMASNet achieves better visual effects and generalizes well to real composite images.
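To make the two stages concrete, the following is a minimal NumPy sketch of the data flow only, not the DMASNet implementation. It assumes the first stage has already produced a bounding box and a box-normalized shape mask, and it replaces the learned attention of the second stage with a hand-crafted similarity based on pixel position. The function names `paste_shape_into_box` and `attentive_fill`, the `(x0, y0, x1, y1)` box format, and the `temperature` parameter are all hypothetical choices made for illustration.

```python
import numpy as np


def paste_shape_into_box(shape, box, image_hw):
    """Stage 1 (toy): combine a predicted box and a box-normalized shape mask.

    `shape` is a 2D array in [0, 1] describing the shadow inside the box;
    `box` is (x0, y0, x1, y1) in pixels with x1/y1 exclusive (assumed format).
    """
    H, W = image_hw
    x0, y0, x1, y1 = box
    # Clip the box to the image; this squeezes the shape slightly if the
    # box sticks out, which is acceptable for a sketch.
    x0, x1 = int(np.clip(x0, 0, W)), int(np.clip(x1, 0, W))
    y0, y1 = int(np.clip(y0, 0, H)), int(np.clip(y1, 0, H))
    mask = np.zeros((H, W), dtype=np.float32)
    if x1 <= x0 or y1 <= y0:
        return mask
    bh, bw = y1 - y0, x1 - x0
    sh, sw = shape.shape
    # Nearest-neighbour resampling of the shape to the box size.
    rows = np.arange(bh) * sh // bh
    cols = np.arange(bw) * sw // bw
    mask[y0:y1, x0:x1] = shape[rows[:, None], cols[None, :]]
    return mask


def attentive_fill(image, fg_mask, bg_mask, temperature=0.1):
    """Stage 2 (toy): fill foreground shadow pixels by attending to
    background shadow pixels.

    Assumption: attention scores come from normalized pixel positions,
    standing in for the learned features a real network would use.
    """
    H, W, _ = image.shape
    fg_idx = np.argwhere(fg_mask > 0.5)  # query pixels (row, col)
    bg_idx = np.argwhere(bg_mask > 0.5)  # reference pixels (row, col)
    out = image.copy()
    if len(fg_idx) == 0 or len(bg_idx) == 0:
        return out
    scale = np.array([H, W], dtype=np.float64)
    q = fg_idx / scale
    k = bg_idx / scale
    # Squared distance between every query and every reference pixel.
    d2 = ((q[:, None, :] - k[None, :, :]) ** 2).sum(axis=-1)
    scores = -d2 / temperature
    scores -= scores.max(axis=1, keepdims=True)  # numerically stable softmax
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)
    values = image[bg_idx[:, 0], bg_idx[:, 1]]  # reference shadow colours
    out[fg_idx[:, 0], fg_idx[:, 1]] = weights @ values
    return out
```

Keeping the box and the shape separate means the coarse location and the fine outline can be supervised or inspected independently, which is the motivation the abstract gives for decomposing the mask prediction. In this toy version the filled pixels simply take a weighted average of background shadow colours, so the underlying texture is lost; a learned model would be expected to handle that more carefully.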