Scene Aware Person Image Generation through Global Contextual Conditioning

Person image generation is an intriguing yet challenging problem, and the task becomes even more difficult under constrained situations. In this work, we propose a novel pipeline to generate and insert contextually relevant person images into an existing scene while preserving the global semantics. More specifically, we aim to insert a person such that the location, pose, and scale of the inserted person blend in with the existing persons in the scene. Our method uses three individual networks in a sequential pipeline. First, we predict the potential location and the skeletal structure of the new person by conditioning a Wasserstein Generative Adversarial Network (WGAN) on the existing human skeletons present in the scene. Next, the predicted skeleton is refined through a shallow linear network to achieve higher structural accuracy in the generated image. Finally, the target image is generated from the refined skeleton using another generative network conditioned on a given image of the target person. In our experiments, we achieve high-resolution, photo-realistic generation results while preserving the general context of the scene. We conclude the paper with multiple qualitative and quantitative benchmarks on the results.
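The data flow of the three-stage pipeline can be sketched as follows. This is only an illustrative outline, not the authors' implementation: every function name, the keypoint representation, and the noise scale are assumptions, and each network is replaced by a trivial stand-in (joint averaging with noise in place of the WGAN, clamping in place of the linear refinement network, and a bounding-box record in place of the image generator).

```python
import random
from typing import Dict, List, Tuple

Keypoint = Tuple[float, float]  # (x, y) in pixel coordinates (assumed layout)
Skeleton = List[Keypoint]       # one entry per joint


def predict_skeleton(scene: List[Skeleton], rng: random.Random) -> Skeleton:
    """Stage 1 stand-in. The abstract uses a WGAN conditioned on the
    existing skeletons; here we average them joint by joint and add noise."""
    n_joints = len(scene[0])
    mean = [
        (
            sum(s[j][0] for s in scene) / len(scene),
            sum(s[j][1] for s in scene) / len(scene),
        )
        for j in range(n_joints)
    ]
    return [(x + rng.gauss(0, 5.0), y + rng.gauss(0, 5.0)) for x, y in mean]


def refine_skeleton(skeleton: Skeleton, width: int, height: int) -> Skeleton:
    """Stage 2 stand-in. The abstract uses a shallow linear network;
    here we only clamp each joint to the image bounds."""
    return [
        (min(max(x, 0.0), width - 1), min(max(y, 0.0), height - 1))
        for x, y in skeleton
    ]


def render_person(skeleton: Skeleton, appearance: str) -> Dict[str, object]:
    """Stage 3 stand-in. The abstract uses a generative network conditioned
    on an image of the target person; here we return a placement record."""
    xs = [x for x, _ in skeleton]
    ys = [y for _, y in skeleton]
    return {"appearance": appearance, "bbox": (min(xs), min(ys), max(xs), max(ys))}


def insert_person(scene: List[Skeleton], appearance: str,
                  width: int, height: int, seed: int = 0) -> Dict[str, object]:
    """Run the three stages in sequence, as described in the abstract."""
    rng = random.Random(seed)
    coarse = predict_skeleton(scene, rng)
    refined = refine_skeleton(coarse, width, height)
    return render_person(refined, appearance)


if __name__ == "__main__":
    # Two existing people, each with two joints (hypothetical toy input).
    scene = [[(10.0, 20.0), (12.0, 40.0)], [(100.0, 22.0), (102.0, 44.0)]]
    print(insert_person(scene, "target.jpg", width=128, height=64))
```

The point of the sketch is the sequencing: each stage consumes only the output of the previous one plus its own conditioning input (the scene skeletons for stage 1, the target person's image for stage 3).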

