In this paper, we present NeuralReshaper, a novel method for semantic reshaping of human bodies in single images using deep generative networks. To achieve globally coherent reshaping effects, our approach follows a fit-then-reshape pipeline, which first fits a parametric 3D human model to a source human image and then reshapes the fitted 3D model according to user-specified semantic attributes. Previous methods rely on image warping to transfer 3D reshaping effects to the entire image domain and thus often cause distortions in both the foreground and the background. In contrast, we use a generative adversarial network conditioned on the source image and a 2D warping field induced by the reshaped 3D model to achieve more realistic reshaping results. Specifically, we separately encode the foreground and background information in the source image using a two-headed UNet-like generator, and guide the information flow from the foreground branch to the background branch via feature-space warping. Furthermore, because no paired data exist (i.e., images of the same human body in varying shapes), we introduce a novel self-supervised strategy to train our network. Unlike previous methods, which often require manual effort to correct undesirable artifacts caused by incorrect body-to-image fitting, our method is fully automatic. Extensive experiments on both indoor and outdoor datasets demonstrate the superiority of our method over previous approaches.
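
To make the notion of warping features with a dense 2D warping field concrete, the following is a minimal NumPy sketch of backward warping with bilinear interpolation. It is an illustration only, not the NeuralReshaper generator: the function name `warp_features`, the `(dx, dy)` offset convention, and the border clamping are all assumptions made for this example, and the warping field here is a hand-made array rather than one induced by a reshaped 3D model.

```python
import numpy as np

def warp_features(feat, flow):
    """Backward-warp a feature map with a dense 2D warping field.

    feat: (C, H, W) feature map.
    flow: (2, H, W) per-pixel offsets (dx, dy). Output pixel (y, x) samples
          the input at (y + dy, x + dx) using bilinear interpolation.
    Both the name and the offset convention are assumptions of this sketch.
    """
    C, H, W = feat.shape
    ys, xs = np.meshgrid(np.arange(H), np.arange(W), indexing="ij")

    # Source sampling coordinates, clamped to the image border.
    sx = np.clip(xs + flow[0], 0, W - 1)
    sy = np.clip(ys + flow[1], 0, H - 1)

    # Integer corners surrounding each sampling location.
    x0 = np.floor(sx).astype(int)
    y0 = np.floor(sy).astype(int)
    x1 = np.minimum(x0 + 1, W - 1)
    y1 = np.minimum(y0 + 1, H - 1)

    # Fractional parts act as bilinear interpolation weights.
    wx = sx - x0
    wy = sy - y0

    top = feat[:, y0, x0] * (1 - wx) + feat[:, y0, x1] * wx
    bottom = feat[:, y1, x0] * (1 - wx) + feat[:, y1, x1] * wx
    return top * (1 - wy) + bottom * wy


if __name__ == "__main__":
    feat = np.arange(12, dtype=float).reshape(1, 3, 4)
    flow = np.zeros((2, 3, 4))
    flow[0] = 1.0  # every output pixel samples one pixel to its right
    print(warp_features(feat, flow)[0])
```

Inside a trainable network, the same operation would typically be carried out by a differentiable sampling layer (for example `torch.nn.functional.grid_sample` in PyTorch) so that gradients can flow through the warp.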