We present a new approach for synthesizing novel views of people in new poses. Our novel differentiable renderer enables the synthesis of highly realistic images from any viewpoint. Rather than operating over mesh-based structures, our renderer uses diffuse Gaussian primitives that directly represent the underlying skeletal structure of a human. Rendering these primitives results in a high-dimensional latent image, which is then transformed into an RGB image by a decoder network. This formulation gives rise to a fully differentiable framework that can be trained end-to-end. We demonstrate the effectiveness of our approach for image reconstruction on both the Human3.6M and Panoptic Studio datasets. We show how our approach can be used for motion transfer between individuals, novel view synthesis of individuals captured from just a single camera, synthesis of individuals from any virtual viewpoint, and re-rendering of people in novel poses. Code and video results are available at https://github.com/GuillaumeRochette/HumanViewSynthesis.
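To make the rendering step concrete, the following is a minimal sketch of how Gaussian primitives carrying feature vectors could be splatted into a high-dimensional latent image. It is not the implementation from the repository above: it assumes isotropic 2D Gaussians already projected to pixel coordinates, simple additive blending, and NumPy in place of an autodiff framework. The function name `render_latent_image` and all of its parameters are hypothetical. The decoder network that maps the latent image to RGB is omitted.

```python
import numpy as np


def render_latent_image(centers, sigmas, features, height, width):
    """Splat isotropic 2D Gaussians carrying feature vectors into a latent image.

    Simplified illustration only (assumed setup, not the paper's renderer).

    centers:  (N, 2) pixel coordinates (x, y) of the projected primitives
    sigmas:   (N,)   standard deviation of each Gaussian, in pixels
    features: (N, C) latent vector attached to each primitive
    returns:  (height, width, C) latent image
    """
    # Pixel grid holding the (x, y) coordinate of every pixel: (H, W, 2).
    ys, xs = np.mgrid[0:height, 0:width]
    grid = np.stack([xs, ys], axis=-1).astype(np.float64)

    # Offset of every pixel from every primitive centre: (N, H, W, 2).
    diff = grid[None] - centers[:, None, None, :]
    sq_dist = (diff ** 2).sum(axis=-1)

    # Unnormalised Gaussian weight of each primitive at each pixel: (N, H, W).
    weights = np.exp(-0.5 * sq_dist / sigmas[:, None, None] ** 2)

    # Additive blending: each pixel sums the weighted feature vectors.
    return np.einsum("nhw,nc->hwc", weights, features)


# Usage with made-up values: 17 primitives, 16 latent channels.
rng = np.random.default_rng(0)
centers = rng.uniform(0, 48, size=(17, 2))
sigmas = np.full(17, 3.0)
features = rng.normal(size=(17, 16))
latent = render_latent_image(centers, sigmas, features, height=64, width=48)
print(latent.shape)  # (64, 48, 16)
```

Every operation in this sketch is smooth in the centres, scales, and features, so the same computation written in an autodiff framework would let gradients flow from an image loss back to the pose, which is the property that end-to-end training relies on.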