We present PersonNeRF, a method that takes a collection of photos of a subject (e.g., Roger Federer) captured across multiple years with arbitrary body poses and appearances, and enables rendering the subject with arbitrary novel combinations of viewpoint, body pose, and appearance. PersonNeRF builds a customized neural volumetric 3D model of the subject that can render the entire space spanned by camera viewpoint, body pose, and appearance. A central challenge in this task is dealing with sparse observations: a given body pose is likely observed from only a single viewpoint with a single appearance, and a given appearance is observed under only a handful of different body poses. We address this issue by recovering a canonical T-pose neural volumetric representation of the subject that allows appearance to change across observations but uses a single pose-dependent motion field shared by all observations. We demonstrate that this approach, together with regularization that encourages smoothness in the recovered volumetric geometry, yields a model that renders compelling images from novel combinations of viewpoint, pose, and appearance from these challenging unstructured photo collections, outperforming prior work on free-viewpoint human rendering.
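To make the factorization described above concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes that appearance is encoded as one learned embedding per photo group and that the motion field depends only on a pose vector. Every name, dimension, and weight matrix here (`motion_field`, `canonical_field`, `EMBED_DIM`, `POSE_DIM`, the random linear maps standing in for trained networks) is hypothetical. It also assumes, for simplicity, that the appearance embedding conditions both density and color; the abstract does not specify this.

```python
import numpy as np

rng = np.random.default_rng(0)

N_APPEARANCES = 3  # hypothetical: number of distinct appearances in the collection
EMBED_DIM = 4      # hypothetical: size of each appearance embedding
POSE_DIM = 6       # hypothetical: size of the pose vector (stand-in for joint angles)

# One embedding per appearance; random placeholders for learned values.
appearance_embeddings = rng.normal(size=(N_APPEARANCES, EMBED_DIM))

# Random linear maps standing in for trained networks (assumption).
W_motion = rng.normal(scale=0.1, size=(3 + POSE_DIM, 3))
W_canon = rng.normal(size=(3 + EMBED_DIM, 4))


def motion_field(x_obs, pose):
    """Map observation-space points (N, 3) to canonical T-pose space.

    Takes only the pose as conditioning, so the same mapping is shared
    by every appearance.
    """
    pose_tiled = np.broadcast_to(pose, (x_obs.shape[0], POSE_DIM))
    offset = np.concatenate([x_obs, pose_tiled], axis=1) @ W_motion
    return x_obs + offset


def canonical_field(x_can, appearance_id):
    """Return (density, color) at canonical points for one appearance."""
    emb = np.broadcast_to(
        appearance_embeddings[appearance_id], (x_can.shape[0], EMBED_DIM)
    )
    out = np.concatenate([x_can, emb], axis=1) @ W_canon
    density = np.logaddexp(0.0, out[:, 0])       # softplus keeps density non-negative
    color = 1.0 / (1.0 + np.exp(-out[:, 1:]))    # sigmoid keeps RGB in [0, 1]
    return density, color


def query(x_obs, pose, appearance_id):
    """Warp points to canonical space, then evaluate the chosen appearance."""
    return canonical_field(motion_field(x_obs, pose), appearance_id)
```

Under these assumptions, a novel combination is just a new pairing of arguments: `query(points, pose_from_photo_a, appearance_id_from_photo_b)` reuses the shared warp for a pose that was never observed with that appearance. Volume rendering along camera rays and the smoothness regularization on geometry are omitted from the sketch.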