This paper introduces a novel pipeline that reconstructs, from a single image, the geometry of multiple interacting clothed people in a globally coherent scene space. The main challenge is occlusion: parts of a human body are hidden from a single view, either by other people or by the body itself, which leads to missing geometry and physical implausibility (e.g., penetration). We overcome this challenge with two human priors, one for complete 3D geometry and one for surface contacts. For the geometry prior, an encoder learns to regress the image of a person with missing body parts to latent vectors; a decoder decodes these vectors into 3D features of the associated geometry; and an implicit network combines these features with a surface normal map to reconstruct complete and detailed 3D humans. For the contact prior, we develop an image-space contact detector that outputs a probability distribution of surface contacts between people in 3D. We use these priors to globally refine the body poses, enabling penetration-free and accurate reconstruction of multiple interacting clothed people in the scene space. The results demonstrate that, compared to existing methods, our method produces reconstructions that are more complete, globally coherent, and physically plausible.
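The data flow of the geometry prior (encoder, decoder, implicit network) can be pictured with a toy sketch. Everything below is an assumption made for illustration and is not the paper's architecture: the function names, the layer sizes, the random untrained weights, the nearest-neighbor feature lookup, and the orthographic projection used to index the normal map are all hypothetical.

```python
import numpy as np


def encode(image, w_enc):
    """Hypothetical encoder: flatten an (H, W, 3) image of a partially
    visible person and map it to a latent vector."""
    return np.tanh(image.reshape(-1) @ w_enc)


def decode(latent, w_dec, grid, channels):
    """Hypothetical decoder: map the latent vector to a
    (grid, grid, grid, channels) volume of 3D features."""
    return np.tanh(latent @ w_dec).reshape(grid, grid, grid, channels)


def implicit_occupancy(points, volume, normal_map, w_imp):
    """Hypothetical implicit network: combine 3D features with the surface
    normal map and return one occupancy value in (0, 1) per query point.

    points: (N, 3) array with coordinates in [0, 1); x and y also index the
    image plane (orthographic projection is assumed here).
    """
    g = volume.shape[0]
    h, w = normal_map.shape[:2]
    # Nearest-neighbor lookup of the 3D feature at each query point.
    idx = np.clip((points * g).astype(int), 0, g - 1)
    feats = volume[idx[:, 0], idx[:, 1], idx[:, 2]]
    # Look up the surface normal at the pixel each point projects to.
    px = np.clip((points[:, 0] * w).astype(int), 0, w - 1)
    py = np.clip((points[:, 1] * h).astype(int), 0, h - 1)
    normals = normal_map[py, px]
    # Concatenate 3D features, normals, and depth, then apply a linear
    # layer followed by a sigmoid.
    x = np.concatenate([feats, normals, points[:, 2:3]], axis=1)
    return 1.0 / (1.0 + np.exp(-(x @ w_imp)))


def demo(num_points=5, seed=0):
    """Run the toy pipeline end to end on random inputs and weights."""
    rng = np.random.default_rng(seed)
    size, latent_dim, grid, channels = 16, 32, 8, 4
    image = rng.random((size, size, 3))
    normal_map = rng.standard_normal((size, size, 3))
    w_enc = rng.standard_normal((size * size * 3, latent_dim)) * 0.01
    w_dec = rng.standard_normal((latent_dim, grid**3 * channels)) * 0.1
    w_imp = rng.standard_normal(channels + 4)
    latent = encode(image, w_enc)
    volume = decode(latent, w_dec, grid, channels)
    points = rng.random((num_points, 3))
    return implicit_occupancy(points, volume, normal_map, w_imp)
```

Calling `demo()` returns one occupancy value per query point. In a trained system of this general kind, a surface would typically be extracted from such a field at a fixed threshold; here the weights are random, so the values carry no geometric meaning.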