This paper introduces a novel pipeline that reconstructs the geometry of multiple interacting clothed people in a globally coherent scene space from a single image. The main challenge is occlusion: parts of a human body are not visible from a single view because they are occluded by other people or by the body itself, which leads to missing geometry and physical implausibility (e.g., penetration). We overcome this challenge by utilizing two human priors, one for complete 3D geometry and one for surface contacts. For the geometry prior, an encoder learns to regress an image of a person with missing body parts to latent vectors; a decoder decodes these vectors to produce 3D features of the associated geometry; and an implicit network combines these features with a surface normal map to reconstruct complete and detailed 3D humans. For the contact prior, we develop an image-space contact detector that outputs a probability distribution of surface contacts between people in 3D. We use these priors to globally refine the body poses, enabling penetration-free and accurate reconstruction of multiple interacting clothed people in the scene space. The results demonstrate that our method produces reconstructions that are more complete, globally coherent, and physically plausible than those of existing methods.
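The abstract does not state the objective used for the global refinement, so the following is only a minimal sketch of how a contact prior and a penetration penalty could be combined. It assumes that contact probabilities weight squared distances between surface points of two people, and that one person's body is approximated by spheres for the penetration test. All function names, the weight `w_pen`, the sphere proxy, and the toy data are illustrative assumptions rather than the paper's formulation. The sketch also optimizes only a global translation, whereas the paper refines body poses.

```python
import numpy as np

def contact_loss(a, b, P):
    """Probability-weighted mean squared distance between surface points.

    a: (Na, 3) points on person A, b: (Nb, 3) points on person B,
    P: (Na, Nb) assumed contact probabilities for each point pair.
    """
    diff = a[:, None, :] - b[None, :, :]      # (Na, Nb, 3) via broadcasting
    d2 = np.sum(diff ** 2, axis=-1)           # squared pairwise distances
    return float(np.sum(P * d2) / np.sum(P))

def penetration_loss(a, centers, radii):
    """Squared penetration depth of A's points into spheres standing in for B."""
    dist = np.linalg.norm(a[:, None, :] - centers[None, :, :], axis=-1)
    depth = np.maximum(0.0, radii[None, :] - dist)  # zero outside every sphere
    return float(np.sum(depth ** 2))

def total_loss(t, a, b, P, centers, radii, w_pen=5.0):
    """Contact term plus weighted penetration term after translating A by t."""
    moved = a + t
    return contact_loss(moved, b, P) + w_pen * penetration_loss(moved, centers, radii)

def refine_translation(a, b, P, centers, radii, steps=200, lr=0.05, eps=1e-4):
    """Gradient descent on a global translation, using central differences."""
    t = np.zeros(3)
    for _ in range(steps):
        grad = np.zeros(3)
        for k in range(3):
            e = np.zeros(3)
            e[k] = eps
            plus = total_loss(t + e, a, b, P, centers, radii)
            minus = total_loss(t - e, a, b, P, centers, radii)
            grad[k] = (plus - minus) / (2 * eps)
        t = t - lr * grad
    return t

# Toy data (hypothetical): three points on A's hand, two points on B's arm.
a = np.array([[0.0, 0.0, 0.0], [0.1, 0.0, 0.0], [0.2, 0.0, 0.0]])
b = np.array([[1.0, 0.0, 0.0], [1.0, 0.1, 0.0]])
P = np.array([[0.01, 0.01], [0.01, 0.01], [0.90, 0.05]])  # likely contact: a[2] with b[0]
centers = np.array([[1.3, 0.0, 0.0]])   # one sphere as a proxy for B's torso
radii = np.array([0.35])

initial = total_loss(np.zeros(3), a, b, P, centers, radii)
t = refine_translation(a, b, P, centers, radii)
final = total_loss(t, a, b, P, centers, radii)
print("translation:", t)
print("loss before:", initial, "after:", final)
```

In this toy setup the contact term pulls A's hand toward B's arm while the penetration term resists moving the hand into the torso sphere, so the optimum is a trade-off controlled by `w_pen`.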