Reconstructing dressed people from images is a popular task with promising applications in the creative media and game industries. However, most existing methods reconstruct the human body and garments as a single whole under the supervision of 3D models, which hinders downstream interaction tasks and requires hard-to-obtain data. To address these issues, we propose an unsupervised separated 3D garment and human reconstruction model (USR), which reconstructs the human body and authentic textured clothes in layers without 3D models. More specifically, our method uses a generalized surface-aware neural radiance field to learn the mapping between sparse multi-view images and the geometry of dressed people. Based on the full geometry, we introduce a Semantic and Confidence Guided Separation strategy (SCGS) to detect, segment, and reconstruct the clothes layer, leveraging the consistency between 2D semantics and 3D geometry. Moreover, we propose a Geometry Fine-tune Module to smooth edges. Extensive experiments on our dataset show that, compared with state-of-the-art methods, USR achieves improvements in both geometry and appearance reconstruction while generalizing to unseen people in real time. In addition, we introduce the SMPL-D model to show the benefit of modeling clothes and the human body separately, which allows clothes swapping and virtual try-on.