SCULPT: Shape-Conditioned Unpaired Learning of Pose-dependent Clothed and Textured Human Meshes

We present SCULPT, a novel 3D generative model for clothed and textured 3D meshes of humans. Specifically, we devise a deep neural network that learns to represent the geometry and appearance distribution of clothed human bodies. Training such a model is challenging, as datasets of textured 3D meshes for humans are limited in size and accessibility. Our key observation is twofold: medium-sized 3D scan datasets such as CAPE exist alongside large-scale 2D image datasets of clothed humans, and multiple appearances can be mapped to a single geometry. To learn effectively from the two data modalities, we propose an unpaired learning procedure for pose-dependent clothed and textured human meshes. First, we learn a pose-dependent geometry space from 3D scan data, represented as per-vertex displacements with respect to the SMPL model. Next, we train a geometry-conditioned texture generator in an unsupervised way using the 2D image data, conditioning it on intermediate activations of the learned geometry model. To alleviate entanglement between pose and clothing type, and between pose and clothing appearance, we condition both generators on attribute labels: clothing types for the geometry generator and clothing colors for the texture generator. We automatically generate these conditioning labels for the 2D images using the visual question answering model BLIP together with CLIP. We validate our method on the SCULPT dataset and compare it to state-of-the-art 3D generative models for clothed human bodies. Our code and data can be found at https://sculpt.is.tue.mpg.de.
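The data flow described in the abstract can be illustrated with a toy sketch. This is not the authors' architecture: the class names, layer sizes, label counts, texture resolution, and randomly initialised linear layers below are all hypothetical, and a zero array stands in for the posed SMPL body. Only the overall structure follows the abstract: a geometry generator conditioned on pose and clothing type outputs per-vertex displacements, and a texture generator is conditioned on the geometry generator's intermediate activations plus a clothing color label.

```python
import numpy as np

NUM_VERTICES = 6890      # vertex count of the SMPL body mesh
POSE_DIM = 72            # SMPL pose parameters (24 joints x 3 axis-angle values)
LATENT_DIM = 16          # hypothetical latent size
HIDDEN_DIM = 32          # hypothetical hidden size
NUM_CLOTHING_TYPES = 4   # hypothetical number of clothing-type labels
NUM_COLORS = 5           # hypothetical number of clothing-color labels
TEX_RES = 8              # hypothetical (tiny) texture resolution

rng = np.random.default_rng(0)


def one_hot(index, size):
    """Encode an integer attribute label as a one-hot vector."""
    vec = np.zeros(size)
    vec[index] = 1.0
    return vec


class ToyGeometryGenerator:
    """Maps (latent, pose, clothing type) to per-vertex displacements."""

    def __init__(self):
        in_dim = LATENT_DIM + POSE_DIM + NUM_CLOTHING_TYPES
        self.w1 = rng.normal(scale=0.1, size=(in_dim, HIDDEN_DIM))
        self.w2 = rng.normal(scale=0.01, size=(HIDDEN_DIM, NUM_VERTICES * 3))

    def __call__(self, z, pose, clothing_type):
        x = np.concatenate([z, pose, one_hot(clothing_type, NUM_CLOTHING_TYPES)])
        hidden = np.tanh(x @ self.w1)  # intermediate activations
        displacements = (hidden @ self.w2).reshape(NUM_VERTICES, 3)
        return displacements, hidden


class ToyTextureGenerator:
    """Maps (latent, geometry activations, clothing color) to an RGB texture."""

    def __init__(self):
        in_dim = LATENT_DIM + HIDDEN_DIM + NUM_COLORS
        self.w = rng.normal(scale=0.1, size=(in_dim, TEX_RES * TEX_RES * 3))

    def __call__(self, z, geometry_features, color):
        x = np.concatenate([z, geometry_features, one_hot(color, NUM_COLORS)])
        logits = x @ self.w
        texture = 1.0 / (1.0 + np.exp(-logits))  # squash values into (0, 1)
        return texture.reshape(TEX_RES, TEX_RES, 3)


geometry_generator = ToyGeometryGenerator()
texture_generator = ToyTextureGenerator()

body = np.zeros((NUM_VERTICES, 3))  # stand-in for posed SMPL body vertices
pose = rng.normal(scale=0.1, size=POSE_DIM)

displacements, features = geometry_generator(
    rng.normal(size=LATENT_DIM), pose, clothing_type=2
)
clothed = body + displacements      # clothed mesh = body + per-vertex offsets
texture = texture_generator(rng.normal(size=LATENT_DIM), features, color=1)
```

In this sketch the texture generator never sees the pose or the clothing type directly; it receives them only through the geometry generator's hidden activations, which is one simple way to read "geometry-conditioned" in the abstract.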

