Text-driven avatar generation has gained significant attention owing to its convenience. However, existing methods typically model the human body together with all garments as a single 3D model, which limits their usability in tasks such as clothing replacement and reduces user control over the generation process. To overcome these limitations, we propose DAGSM, a novel pipeline that generates disentangled human bodies and garments from given text prompts. Specifically, we model each part of the clothed human (e.g., body, upper/lower clothes) as one GS-enhanced mesh (GSM): a traditional mesh with 2D Gaussians attached, which better handles complicated textures (e.g., woolen or translucent clothes) and produces realistic cloth animations. During generation, we first create the unclothed body and then generate each garment in sequence on top of it, introducing a semantic-based algorithm to achieve better human-cloth and garment-garment separation. To improve texture quality, we propose a view-consistent texture refinement module, comprising a cross-view attention mechanism for texture style consistency and an incident-angle-weighted denoising (IAW-DE) strategy to update the appearance. Extensive experiments demonstrate that DAGSM generates high-quality disentangled avatars, supports clothing replacement and realistic animation, and outperforms the baselines in visual quality.
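To make the GSM representation concrete, the sketch below attaches 2D Gaussian splats to a triangle mesh via barycentric coordinates, so the splats follow the surface when the mesh deforms (e.g., during cloth animation). This is a minimal illustrative sketch under our own assumptions; the class name, parameterization, and all default values are hypothetical, not the paper's implementation.

```python
# Minimal sketch of a "GS-enhanced mesh" (GSM): a triangle mesh whose faces
# carry 2D Gaussian splats anchored in barycentric coordinates, so the splats
# move with the surface under deformation. All names and parameters here are
# illustrative assumptions, not the paper's actual implementation.
import numpy as np

class GSM:
    def __init__(self, vertices, faces, splats_per_face=4, rng=None):
        rng = rng or np.random.default_rng(0)
        self.vertices = np.asarray(vertices, dtype=np.float64)  # (V, 3)
        self.faces = np.asarray(faces, dtype=np.int64)          # (F, 3)
        F = len(self.faces)
        n = F * splats_per_face
        # Barycentric anchor of each splat on its parent face.
        self.face_id = np.repeat(np.arange(F), splats_per_face)  # (n,)
        self.bary = rng.dirichlet(np.ones(3), size=n)            # (n, 3)
        # 2D Gaussian parameters in each face's tangent plane (assumed layout).
        self.scale = np.full((n, 2), 0.05)   # tangent-plane extents
        self.rotation = np.zeros(n)          # in-plane angle (radians)
        self.opacity = np.full(n, 0.8)
        self.color = np.full((n, 3), 0.5)    # RGB albedo

    def splat_positions(self):
        """World-space splat centers; re-evaluating after moving the
        vertices makes the splats track the deforming mesh."""
        tri = self.vertices[self.faces[self.face_id]]  # (n, 3, 3)
        return np.einsum('nk,nkd->nd', self.bary, tri)

# Toy example: a unit square split into two triangles.
verts = [[0, 0, 0], [1, 0, 0], [1, 1, 0], [0, 1, 0]]
faces = [[0, 1, 2], [0, 2, 3]]
gsm = GSM(verts, faces)
centers = gsm.splat_positions()
print(centers.shape)  # (8, 3): 2 faces x 4 splats each
```

Because each splat is stored relative to its face rather than in world space, animating the underlying mesh (skinning, physics simulation) automatically carries the appearance along, which is what allows the per-garment GSMs to be swapped and animated independently.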