In this work, we tackle the task of learning 3D human Gaussians from a single image, focusing on recovering detailed appearance and geometry, including unobserved regions. We introduce a single-view generalizable Human Gaussian Model (HGM), which employs a novel generate-then-refine pipeline guided by a human body prior and a diffusion prior. Our approach first uses a ControlNet to refine back-view images rendered from the coarse predicted human Gaussians, and then uses the refined images together with the input image to reconstruct refined human Gaussians. To mitigate the generation of unrealistic human poses and shapes, we incorporate human priors from the SMPL-X model as a dual branch, propagating image features from the SMPL-X volume to the image Gaussians via sparse convolution and attention mechanisms. Since the initial SMPL-X estimate may be inaccurate, we gradually refine it with our HGM model. We validate our approach on several publicly available datasets. Our method outperforms previous methods in both novel view synthesis and surface reconstruction, and exhibits strong generalization in cross-dataset evaluation and on in-the-wild images.
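The generate-then-refine pipeline described above can be summarized as a two-stage loop: predict coarse Gaussians under the SMPL-X prior, render and refine the unobserved back view with a diffusion prior, then reconstruct from both views. The sketch below is only an illustrative outline; all function names are hypothetical placeholders, not the authors' implementation.

```python
# Hypothetical sketch of the HGM generate-then-refine pipeline.
# Every function here is a placeholder standing in for a learned module.

def predict_coarse_gaussians(front_image, smplx_params):
    """Stage 1: predict coarse human Gaussians from the single input view,
    conditioned on the SMPL-X body prior (dual-branch feature propagation)."""
    return {"gaussians": "coarse", "inputs": [front_image, smplx_params]}

def render_back_view(gaussians):
    """Render the unobserved back view from the coarse Gaussians."""
    return f"back_render_of_{gaussians['gaussians']}"

def controlnet_refine(back_render):
    """Stage 2a: refine the coarse back-view render with a ControlNet-style
    diffusion prior (placeholder for the actual diffusion step)."""
    return back_render.replace("coarse", "refined")

def reconstruct_refined_gaussians(front_image, refined_back):
    """Stage 2b: reconstruct refined Gaussians from the input view plus the
    refined back view."""
    return {"gaussians": "refined", "views": [front_image, refined_back]}

def hgm_pipeline(front_image, smplx_params):
    coarse = predict_coarse_gaussians(front_image, smplx_params)
    back = render_back_view(coarse)
    refined_back = controlnet_refine(back)
    return reconstruct_refined_gaussians(front_image, refined_back)
```

In practice each placeholder would be a trained network, and the SMPL-X estimate itself would be iteratively updated alongside the Gaussians, as noted above.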