March 16, 2023

One-shot Implicit Animatable Avatars with Model-based Priors

Yangyi Huang, Hongwei Yi, Weiyang Liu, Haofan Wang, Boxi Wu, Wenxiao Wang, Binbin Lin, Debing Zhang, Deng Cai

From arXiv. Project website: https://elicit3d.github.io

Existing neural rendering methods for creating human avatars typically either require dense input signals such as video or multi-view images, or leverage a learned prior from large-scale 3D human datasets so that reconstruction can be performed with sparse-view inputs. Most of these methods fail to achieve realistic reconstruction when only a single image is available. To enable the data-efficient creation of realistic animatable 3D humans, we propose ELICIT, a novel method for learning human-specific neural radiance fields from a single image. Inspired by the fact that humans can effortlessly estimate body geometry and imagine full-body clothing from a single image, we leverage two priors in ELICIT: a 3D geometry prior and a visual semantic prior. Specifically, ELICIT utilizes the 3D body shape geometry prior from a skinned vertex-based template model (i.e., SMPL) and implements the visual clothing semantic prior with CLIP-based pre-trained models. Both priors jointly guide the optimization toward plausible content in the invisible areas. Taking advantage of the CLIP models, ELICIT can also use text descriptions to generate text-conditioned content for unseen regions. To further improve visual details, we propose a segmentation-based sampling strategy that locally refines different parts of the avatar. Comprehensive evaluations on multiple popular benchmarks, including ZJU-MoCAP, Human3.6M, and DeepFashion, show that ELICIT outperforms strong baseline methods of avatar creation when only a single image is available. The code is public for research purposes at https://elicit3d.github.io/
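The abstract does not spell out how the CLIP-based semantic prior enters the optimization. One common way to use CLIP as guidance is a cosine-distance loss between the embedding of a rendered view and a reference embedding (from the input image or a text prompt). The sketch below illustrates only that general idea and is not ELICIT's actual loss: the function names are hypothetical, and the embeddings are plain Python lists standing in for the output of a CLIP encoder.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def semantic_prior_loss(rendered_embedding, reference_embedding):
    """Hypothetical semantic loss: cosine distance between embeddings.

    Assumption: both arguments are embeddings from the same encoder
    (for example a CLIP image or text encoder). The value is 0 when the
    embeddings point in the same direction and grows as they diverge.
    """
    return 1.0 - cosine_similarity(rendered_embedding, reference_embedding)


if __name__ == "__main__":
    # Placeholder vectors; a real pipeline would obtain these from an encoder.
    reference = [0.2, 0.9, 0.1]
    rendered_view = [0.25, 0.85, 0.05]
    print(semantic_prior_loss(rendered_view, reference))
```

In an actual training loop such a term would be computed on differentiable tensors and added to the reconstruction loss, so that views of unseen regions are pulled toward the semantics of the reference.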
