Operating effectively in novel real-world environments requires robotic systems to perceive and interact with previously unseen objects. Current state-of-the-art models address this challenge by using large amounts of training data and test-time samples to build black-box scene representations. In this work, we introduce a differentiable neuro-graphics model that combines neural foundation models with physics-based differentiable rendering to perform zero-shot scene reconstruction and robot grasping without relying on any additional 3D data or test-time samples. Our model solves a series of constrained optimization problems to estimate physically consistent scene parameters, such as meshes, lighting conditions, material properties, and 6D poses of previously unseen objects, from a single RGBD image and bounding boxes. We evaluated our approach on standard model-free few-shot pose-estimation benchmarks, where it outperforms existing algorithms. Furthermore, we validated the accuracy of our scene reconstructions by applying our algorithm to a zero-shot grasping task. By enabling zero-shot, physically consistent scene reconstruction and grasping without reliance on extensive datasets or test-time sampling, our approach offers a pathway towards more data-efficient, interpretable, and generalizable robot autonomy in novel environments.
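To give intuition for the render-and-compare optimization the abstract describes, the following is a minimal, purely illustrative sketch: a "renderer" here is just a rigid translation of toy model points, and gradient descent on a squared reprojection error recovers the pose offset. The paper's actual method uses a physics-based differentiable renderer over meshes, lighting, and materials; nothing below is from the paper's implementation.

```python
import numpy as np

# Toy render-and-compare pose optimization (illustrative only; the paper's
# method optimizes full scene parameters through a differentiable renderer).
model = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])  # toy "mesh" points
true_t = np.array([0.5, -0.3])                           # ground-truth pose offset
observed = model + true_t                                # simulated observation

t = np.zeros(2)   # pose estimate, initialized at identity
lr = 0.1          # gradient-descent step size
for _ in range(200):
    rendered = model + t                 # "render" the model at the current pose
    residual = rendered - observed       # per-point error against the observation
    grad = 2.0 * residual.mean(axis=0)   # analytic gradient of the mean squared error
    t -= lr * grad                       # gradient step toward the observed pose
```

In the real system, the same principle applies but the gradients flow through the rendering equation itself, so lighting and material parameters can be optimized jointly with pose.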