Neural Assembler: Learning to Generate Fine-Grained Robotic Assembly Instructions from Multi-View Images

Image-guided object assembly is an emerging research topic in computer vision. This paper introduces a novel task: translating multi-view images of a structural 3D model (for example, one constructed with building blocks drawn from a 3D-object library) into a detailed sequence of assembly instructions executable by a robotic arm. Given multi-view images of the target 3D model to be replicated, a model designed for this task must address several sub-tasks: recognizing the individual components used to construct the 3D model, estimating the geometric pose of each component, and deducing a feasible assembly order that adheres to physical rules. Establishing accurate 2D-3D correspondence between multi-view images and 3D objects is technically challenging. To tackle this, we propose an end-to-end model called the Neural Assembler. The model learns an object graph in which each vertex represents a component recognized in the images and the edges specify the topology of the 3D model, enabling the derivation of an assembly plan. We establish benchmarks for this task and conduct comprehensive empirical evaluations of Neural Assembler and alternative solutions. In these experiments, Neural Assembler outperforms the alternative solutions.
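
The abstract does not say how an assembly plan is read off the object graph, so the following is only a minimal sketch under stated assumptions: each vertex is a recognized component with a shape label and pose, each directed edge is a hypothetical "rests on" relation, and a feasible order is obtained with a plain topological sort (Kahn's algorithm). The `Component` schema, the `assembly_order` helper, and the example bricks are all illustrative names, not part of Neural Assembler.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    """One recognized brick: a vertex of the object graph (hypothetical schema)."""
    name: str
    shape: str              # entry in the 3D-object library, e.g. "2x4"
    position: tuple         # (x, y, z) in grid units
    yaw_degrees: int = 0    # rotation about the vertical axis


def assembly_order(components, supports):
    """Order components so each brick comes after every brick it rests on.

    supports: iterable of (lower, upper) name pairs meaning `upper` sits on
    `lower`. This edge semantics is an assumption made for illustration.
    """
    names = [c.name for c in components]
    children = {n: [] for n in names}
    pending = {n: 0 for n in names}      # number of unplaced supporters
    for lower, upper in supports:
        children[lower].append(upper)
        pending[upper] += 1

    ready = deque(n for n in names if pending[n] == 0)
    order = []
    while ready:
        current = ready.popleft()
        order.append(current)
        for child in children[current]:
            pending[child] -= 1
            if pending[child] == 0:
                ready.append(child)

    if len(order) != len(names):
        raise ValueError("support relation contains a cycle; no feasible order")
    return order


bricks = [
    Component("roof", "2x4", (0, 0, 2)),
    Component("base", "2x4", (0, 0, 0)),
    Component("left_wall", "1x2", (0, 0, 1)),
    Component("right_wall", "1x2", (3, 0, 1)),
]
edges = [
    ("base", "left_wall"),
    ("base", "right_wall"),
    ("left_wall", "roof"),
    ("right_wall", "roof"),
]
plan = assembly_order(bricks, edges)
print(plan)  # ['base', 'left_wall', 'right_wall', 'roof']
```

A topological sort only guarantees that supporting bricks are placed first. Physical rules such as collision-free insertion paths for the robotic arm would need additional checks that this sketch does not attempt.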

