Human-designed visual manuals are crucial components in shape assembly activities. They provide step-by-step guidance on how to move and connect different parts in a convenient and physically realizable way. While there has been an ongoing effort to build agents that perform assembly tasks, the information in human-designed manuals has been largely overlooked. We attribute this to 1) a lack of realistic 3D assembly objects that have paired manuals and 2) the difficulty of extracting structured information from purely image-based manuals. Motivated by this observation, we present IKEA-Manual, a dataset consisting of 102 IKEA objects paired with assembly manuals. We provide fine-grained annotations on the IKEA objects and assembly manuals, including decomposed assembly parts, assembly plans, manual segmentation, and 2D-3D correspondence between 3D parts and visual manuals. We illustrate the broad applicability of our dataset on four tasks related to shape assembly: assembly plan generation, part segmentation, pose estimation, and 3D part assembly.