Articulated Object Manipulation with Coarse-to-fine Affordance for Mitigating the Effect of Point Cloud Noise

3D articulated objects are inherently challenging to manipulate because of their varied geometries and intricate functionalities. Point-level affordance, which predicts a per-point actionable score and thus proposes the best point to interact with, has demonstrated excellent performance and generalization in articulated object manipulation. However, a significant challenge remains: previous works use perfect point clouds generated in simulation, so their models cannot be applied directly to noisy real-world point clouds. To tackle this challenge, we leverage a property of real-world scanned point clouds: the point cloud becomes less noisy as the camera moves closer to the object. We therefore propose a novel coarse-to-fine affordance learning pipeline that mitigates the effect of point cloud noise in two stages. In the first stage, we learn affordance on the noisy far point cloud, which covers the whole object, to propose an approximate place to manipulate. We then move the camera in front of that approximate place, scan a less noisy point cloud containing precise local geometry, and learn affordance on this point cloud to propose fine-grained final actions. The proposed method is thoroughly evaluated both on large-scale simulated noisy point clouds that mimic real-world scans and in real-world scenarios, where it outperforms existing methods, demonstrating its effectiveness in tackling the noisy real-world point cloud problem.
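The two-stage control flow described in the abstract can be sketched roughly as follows. This is an illustrative toy, not the authors' implementation: the linear distance-dependent Gaussian noise model, the `score_fn` callback standing in for a learned affordance network, and the `standoff` and `radius` parameters are all assumptions introduced here.

```python
import math
import random


def scan(points, camera, noise_per_meter, rng):
    """Simulated scan: Gaussian jitter whose std grows linearly with
    camera distance (an assumed noise model, not taken from the paper)."""
    cloud = []
    for p in points:
        sigma = noise_per_meter * math.dist(p, camera)
        cloud.append(tuple(c + rng.gauss(0.0, sigma) for c in p))
    return cloud


def best_point(cloud, score_fn):
    """Return the point with the highest affordance score."""
    return max(cloud, key=score_fn)


def coarse_to_fine(obj, far_cam, score_fn, rng,
                   noise_per_meter=0.01, standoff=0.3, radius=0.3):
    # Stage 1: noisy far scan of the whole object -> approximate place.
    far_cloud = scan(obj, far_cam, noise_per_meter, rng)
    coarse = best_point(far_cloud, score_fn)

    # Move the camera to `standoff` meters in front of the coarse point,
    # along the line back toward the far camera (hypothetical placement rule).
    d = math.dist(far_cam, coarse) or 1.0  # avoid division by zero
    near_cam = tuple(c + standoff * (f - c) / d
                     for c, f in zip(coarse, far_cam))

    # Stage 2: rescan only the local region; the shorter camera distance
    # yields smaller jitter under the assumed noise model.
    local = [p for p in obj if math.dist(p, coarse) <= radius] or obj
    near_cloud = scan(local, near_cam, noise_per_meter, rng)
    fine = best_point(near_cloud, score_fn)
    return coarse, near_cam, fine


if __name__ == "__main__":
    rng = random.Random(0)
    # Toy "object": a flat 10 x 10 grid of surface points.
    obj = [(x * 0.1, y * 0.1, 0.0) for x in range(10) for y in range(10)]
    handle = (0.5, 0.5, 0.0)  # hypothetical ground-truth interaction point

    def score(p):
        # Toy affordance: higher when closer to the handle.
        return -math.dist(p, handle)

    coarse, near_cam, fine = coarse_to_fine(obj, (0.5, 0.5, 2.0), score, rng)
    print("coarse proposal:", coarse)
    print("near camera:", near_cam)
    print("fine proposal:", fine)
```

In a real system `score_fn` would be replaced by a trained per-point affordance model and `scan` by an actual depth camera; the sketch only shows how the coarse proposal determines where the second, closer scan is taken.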

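Point cloud

A point cloud obtained by laser measurement contains 3D coordinates (XYZ) and laser reflection intensity (Intensity). A point cloud obtained by photogrammetry contains 3D coordinates (XYZ) and color information (RGB). A point cloud obtained by combining laser measurement and photogrammetry contains 3D coordinates (XYZ), laser reflection intensity (Intensity), and color information (RGB). Once the spatial coordinates of each sampled point on an object's surface have been acquired, the result is a set of points, called a "point cloud".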
