Open-Vocabulary Affordance Detection in 3D Point Clouds

Affordance detection is a challenging problem with a wide variety of robotic applications. Traditional affordance detection methods are limited to a predefined set of affordance labels, which can restrict the adaptability of intelligent robots in complex and dynamic environments. In this paper, we present the Open-Vocabulary Affordance Detection (OpenAD) method, which is capable of detecting an unbounded number of affordances in 3D point clouds. By simultaneously learning the affordance text and the point feature, OpenAD exploits the semantic relationships between affordances. As a result, the proposed method enables zero-shot detection: it can detect previously unseen affordances without a single annotation example. Extensive experimental results show that OpenAD works effectively on a wide range of affordance detection setups and outperforms other baselines by a large margin. Additionally, we demonstrate the practicality of the proposed OpenAD in real-world robotic applications with a fast inference speed (~100 ms). Our project is available at https://openad2023.github.io.
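The abstract does not spell out how text and point features are combined, so the following is only a minimal sketch of one common open-vocabulary pattern, not the OpenAD implementation. It assumes each point already has a feature vector and each affordance label already has a text embedding in the same space; the function names, the toy vectors, and the temperature value are all hypothetical. Each point is assigned the label whose embedding is most similar to its feature, which is why a new label can be added at inference time simply by embedding its text.

```python
import math


def cosine(u, v, eps=1e-12):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + eps)


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def label_points(point_features, text_embeddings, labels, temperature=0.07):
    """Assign each point the affordance label with the most similar text embedding.

    temperature is a hypothetical scaling factor, not a value from the paper.
    Returns a list of (label, probability) pairs, one per point.
    """
    results = []
    for feature in point_features:
        scores = [cosine(feature, t) / temperature for t in text_embeddings]
        probs = softmax(scores)
        best = max(range(len(labels)), key=probs.__getitem__)
        results.append((labels[best], probs[best]))
    return results


# Toy data (assumed): 3-dimensional embeddings for three affordance labels.
labels = ["grasp", "pour", "cut"]
text_embeddings = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
point_features = [[0.9, 0.1, 0.0], [0.1, 0.8, 0.2], [0.0, 0.2, 0.7]]

predictions = label_points(point_features, text_embeddings, labels)
for label, prob in predictions:
    print(label, round(prob, 3))
```

In a real system the point features would come from a point cloud backbone and the text embeddings from a language encoder; both are replaced here by hand-written vectors so the sketch stays self-contained.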


