Affordance detection is a challenging problem with a wide variety of robotic applications. Traditional affordance detection methods are limited to a predefined set of affordance labels, which can restrict the adaptability of intelligent robots in complex and dynamic environments. In this paper, we present the Open-Vocabulary Affordance Detection (OpenAD) method, which is capable of detecting an unbounded number of affordances in 3D point clouds. By simultaneously learning the affordance text and the point feature, OpenAD exploits the semantic relationships between affordances. As a result, the proposed method enables zero-shot detection: it can detect previously unseen affordances without a single annotated example. Extensive experiments show that OpenAD works effectively on a wide range of affordance detection setups and outperforms other baselines by a large margin. Additionally, we demonstrate the practicality of OpenAD in real-world robotic applications, where it achieves a fast inference speed (~100 ms).
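To make the idea of jointly using affordance text and point features concrete, the following is a minimal sketch, not the OpenAD implementation. It assumes that each point has a feature vector and each affordance label has a text embedding of the same dimension, and that points are scored against labels by temperature-scaled cosine similarity followed by a softmax. The function name `open_vocab_affordance_scores`, the `temperature` value, the label names, and the random stand-in embeddings are all hypothetical; the abstract does not specify these details.

```python
import numpy as np

def open_vocab_affordance_scores(point_feats, text_embeds, temperature=0.07):
    """Return an (N, K) array of per-point probabilities over K affordance labels.

    point_feats: (N, D) per-point features (stand-in for a point-cloud encoder output).
    text_embeds: (K, D) label embeddings (stand-in for a text encoder output).
    Both encoders are assumptions made for illustration.
    """
    eps = 1e-12  # guards against division by zero for degenerate vectors
    p = point_feats / (np.linalg.norm(point_feats, axis=1, keepdims=True) + eps)
    t = text_embeds / (np.linalg.norm(text_embeds, axis=1, keepdims=True) + eps)
    logits = (p @ t.T) / temperature             # scaled cosine similarity
    logits -= logits.max(axis=1, keepdims=True)  # numerically stable softmax
    probs = np.exp(logits)
    return probs / probs.sum(axis=1, keepdims=True)

# Toy data: random vectors stand in for real embeddings.
rng = np.random.default_rng(0)
labels = ["grasp", "pour", "cut"]                # hypothetical label set
text_embeds = rng.normal(size=(len(labels), 16))

# Each toy point feature is a slightly noisy copy of one label embedding.
point_label_ids = np.array([0, 2, 1, 1])
point_feats = text_embeds[point_label_ids] + 0.01 * rng.normal(size=(4, 16))

probs = open_vocab_affordance_scores(point_feats, text_embeds)
pred = probs.argmax(axis=1)
print([labels[i] for i in pred])
```

Because the label set enters only through `text_embeds`, a new affordance can be queried at inference time by appending its embedding as an extra row, with no change to the scoring function. This is one way an open-vocabulary, zero-shot setup can be realized under the assumptions above.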