Affordance detection is a challenging problem with a wide variety of robotic applications. Traditional affordance detection methods are limited to a predefined set of affordance labels, potentially restricting the adaptability of intelligent robots in complex and dynamic environments. In this paper, we present the Open-Vocabulary Affordance Detection (OpenAD) method, which is capable of detecting an unbounded number of affordances in 3D point clouds. By simultaneously learning the affordance text and the point feature, OpenAD exploits the semantic relationships between affordances. Therefore, our proposed method enables zero-shot detection and can detect previously unseen affordances without a single annotation example. Extensive experimental results show that OpenAD works effectively on a wide range of affordance detection setups and outperforms other baselines by a large margin. Additionally, we demonstrate the practicality of the proposed OpenAD in real-world robotic applications with a fast inference speed (~100 ms). Our project is available at https://openad2023.github.io.
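To make the idea of matching point features against affordance text concrete, the following is a minimal sketch and not the OpenAD implementation. It assumes that each point and each affordance label has already been embedded into a shared vector space, and it labels every point with the affordance whose text embedding has the highest cosine similarity. The function name `label_points`, the temperature value of 0.07, and the toy 3-dimensional embeddings are all hypothetical choices made for illustration.

```python
import math


def normalize(v):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def label_points(point_feats, text_feats, labels, temperature=0.07):
    """Assign each point the label whose text embedding is most similar.

    point_feats: one embedding per point (assumed output of a point encoder).
    text_feats:  one embedding per label (assumed output of a text encoder).
    temperature: hypothetical scaling factor, not taken from the paper.
    """
    text_unit = [normalize(t) for t in text_feats]
    results = []
    for p in point_feats:
        p_unit = normalize(p)
        sims = [
            sum(a * b for a, b in zip(p_unit, t)) / temperature
            for t in text_unit
        ]
        probs = softmax(sims)
        best = max(range(len(labels)), key=probs.__getitem__)
        results.append((labels[best], probs[best]))
    return results


# Toy, made-up embeddings; real ones would come from learned encoders.
labels = ["grasp", "pour", "sit"]
text_feats = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
point_feats = [[0.9, 0.1, 0.0], [0.1, 0.2, 0.95]]

predictions = label_points(point_feats, text_feats, labels)
print([name for name, _ in predictions])  # ['grasp', 'sit']
```

Under these assumptions, handling a previously unseen affordance only requires appending its name to `labels` and its text embedding to `text_feats`; no per-label classifier weights need to be trained, which is the property that makes zero-shot detection possible in this kind of design.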