Object affordance is an important concept in hand-object interaction: it describes action possibilities based on human motor capacity and objects' physical properties, and thus benefits tasks such as action anticipation and robot imitation learning. However, the definitions of affordance in existing datasets often 1) conflate affordance with object functionality, 2) confuse affordance with goal-related action, and 3) ignore human motor capacity. This paper proposes an efficient annotation scheme that addresses these issues by combining goal-irrelevant motor actions and grasp types as affordance labels and by introducing the concept of mechanical action to represent the action possibilities between two objects. We provide new annotations by applying this scheme to the EPIC-KITCHENS dataset and test them on tasks such as affordance recognition, hand-object interaction hotspot prediction, and cross-domain evaluation of affordance. The results show that models trained with our annotations can distinguish affordance from other concepts, predict fine-grained interaction possibilities on objects, and generalize across different domains.