Object affordance is an important concept in hand-object interaction: it describes the action possibilities that arise from human motor capacity and objects' physical properties, and thus benefits tasks such as action anticipation and robot imitation learning. However, existing datasets often define affordance in ways that 1) conflate affordance with object functionality; 2) confuse affordance with goal-related action; and 3) ignore human motor capacity. This paper proposes an efficient annotation scheme that addresses these issues by combining goal-irrelevant motor actions and grasp types as affordance labels, and by introducing the concept of mechanical action to represent the action possibilities between two objects. We apply this scheme to the EPIC-KITCHENS dataset to provide new annotations, and evaluate them on tasks such as affordance recognition, hand-object interaction hotspot prediction, and cross-domain evaluation of affordance. The results show that models trained with our annotations can distinguish affordance from other concepts, predict fine-grained interaction possibilities on objects, and generalize across domains.
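To make the structure of the proposed labels concrete, the following is a minimal sketch of how one annotation might be represented. It is an illustration under stated assumptions, not the annotation format released with the paper: the class names, field names, and the example values (`"hold"`, `"power"`, `"cut"`, `"knife"`, `"carrot"`, `"drawer"`, `"pull"`, `"hook"`) are all hypothetical. Only the overall shape comes from the abstract: an affordance label pairs a goal-irrelevant motor action with a grasp type, and a mechanical action optionally describes the relation between two objects.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AffordanceLabel:
    """Hypothetical affordance label: a motor action combined with a grasp type."""
    motor_action: str  # goal-irrelevant motor action (example values are assumptions)
    grasp_type: str    # grasp type used by the hand (example values are assumptions)

    def as_class_name(self) -> str:
        # Join the two parts into a single class name for a recognition model.
        return f"{self.motor_action}+{self.grasp_type}"


@dataclass(frozen=True)
class InteractionAnnotation:
    """Hypothetical record for one annotated hand-object interaction."""
    object_name: str
    affordance: AffordanceLabel
    # Mechanical action: the action possibility between two objects.
    # Left as None when only the hand and a single object are involved.
    mechanical_action: Optional[str] = None
    target_object: Optional[str] = None


# A hand holding a knife that acts on a second object.
tool_use = InteractionAnnotation(
    object_name="knife",
    affordance=AffordanceLabel(motor_action="hold", grasp_type="power"),
    mechanical_action="cut",
    target_object="carrot",
)

# A hand acting on a single object, with no mechanical action.
direct_use = InteractionAnnotation(
    object_name="drawer",
    affordance=AffordanceLabel(motor_action="pull", grasp_type="hook"),
)
```

Keeping the hand-object affordance and the object-object mechanical action in separate fields mirrors the distinction drawn in the abstract: the first depends on human motor capacity, while the second concerns only the two objects.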