In this paper, we present a novel approach for learning bimanual manipulation actions from human demonstration by extracting spatial constraints between the affordance regions of the objects involved; we call these affordance constraints. Affordance regions are object parts that offer an agent possibilities for interaction. For example, the bottom of a bottle allows the bottle to be placed on a surface, while its spout allows the contained liquid to be poured. The approach learns how affordance constraints change over the course of a human demonstration and uses these changes to construct spatial bimanual action models that represent object interactions. To exploit the information encoded in these models, we formulate an optimization problem that determines optimal object configurations across multiple execution keypoints while taking into account the initial scene, the learned affordance constraints, and the robot's kinematics. We evaluate the approach in simulation on two example tasks (pouring drinks and rolling dough) and compare three definitions of affordance constraints: (i) component-wise distances between affordance regions in Cartesian space, (ii) component-wise distances between affordance regions in cylindrical space, and (iii) degrees of satisfaction of manually defined symbolic spatial affordance constraints.
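To make definitions (i) and (ii) concrete, the following minimal sketch computes component-wise differences between two affordance regions in Cartesian and in cylindrical coordinates. It rests on assumptions not stated in the abstract: each affordance region is reduced to a single reference point, the cylindrical axis is the z axis through the first region, and the function names and coordinates are hypothetical. It illustrates the idea and is not the formulation used in the paper.

```python
import math

def cartesian_difference(p, q):
    """Component-wise difference (dx, dy, dz) from point p to point q."""
    return tuple(b - a for a, b in zip(p, q))

def cylindrical_difference(p, q):
    """Difference from p to q as (r, phi, dz).

    Assumption: the cylinder axis is the z axis passing through p.
    """
    dx, dy, dz = cartesian_difference(p, q)
    r = math.hypot(dx, dy)    # radial distance in the xy plane
    phi = math.atan2(dy, dx)  # azimuth angle in radians
    return (r, phi, dz)

# Hypothetical reference points (metres) for two affordance regions.
spout = (0.0, 0.0, 0.30)
cup_opening = (0.03, 0.04, 0.10)

cart = cartesian_difference(spout, cup_opening)
cyl = cylindrical_difference(spout, cup_opening)
print("Cartesian:", cart)
print("Cylindrical:", cyl)
```

In this sketch the cylindrical form does not change when the second point is rotated about the axis, as long as only the radial and height components are compared. That may suit tasks such as pouring, where the approach direction around the container matters less than distance and height.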