Humans use semantic concepts such as spatial relations between objects to describe scenes and communicate tasks such as "Put the tea to the right of the cup" or "Move the plate between the fork and the spoon." Like children, assistive robots must be able to learn the sub-symbolic meaning of such concepts from human demonstrations and instructions. We address the problem of incrementally learning geometric models of spatial relations from few demonstrations collected online during interaction with a human. Such models enable a robot to manipulate objects so as to fulfill desired spatial relations specified by verbal instructions. Initially, we assume the robot has no geometric model of spatial relations. Given a task such as those above, the robot asks the user to demonstrate the task once and creates a model from this single demonstration, using a cylindrical probability distribution as a generative representation of spatial relations. We show how this model can be updated incrementally and sample-efficiently with each new demonstration, without access to past examples, using incremental maximum likelihood estimation, and we demonstrate the approach on a real humanoid robot.
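To make the idea of updating a model without access to past examples concrete, the following is a minimal sketch and not the method of this work. It assumes, purely for illustration, that a demonstrated object placement is expressed in cylindrical coordinates (radius `r`, azimuth `phi`, height `h`) relative to a reference object, that `r` and `h` are modeled as independent Gaussians, and that `phi` is summarized by its circular mean direction. The class name `IncrementalCylindricalStats` and all of its fields are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class IncrementalCylindricalStats:
    """Running sufficient statistics; no past demonstrations are stored.

    Hypothetical illustration: r and h are treated as independent
    Gaussians, phi is summarized by its circular mean direction.
    """
    n: int = 0
    mean_r: float = 0.0
    m2_r: float = 0.0   # sum of squared deviations of r from its mean
    mean_h: float = 0.0
    m2_h: float = 0.0   # sum of squared deviations of h from its mean
    sum_cos: float = 0.0
    sum_sin: float = 0.0

    def update(self, r: float, phi: float, h: float) -> None:
        """Fold one new demonstration into the running statistics."""
        self.n += 1
        # Welford-style update of mean and squared deviations for r.
        delta = r - self.mean_r
        self.mean_r += delta / self.n
        self.m2_r += delta * (r - self.mean_r)
        # Same update for h.
        delta = h - self.mean_h
        self.mean_h += delta / self.n
        self.m2_h += delta * (h - self.mean_h)
        # Angles are accumulated as unit vectors to respect wrap-around.
        self.sum_cos += math.cos(phi)
        self.sum_sin += math.sin(phi)

    def var_r(self) -> float:
        """Maximum likelihood (biased) variance of r."""
        return self.m2_r / self.n

    def var_h(self) -> float:
        """Maximum likelihood (biased) variance of h."""
        return self.m2_h / self.n

    def mean_phi(self) -> float:
        """Circular mean direction of phi, in radians."""
        return math.atan2(self.sum_sin, self.sum_cos)


# Usage: two demonstrations are folded in one at a time.
stats = IncrementalCylindricalStats()
stats.update(r=1.0, phi=0.0, h=0.0)
stats.update(r=3.0, phi=math.pi / 2, h=2.0)
print(stats.mean_r, stats.var_r(), stats.mean_phi())
```

After a single demonstration the maximum likelihood variance in this sketch is zero, so a practical system would presumably need a prior or a minimum variance before sampling from the model; how that case is handled here is an assumption left open by the sketch.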