In this paper, we propose an affordance model, built on Conditional Neural Processes, that can predict effect trajectories given object, action, or effect information at any time. Affordances are encoded in a latent representation that combines object, action, and effect channels. The model can predict the intermediate effects expected from partial action executions, and we use this capability to make multi-step plans that include partial actions in order to achieve goals. We first show that our model makes accurate continuous effect predictions. We compare our model with a recent LSTM-based effect predictor on an existing dataset that includes lever-up actions. Next, we show that our model generates accurate effect predictions for push and grasp actions. Finally, we show that our system generates successful multi-step plans that bring objects to desired positions. Importantly, the proposed system generates more accurate and effective plans with partial action executions than plans that consider only full action executions. Although continuous effect prediction and multi-step planning based on learned affordances have been studied in the literature, continuous affordance and effect predictions have not previously been used to make accurate and fine-grained plans.
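To make the Conditional Neural Process idea concrete, the following is a minimal sketch of a generic CNP-style forward pass, not the architecture used in this work. It encodes a few observed (time, effect) points, averages them into one latent vector, and decodes a mean and standard deviation at arbitrary query times. The names `predict`, `EFFECT_DIM`, and `LATENT_DIM`, the layer sizes, and the random untrained weights are all illustrative assumptions, and the object and action channels are omitted for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_mlp(sizes):
    """Random (untrained) weights for a small fully connected network."""
    return [(rng.normal(0.0, 0.5, (m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def mlp(params, x):
    """Apply the network: ReLU on hidden layers, linear output layer."""
    for i, (w, b) in enumerate(params):
        x = x @ w + b
        if i < len(params) - 1:
            x = np.maximum(x, 0.0)
    return x

EFFECT_DIM = 2   # assumption: e.g. planar displacement of an object
LATENT_DIM = 16  # assumption: size of the aggregated latent vector

# Encoder maps one observed (t, effect) pair to a latent vector.
encoder = init_mlp([1 + EFFECT_DIM, 32, LATENT_DIM])
# Decoder maps (latent, query time) to a mean and a raw std per effect dim.
decoder = init_mlp([LATENT_DIM + 1, 32, 2 * EFFECT_DIM])

def predict(obs_t, obs_effect, query_t):
    """Predict effect mean and std at query times from a few observations."""
    pairs = np.concatenate([obs_t[:, None], obs_effect], axis=1)
    r = mlp(encoder, pairs).mean(axis=0)  # order-invariant aggregation
    dec_in = np.concatenate(
        [np.tile(r, (len(query_t), 1)), query_t[:, None]], axis=1)
    out = mlp(decoder, dec_in)
    mean, raw_std = out[:, :EFFECT_DIM], out[:, EFFECT_DIM:]
    std = 0.01 + np.logaddexp(0.0, raw_std)  # softplus keeps std positive
    return mean, std

# Two observations from the start of an action, queried along its full course.
obs_t = np.array([0.0, 0.3])
obs_effect = np.array([[0.0, 0.0], [0.05, 0.01]])
query_t = np.linspace(0.0, 1.0, 5)
mean, std = predict(obs_t, obs_effect, query_t)
print(mean.shape, std.shape)  # (5, 2) (5, 2)
```

Because the decoder accepts any query time, the same forward pass can in principle return the effect expected partway through an action, which is the kind of intermediate prediction a planner with partial action executions would rely on. With untrained weights the numbers themselves are meaningless; only the data flow is illustrated.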