Machine Learning methods, such as those from the Reinforcement Learning (RL) literature, have increasingly been applied to robot control problems. However, such control methods often remain data-inefficient, even when they learn the environment dynamics (e.g. as in Model-Based RL/control). Furthermore, the decisions made by learned policies and the estimates produced by learned dynamics models, unlike those of their hand-designed counterparts, are not readily interpretable by a human user without the use of Explainable AI techniques. This has several disadvantages, such as making debugging and integration into safety-critical systems more difficult. On the other hand, in many robotic systems, prior knowledge of the environment kinematics and dynamics is at least partially available (e.g. from classical mechanics). Arguably, incorporating such priors into the environment model or the decision process can help address the aforementioned problems: it reduces problem complexity and the amount of exploration required, while also making it easier to express the decisions taken by the agent in terms of physically meaningful entities. Our aim with this paper is to illustrate and support this point of view. We model a payload manipulation problem based on a real robotic system, and show that leveraging prior knowledge about the dynamics of the environment can improve explainability, safety, and data-efficiency, yielding satisfactory generalization properties with less data.