Imitation Learning (IL) is a powerful technique for intuitive robotic programming. However, ensuring the reliability of learned behaviors remains a challenge. In the context of reaching motions, a robot should consistently reach its goal regardless of its initial conditions. To meet this requirement, IL methods often employ specialized function approximators that guarantee this property by construction. Although effective, these approaches come with a set of limitations: 1) they are unable to fully exploit the capabilities of modern Deep Neural Network (DNN) architectures, 2) some are restricted in the family of motions they can model, resulting in suboptimal IL capabilities, and 3) they require explicit extensions to account for the geometry of motions that consider orientations. To address these challenges, we introduce a novel stability loss function that does not constrain the DNN's architecture, drawing inspiration from the triplet loss used in the deep metric learning literature. This loss enables learning policies that yield accurate results. Furthermore, it is not restricted to a specific state space geometry; therefore, it can easily incorporate the geometry of the robot's state space. We provide a proof of the stability properties induced by this loss and empirically validate our method in various settings. These settings include Euclidean and non-Euclidean state spaces, as well as first-order and second-order motions, both in simulation and with real robots. More details about the experimental results can be found at https://youtu.be/ZWKLGntCI6w.
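To give intuition for the triplet-inspired idea the abstract alludes to, the sketch below shows a hypothetical contraction-style hinge loss: treating the goal as the anchor, it penalizes the policy whenever a successor state is not closer to the goal than the current state by some margin. The function name, the Euclidean distance, and the margin formulation are all illustrative assumptions for Euclidean state spaces, not the paper's actual loss; the paper's method also covers non-Euclidean geometries, which this toy example does not.

```python
# Illustrative sketch only: a triplet-style "stability" hinge loss.
# Anchor = goal, positive = successor state, negative = current state.
# Assumed names and Euclidean metric are NOT the paper's actual API.
import numpy as np

def stability_triplet_loss(goal, next_state, curr_state, margin=0.1):
    """Penalize trajectories whose successor state is not at least
    `margin` closer to the goal than the current state (contraction)."""
    d_pos = np.linalg.norm(next_state - goal)  # goal <-> successor
    d_neg = np.linalg.norm(curr_state - goal)  # goal <-> current state
    return max(0.0, d_pos - d_neg + margin)
```

Averaged over states sampled along policy rollouts, driving such a hinge to zero would enforce that the learned dynamics monotonically approach the goal, which is the kind of stability property the abstract's loss is proved to induce.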