Imitation Learning (IL) is a powerful technique for intuitive robotic programming. However, ensuring the reliability of learned behaviors remains a challenge. In the context of reaching motions, a robot should consistently reach its goal, regardless of its initial conditions. To meet this requirement, IL methods often employ specialized function approximators that guarantee this property by construction. Although effective, these approaches come with a set of limitations: 1) they are unable to fully exploit the capabilities of modern Deep Neural Network (DNN) architectures, 2) some are restricted in the family of motions they can model, resulting in suboptimal IL capabilities, and 3) they require explicit extensions to account for the geometry of motions that consider orientations. To address these challenges, we introduce a novel stability loss function, drawing inspiration from the triplet loss used in the deep metric learning literature. This loss does not constrain the DNN's architecture and enables learning policies that yield accurate results. Furthermore, it is not restricted to a specific state space geometry; therefore, it can easily incorporate the geometry of the robot's state space. We provide a proof of the stability properties induced by this loss and empirically validate our method in various settings. These settings include Euclidean and non-Euclidean state spaces, as well as first-order and second-order motions, both in simulation and with real robots. More details about the experimental results can be found in: https://youtu.be/ZWKLGntCI6w.
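For readers unfamiliar with the triplet loss mentioned above, the following is a minimal sketch of its standard form from deep metric learning. It is not the stability loss proposed in this work, whose exact definition is not given in the abstract. The function name `triplet_loss`, the Euclidean distance, and the `margin` value are illustrative assumptions. The loss is zero when the anchor is closer to the positive sample than to the negative sample by at least the margin, and grows linearly otherwise.

```python
import math


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet loss for one (anchor, positive, negative) triple.

    Illustrative only: this is the generic metric-learning loss, not the
    stability loss of the paper. Distances are Euclidean by assumption.
    """
    d_pos = math.dist(anchor, positive)  # distance anchor -> positive
    d_neg = math.dist(anchor, negative)  # distance anchor -> negative
    # Penalize triples where the positive is not closer than the
    # negative by at least `margin`.
    return max(0.0, d_pos - d_neg + margin)


# Hypothetical 2-D points chosen so the distances are easy to check by hand.
anchor = (0.0, 0.0)
near = (0.0, 1.0)  # distance 1 from the anchor
far = (3.0, 4.0)   # distance 5 from the anchor

loss_satisfied = triplet_loss(anchor, near, far)  # 1 - 5 + 1 < 0, clipped to 0
loss_violated = triplet_loss(anchor, far, near)   # 5 - 1 + 1 = 5
```

Because the loss depends on the model only through a distance function, swapping `math.dist` for a different metric does not change its structure, which is consistent with the abstract's claim that the approach is not tied to a specific state space geometry.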