Imitation Learning (IL) is a powerful technique for intuitive robotic programming. However, ensuring the reliability of learned behaviors remains a challenge. In the context of reaching motions, a robot should consistently reach its goal, regardless of its initial conditions. To meet this requirement, IL methods often employ specialized function approximators that guarantee this property by construction. Although effective, these approaches have several limitations: 1) they are unable to fully exploit the capabilities of modern Deep Neural Network (DNN) architectures, 2) some are restricted in the family of motions they can model, resulting in suboptimal IL capabilities, and 3) they require explicit extensions to handle the geometry of motions that involve orientations. To address these challenges, we introduce a novel stability loss function, drawing inspiration from the triplet loss used in the deep metric learning literature. This loss does not constrain the DNN's architecture and enables learning policies that yield accurate results. Furthermore, it is easily adaptable to the geometry of the robot's state space. We provide a proof of the stability properties induced by this loss and empirically validate our method in various settings. These settings include Euclidean and non-Euclidean state spaces, as well as first-order and second-order motions, both in simulation and with real robots. More details about the experimental results can be found at: https://youtu.be/ZWKLGntCI6w.
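For readers unfamiliar with the triplet loss that the abstract cites as inspiration, the following is a minimal sketch of its standard form from deep metric learning. It is not the stability loss proposed in this work, whose exact form the abstract does not give. The function name `triplet_loss`, the `margin` value, and the sample points are illustrative assumptions.

```python
import math

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet margin loss for a single triplet of points.

    Generic metric-learning form, shown for illustration only; it is
    NOT the stability loss introduced in the paper.
    """
    d_pos = math.dist(anchor, positive)  # Euclidean distance anchor -> positive
    d_neg = math.dist(anchor, negative)  # Euclidean distance anchor -> negative
    # Zero once the positive is closer than the negative by at least `margin`.
    return max(0.0, d_pos - d_neg + margin)

# Hypothetical 2-D points chosen only to exercise both branches.
anchor = (0.0, 0.0)
near = (0.1, 0.0)
far = (2.0, 0.0)

loss_satisfied = triplet_loss(anchor, near, far, margin=0.5)  # constraint holds
loss_violated = triplet_loss(anchor, far, near, margin=0.5)   # constraint violated
```

The sketch uses the Euclidean distance. For a non-Euclidean state space, such as one that includes orientations, one would presumably replace `math.dist` with a distance suited to that space; that substitution is an assumption here, not something the abstract specifies.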