Adapting robot trajectories efficiently is a priority for improving overall robot performance. One approach to trajectory planning is to transfer human-like skills to robots through Learning from Demonstrations (LfD), in which the human demonstration is treated as the target motion to mimic. However, human motion is typically optimal for the human embodiment but not for robots, because human biomechanics differ from robot dynamics. The Dynamic Movement Primitives (DMP) framework is a viable solution to this limitation of LfD, but it requires tuning the second-order dynamics in its formulation. Our contribution is a systematic method that extracts dynamic features from human demonstrations to auto-tune the parameters of the DMP framework. Beyond its use with LfD, the proposed method can readily be used in conjunction with Reinforcement Learning (RL) for robot training. In this way, the extracted features facilitate the transfer of human skills by allowing the robot to explore possible trajectories more efficiently and by significantly increasing robot compliance. We introduce a methodology that extracts the dynamic features from multiple trajectories by optimizing human-likeness and similarity in the parametric space. The method was implemented on an actual human-robot setup to extract human dynamic features, which were then used to regenerate robot trajectories following both LfD and RL with DMP. The robot performed stably and maintained a degree of human-likeness, measured by accumulated distance error, as good as that of the best heuristic tuning.
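To make concrete what "tuning the second-order dynamics" refers to, the sketch below integrates the transformation system of a DMP in the form it is commonly written in the literature, tau * dz/dt = alpha_z * (beta_z * (g - y) - z) + f and tau * dy/dt = z. It is not the auto-tuning method proposed in this work. The names `alpha_z`, `beta_z`, `tau`, and `forcing` are assumptions chosen for illustration, and `accumulated_distance_error` assumes a sum of absolute pointwise differences, since the abstract does not define the metric.

```python
import numpy as np


def rollout_dmp(y0, goal, alpha_z, beta_z, tau=1.0, dt=0.001,
                duration=1.0, forcing=None):
    """Integrate a one-dimensional DMP transformation system with explicit Euler.

    tau * dz/dt = alpha_z * (beta_z * (goal - y) - z) + f(t)
    tau * dy/dt = z
    `forcing` is an optional callable f(t); None means f = 0 (hypothetical API).
    """
    n_steps = int(round(duration / dt))
    y, z = float(y0), 0.0
    path = np.empty(n_steps + 1)
    path[0] = y
    for k in range(n_steps):
        f = 0.0 if forcing is None else forcing(k * dt)
        dz = (alpha_z * (beta_z * (goal - y) - z) + f) / tau
        dy = z / tau
        z += dz * dt
        y += dy * dt
        path[k + 1] = y
    return path


def accumulated_distance_error(reference, candidate):
    """Sum of absolute pointwise differences (assumed definition of the metric)."""
    reference = np.asarray(reference, dtype=float)
    candidate = np.asarray(candidate, dtype=float)
    return float(np.sum(np.abs(reference - candidate)))


if __name__ == "__main__":
    # beta_z = alpha_z / 4 gives critical damping for the unforced system.
    critical = rollout_dmp(0.0, 1.0, alpha_z=25.0, beta_z=6.25)
    # A larger beta_z makes the same system underdamped, so it overshoots the goal.
    underdamped = rollout_dmp(0.0, 1.0, alpha_z=25.0, beta_z=100.0)
    print("peak (critical):   ", critical.max())
    print("peak (underdamped):", underdamped.max())
    print("accumulated error: ", accumulated_distance_error(critical, underdamped))
```

With the forcing term set to zero, the choice beta_z = alpha_z / 4 makes the system critically damped, which is the usual heuristic default. Other choices change how the robot approaches the goal, and these are the gains that the abstract proposes to derive from human demonstrations rather than set by hand.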