Trajectory optimization (TO) is an efficient tool for generating a redundant manipulator's joint trajectory that follows a 6-dimensional Cartesian path. Its performance depends largely on the quality of the initial trajectory. However, selecting a high-quality initial trajectory is non-trivial and requires a considerable time budget, because the space of solution trajectories is extremely large and prior knowledge about task constraints in configuration space is lacking. To alleviate this issue, we present a learning-based method that generates high-quality initial trajectories within a short time budget by adopting example-guided reinforcement learning. In addition, we propose a null-space projected imitation reward that accounts for null-space constraints by efficiently learning the kinematically feasible motion captured in expert demonstrations. Our statistical evaluation in simulation shows that plugging our method's output into TO improves its optimality, efficiency, and applicability compared with three baselines. We also demonstrate the performance improvement and feasibility through real-world experiments with a seven-degree-of-freedom manipulator.
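To make the idea of a null-space projected imitation reward more concrete, the following is a minimal sketch of one way such a reward could be formed. It is not the formulation used in this work, which the abstract does not specify. The sketch assumes the standard projector N = I - J⁺J for a task Jacobian J, and a hypothetical exponential reward on the null-space component of the deviation from a demonstrated joint configuration. The function names, the `scale` parameter, and the exponential form are all illustrative assumptions.

```python
import numpy as np


def null_space_projector(J):
    """Return N = I - pinv(J) @ J, which projects joint-space vectors
    onto the null space of the task Jacobian J (shape: task_dim x n_joints)."""
    n_joints = J.shape[1]
    return np.eye(n_joints) - np.linalg.pinv(J) @ J


def null_space_imitation_reward(q, q_demo, J, scale=1.0):
    """Hypothetical imitation reward (an assumption, not the paper's formula).

    Only the null-space component of the deviation (q_demo - q) is penalized,
    so joint differences that would move the end effector do not affect
    this term. The reward lies in (0, 1] and equals 1 when the null-space
    deviation is zero.
    """
    N = null_space_projector(J)
    err = N @ (q_demo - q)
    return float(np.exp(-scale * (err @ err)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Stand-in 6x7 Jacobian for a seven-degree-of-freedom arm (random, for illustration only).
    J = rng.standard_normal((6, 7))
    q_demo = rng.standard_normal(7)
    q = q_demo + 0.1 * rng.standard_normal(7)
    print(null_space_imitation_reward(q, q_demo, J))
```

In this sketch, a deviation that lies entirely in the row space of J (that is, one of the form pinv(J) @ v) leaves the reward at 1. This reflects the design intent of separating redundancy resolution from Cartesian path tracking.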