A Data-Driven Approach to Synthesizing Dynamics-Aware Trajectories for Underactuated Robotic Systems

We consider joint trajectory generation and tracking control for underactuated robotic systems. A common solution is to use a layered control architecture, where the top layer uses a simplified model of system dynamics for trajectory generation, and the low layer ensures approximate tracking of this trajectory via feedback control. While such layered control architectures are standard and work well in practice, selecting the simplified model used for trajectory generation typically relies on engineering intuition and experience. In this paper, we propose an alternative data-driven approach to dynamics-aware trajectory generation. We show that a suitable augmented Lagrangian reformulation of a global nonlinear optimal control problem results in a layered decomposition of the overall problem into trajectory planning and feedback control layers. Crucially, the resulting trajectory optimization is dynamics-aware, in that it is modified with a tracking penalty regularizer encoding the dynamic feasibility of the generated trajectory. We show that this tracking penalty regularizer can be learned from system rollouts for independently designed low-layer feedback control policies, and instantiate our framework in the context of a unicycle and a quadrotor control problem in simulation. Further, we show that our approach handles the sim-to-real gap through experiments on the quadrotor hardware platform without any additional training. For both the synthetic unicycle example and the quadrotor system, our framework shows significant improvements in both computation time and dynamic feasibility in simulation and hardware experiments.
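To make the idea of learning a tracking penalty from rollouts concrete, the following is a minimal sketch under several assumptions that do not come from the paper: a kinematic unicycle with a hand-written proportional controller stands in for the low-layer policy, the reference trajectories are random walks, and the penalty is a linear least-squares fit on simple smoothness features. All names (`rollout_tracking_cost`, `features`, `learned_penalty`) and all constants are hypothetical, and the paper's actual parameterization of the regularizer may differ.

```python
import numpy as np

DT, T = 0.1, 30   # assumed step size [s] and number of waypoints
V_MAX = 2.0       # assumed speed limit of the hypothetical unicycle [m/s]

def rollout_tracking_cost(ref, k_v=2.0, k_w=4.0):
    """Roll out a unicycle under a simple proportional controller (a stand-in
    low-layer policy) and return the summed squared position tracking error."""
    x, y, th = ref[0, 0], ref[0, 1], 0.0
    cost = 0.0
    for px, py in ref[1:]:
        dx, dy = px - x, py - y
        v = min(k_v * np.hypot(dx, dy), V_MAX)            # saturated speed command
        err = (np.arctan2(dy, dx) - th + np.pi) % (2 * np.pi) - np.pi
        w = k_w * err                                     # turn toward the waypoint
        x += DT * v * np.cos(th)                          # forward-Euler unicycle step
        y += DT * v * np.sin(th)
        th += DT * w
        cost += (px - x) ** 2 + (py - y) ** 2
    return float(cost)

def features(ref):
    """Hand-picked features of a reference (an assumption): bias, summed squared
    first differences, and summed squared second differences."""
    d1 = np.diff(ref, axis=0)
    d2 = np.diff(ref, n=2, axis=0)
    return np.array([1.0, np.sum(d1 ** 2), np.sum(d2 ** 2)])

# Collect rollouts on random-walk references of varying aggressiveness.
rng = np.random.default_rng(0)
refs = [np.cumsum(rng.uniform(0.02, 0.3) * rng.normal(size=(T, 2)), axis=0)
        for _ in range(200)]
Phi = np.stack([features(r) for r in refs])
costs = np.array([rollout_tracking_cost(r) for r in refs])

# Fit the penalty by ordinary least squares on (features, observed cost) pairs.
weights, *_ = np.linalg.lstsq(Phi, costs, rcond=None)

def learned_penalty(ref):
    """Predicted tracking cost of a candidate reference trajectory."""
    return float(features(ref) @ weights)
```

In a layered scheme of the kind described above, a planner could then minimize its nominal objective plus `learned_penalty(ref)`, so that references the low-layer controller tracks poorly are discouraged without the planner needing an explicit dynamics model.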
