We introduce a new algorithm for solving unconstrained discrete-time optimal control problems. Our method follows a direct multiple shooting approach, and consists of applying the SQP method together with an $\ell_2$ augmented Lagrangian primal-dual merit function. We use the LQR algorithm to efficiently solve the primal-dual Newton-KKT system. As our algorithm is a specialization of NPSQP, it inherits the generic properties of NPSQP, including global convergence, fast local convergence, and no need for second-order corrections or dimension expansions, improving on existing direct multiple shooting approaches such as acados, ALTRO, GNMS, FATROP, and FDDP. The solutions of the LQR-shaped subproblems posed by our algorithm can be parallelized to run in time logarithmic in the number of stages, states, and controls. Moreover, as our method avoids sequential rollouts of the nonlinear dynamics, it can run in $O(1)$ parallel time per line search iteration. Therefore, this paper provides a practical, theoretically sound, and highly parallelizable (for example, with a GPU) method for solving nonlinear discrete-time optimal control problems. An open-source JAX implementation of this algorithm can be found on GitHub (joaospinto/primal_dual_ilqr).
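To make the multiple shooting idea concrete, the following is a minimal JAX sketch of how the per-stage dynamics defects and a generic $\ell_2$ augmented Lagrangian merit function could be evaluated for all stages at once, without a sequential rollout. It is not taken from the joaospinto/primal_dual_ilqr repository: the toy pendulum dynamics, the quadratic costs, the time step `DT`, the sign convention of the multiplier term, and all function names are assumptions made for illustration, and the initial-state constraint is omitted for brevity.

```python
import jax
import jax.numpy as jnp

DT = 0.1  # hypothetical discretization step (assumption)


def dynamics(x, u):
    """Toy pendulum dynamics with explicit Euler (illustrative assumption)."""
    theta, omega = x[0], x[1]
    return jnp.array([theta + DT * omega,
                      omega + DT * (u[0] - jnp.sin(theta))])


def stage_cost(x, u):
    """Hypothetical quadratic stage cost."""
    return 0.5 * jnp.dot(x, x) + 0.05 * jnp.dot(u, u)


def defects(X, U):
    """Shooting gaps c_i = f(x_i, u_i) - x_{i+1}, computed for every stage.

    Each stage only reads its own (x_i, u_i, x_{i+1}), so vmap can evaluate
    all stages independently instead of rolling the dynamics out in sequence.
    """
    return jax.vmap(dynamics)(X[:-1], U) - X[1:]


def merit(X, U, Y, rho):
    """Generic l2 augmented Lagrangian: cost + <Y, c> + (rho / 2) * ||c||^2."""
    c = defects(X, U)
    cost = jnp.sum(jax.vmap(stage_cost)(X[:-1], U)) + 0.5 * jnp.dot(X[-1], X[-1])
    return cost + jnp.sum(Y * c) + 0.5 * rho * jnp.sum(c * c)


# Usage: states X, controls U, and multipliers Y are all decision variables,
# so the merit function can be differentiated with respect to each of them.
N = 20
X = jnp.zeros((N + 1, 2))
U = jnp.zeros((N, 1))
Y = jnp.zeros((N, 2))
grad_X, grad_U, grad_Y = jax.grad(merit, argnums=(0, 1, 2))(X, U, Y, 10.0)
```

Because the states are independent variables rather than the output of a rollout, a line search step only needs to re-evaluate `merit` at the trial point, which is a batch of independent per-stage evaluations followed by reductions.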