This paper presents a mathematical formulation for the temporal parallelisation of continuous-time optimal control problems that can be solved via the Hamilton--Jacobi--Bellman (HJB) equation. We divide the time interval of the control problem into sub-intervals and define a control problem in each sub-interval, conditioned on the start and end states, which leads to conditional value functions for the sub-intervals. By defining an associative operator as the minimisation of the sum of conditional value functions, we obtain the elements and the associative operator for a parallel associative scan operation. This allows the optimal control problem on the whole time interval to be solved in parallel, with time complexity that is logarithmic in the number of sub-intervals. We derive HJB-type backward and forward equations for the conditional value functions and solve them in closed form for linear quadratic problems. We also discuss numerical methods for computing the conditional value functions. The computational advantages of the proposed parallel methods are demonstrated via simulations run on a multi-core central processing unit and a graphics processing unit.
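The combination rule described above can be illustrated with a minimal sketch. It assumes a discretised one-dimensional state grid, so that each conditional value function becomes a matrix indexed by start and end state, and the associative operator becomes a min-plus matrix product. The grid, the stage cost, and the names `GRID`, `DT`, `make_element`, `combine`, and `tree_reduce` are hypothetical choices made for illustration only; they are not the paper's implementation, and the tree reduction shown is only the building block of a full parallel scan.

```python
from functools import reduce

GRID = [-1.0, -0.5, 0.0, 0.5, 1.0]  # hypothetical discretised states
DT = 0.25                            # hypothetical sub-interval length


def make_element(k):
    """Conditional value function of sub-interval k as a matrix V[i][j]:
    an assumed cost of moving from GRID[i] to GRID[j] within the sub-interval."""
    q = 1.0 + 0.5 * k  # hypothetical time-varying state weight

    def cost(x, z):
        u = (z - x) / DT  # constant control that steers x to z
        return 0.5 * DT * (q * x * x + u * u)

    return [[cost(x, z) for z in GRID] for x in GRID]


def combine(V_a, V_b):
    """Associative operator: minimise the sum of two conditional value
    functions over the shared intermediate state (min-plus product)."""
    n = len(V_a)
    return [[min(V_a[i][k] + V_b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def tree_reduce(elems):
    """Pairwise reduction that preserves element order. The combines within
    one level are independent, so they could run in parallel."""
    levels = 0
    while len(elems) > 1:
        paired = [combine(elems[i], elems[i + 1])
                  for i in range(0, len(elems) - 1, 2)]
        if len(elems) % 2:
            paired.append(elems[-1])  # odd element carried to the next level
        elems = paired
        levels += 1
    return elems[0], levels


elements = [make_element(k) for k in range(8)]
sequential = reduce(combine, elements)          # left-to-right baseline
parallel, levels = tree_reduce(list(elements))  # same result, fewer levels
```

Because `combine` is associative, the pairwise reduction agrees with the left-to-right baseline up to floating-point rounding, while the eight sub-intervals are merged in three levels instead of seven sequential steps. In a framework such as JAX, an operator of this kind could be passed to `jax.lax.associative_scan` to obtain all partial combinations rather than only the final one.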