The linear-quadratic regulator (LQR) is a landmark problem in optimal control and the subject of this paper. LQR is generally classified into state-feedback LQR (SLQR) and output-feedback LQR (OLQR), according to whether the full state is available for feedback. The existing literature suggests that both SLQR and OLQR can be viewed as \textit{constrained nonconvex matrix optimization} problems in which the only optimization variable is the feedback gain matrix. In this paper, we introduce a first-order accelerated optimization framework for the LQR problem and give its convergence analysis for SLQR and OLQR separately. Specifically, we establish a Lipschitz Hessian property of the LQR performance criterion, which turns out to be crucial for applying modern optimization techniques. For the SLQR problem, we introduce a continuous-time hybrid dynamic system whose solution trajectory is shown to converge exponentially to the optimal feedback gain with the Nesterov-optimal order $1-\frac{1}{\sqrt{\kappa}}$ (where $\kappa$ is the condition number). We then discretize the hybrid dynamic system with the symplectic Euler scheme and propose a Nesterov-type method with a restarting rule that preserves the continuous-time convergence rate, i.e., the discretized algorithm attains the Nesterov-optimal convergence order. For the OLQR problem, we propose a Hessian-free accelerated framework, a two-procedure method consisting of semiconvex function optimization and negative curvature exploitation. The method finds an $\epsilon$-stationary point of the performance criterion in time $\mathcal{O}(\epsilon^{-7/4}\log(1/\epsilon))$, which improves upon the $\mathcal{O}(\epsilon^{-2})$ complexity of vanilla gradient descent. Moreover, the method provides a second-order guarantee for the stationary point it finds.
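To make the "optimize directly over the feedback gain" viewpoint concrete, the following is a minimal sketch of a Nesterov-type iteration with a function-value restart applied to a small SLQR instance. It is not the algorithm analyzed in the paper: the continuous-time model $\dot{x} = Ax + Bu$ with $u = -Kx$, the cost $f(K) = \mathrm{tr}(P_K \Sigma)$ with $\Sigma = I$, the double-integrator data, the step size, the restart test, and all function names are assumptions chosen for illustration. The gradient formula $\nabla f(K) = 2(RK - B^\top P_K)Y_K$ is the standard one for this cost, with $P_K$ and $Y_K$ obtained from two Lyapunov equations.

```python
import numpy as np

def solve_lyapunov(A, Q):
    """Solve A X + X A^T = Q via Kronecker products (fine for small n)."""
    n = A.shape[0]
    I = np.eye(n)
    M = np.kron(A, I) + np.kron(I, A)  # acts on the row-major vec of X
    return np.linalg.solve(M, Q.reshape(-1)).reshape(n, n)

def cost_and_grad(K, A, B, Q, R, Sigma):
    """Return f(K) = tr(P_K Sigma) and its gradient.

    Returns (inf, None) when A - B K is not Hurwitz, i.e. K lies
    outside the constraint set of stabilizing gains.
    """
    Acl = A - B @ K
    if np.max(np.linalg.eigvals(Acl).real) >= 0:
        return np.inf, None
    # Acl^T P + P Acl + Q + K^T R K = 0
    P = solve_lyapunov(Acl.T, -(Q + K.T @ R @ K))
    # Acl Y + Y Acl^T + Sigma = 0
    Y = solve_lyapunov(Acl, -Sigma)
    return float(np.trace(P @ Sigma)), 2.0 * (R @ K - B.T @ P) @ Y

def accelerated_lqr(K0, A, B, Q, R, Sigma, step=0.1, tol=1e-6, max_iter=5000):
    """Nesterov-type iteration with a function-value restart (illustrative)."""
    args = (A, B, Q, R, Sigma)
    K, K_prev, j = K0.copy(), K0.copy(), 0
    f, g = cost_and_grad(K, *args)
    for _ in range(max_iter):
        if np.linalg.norm(g) < tol:
            break
        # Momentum (look-ahead) point, then a gradient step from it.
        V = K + (j / (j + 3.0)) * (K - K_prev)
        fV, gV = cost_and_grad(V, *args)
        f_new, g_new = np.inf, None
        if gV is not None:
            K_new = V - step * gV
            f_new, g_new = cost_and_grad(K_new, *args)
        if f_new < f:
            j += 1  # accept the accelerated step
        else:
            # Restart: drop the momentum and take a plain gradient step
            # from K, halving the step until the cost decreases.
            j, eta = 0, step
            while True:
                K_new = K - eta * g
                f_new, g_new = cost_and_grad(K_new, *args)
                if f_new < f:
                    break
                eta *= 0.5
                if eta < 1e-16:  # no further decrease is resolvable
                    return K, f
        K_prev, K, f, g = K, K_new, f_new, g_new
    return K, f

# Double integrator with Q = I, R = 1 (assumed example data).
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
Q, R, Sigma = np.eye(2), np.eye(1), np.eye(2)
K0 = np.array([[2.0, 0.5]])  # a stabilizing initial gain

K_opt, f_opt = accelerated_lqr(K0, A, B, Q, R, Sigma)
print(K_opt, f_opt)
```

For this particular instance the algebraic Riccati equation can be solved by hand, giving $K^\star = [1, \sqrt{3}]$ and $f(K^\star) = 2\sqrt{3}$, which offers a simple check on the iterate returned by the sketch. The restart keeps every accepted iterate inside the set of stabilizing gains and makes the cost decrease monotonically. That is a convenient safeguard here, not a reproduction of the restarting rule proposed in the paper.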