Deep FlexQP: Accelerated Nonlinear Programming via Deep Unfolding

We propose FlexQP, an always-feasible convex quadratic programming (QP) solver based on an $\ell_1$ elastic relaxation of the QP constraints. If the original constraints are feasible, FlexQP provably recovers the optimal solution. If the constraints are infeasible, FlexQP identifies a solution that minimizes the constraint violation while keeping the set of violated constraints sparse. Such infeasibilities arise naturally in sequential quadratic programming (SQP) subproblems due to the linearization of the constraints. We prove the convergence of FlexQP under mild coercivity assumptions, making it robust to both feasible and infeasible QPs. We then apply deep unfolding to learn LSTM-based, dimension-agnostic feedback policies for the algorithm parameters, yielding an accelerated Deep FlexQP. To preserve the exactness guarantees of the relaxation, we propose a normalized training loss that incorporates the Lagrange multipliers. We additionally design a log-scaled loss for PAC-Bayes generalization bounds that yields substantially tighter performance certificates, which we use to construct an accelerated SQP solver with guaranteed QP subproblem performance. Deep FlexQP outperforms state-of-the-art learned QP solvers on a suite of benchmarks including portfolio optimization, classification, and regression problems, and scales to dense QPs with over 10k variables and constraints via fine-tuning. When deployed within SQP, our approach solves nonlinear trajectory optimization problems 4-16x faster than SQP with OSQP while substantially improving success rates. On predictive safety filter problems, Deep FlexQP reduces safety violations by over 70\% and increases task completion by 43\% compared to existing methods.
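
For concreteness, an $\ell_1$ elastic relaxation of a convex QP can be sketched as follows. The notation here ($P$, $q$, $G$, $h$, $A$, $b$, and the penalty weights $\mu_I$, $\mu_E$) is assumed for illustration and does not appear in the abstract; the exact formulation used by FlexQP may differ. Starting from a QP with inequality and equality constraints,

$$
\min_{x} \; \tfrac{1}{2} x^\top P x + q^\top x
\quad \text{s.t.} \quad G x \le h, \;\; A x = b,
$$

a standard $\ell_1$ penalty form moves the constraints into the objective:

$$
\min_{x} \; \tfrac{1}{2} x^\top P x + q^\top x
+ \mu_I \, \big\| \max(G x - h, \, 0) \big\|_1
+ \mu_E \, \big\| A x - b \big\|_1,
\qquad \mu_I, \mu_E > 0 .
$$

The relaxed problem has no hard constraints, so it is never infeasible. By the classical exact-penalty result for convex programs, if the original QP is feasible with optimal Lagrange multipliers $(\lambda^\star, \nu^\star)$ and the weights satisfy $\mu_I > \|\lambda^\star\|_\infty$ and $\mu_E > \|\nu^\star\|_\infty$, the minimizers of the relaxed problem coincide with those of the original QP. This dependence of the exactness threshold on the multipliers is one plausible reading of why the training loss described above incorporates them. The $\ell_1$ norm also tends to concentrate any residual violation on a few constraints, which is consistent with the sparsity property claimed for the infeasible case.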
