Scalable Deep Unfolding of Conic Optimizers

Deep unfolding (DU) accelerates iterative optimizers by introducing learnable components and training them through unrolled iterations, but extending DU to the large-scale semidefinite programs (SDPs) common in robotics has remained limited. Unrolling a full-update conic solver such as COSMO exposes two obstacles that prior work on learned conic solvers has not: backpropagating through the per-iteration linear-system solve incurs memory quadratic in the problem size once the coefficient matrix is formed explicitly, and backpropagating through the positive semidefinite (PSD) cone projection becomes numerically unstable when eigenvalues coincide. We address the first obstacle with a matrix-free implicit differentiation rule that operates entirely through matrix-vector products, reducing memory from $O(n^2)$ to $O(n)$ and enabling backpropagation at scales where direct factorization runs out of memory. We address the second with a backward rule based on the Dalečkii--Krein representation of the Fréchet derivative, which remains well-defined under repeated eigenvalues. Together these make it possible to learn lightweight hyperparameter policies and warm-starts for a full-update conic solver. We evaluate on nonlinear covariance steering problems solved via sequential convex programming (SCP), as well as standalone SDPs and second-order cone programs ranging from max-cut and Lovász $\vartheta$ SDPs to robust estimation and control problems. The learned policies outperform state-of-the-art solvers across all problems, and can provide up to a 50$\times$ speedup depending on the class. When used as a subroutine in SCP, the learned approach delivers over a 30$\times$ speedup compared to COSMO.
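The two backward rules described above can be illustrated with a minimal NumPy sketch. It is not the implementation evaluated here, and all function names (`psd_project`, `psd_project_vjp`, `cg_solve`, `linear_solve_vjp`) are hypothetical. The first pair of functions applies the Dalečkii--Krein form of the Fréchet derivative of the PSD projection, replacing the divided difference with the derivative of $\max(t, 0)$ wherever two eigenvalues (nearly) coincide; the threshold `tol` is an assumed choice. The second pair assumes the per-iteration system has been reduced to a symmetric positive definite operator $K$ and uses a hand-written conjugate gradient loop as a stand-in solver, so the backward pass only ever calls $v \mapsto Kv$ and stores $O(n)$ vectors.

```python
import numpy as np

def psd_project(X):
    """Project symmetric X onto the PSD cone; also return its eigendecomposition."""
    lam, Q = np.linalg.eigh(X)
    return (Q * np.maximum(lam, 0.0)) @ Q.T, lam, Q

def psd_project_vjp(lam, Q, G, tol=1e-9):
    """Apply the Frechet derivative of the PSD projection to the cotangent G.

    Daleckii-Krein form: D[G] = Q (Gamma * (Q^T G Q)) Q^T, where Gamma holds
    divided differences of f(t) = max(t, 0). For (near-)equal eigenvalues the
    0/0 quotient is replaced by f'(t). The derivative is self-adjoint, so the
    same formula serves as the vector-Jacobian product.
    """
    f = np.maximum(lam, 0.0)
    fprime = (lam > 0).astype(float)           # subgradient choice 0 at t = 0
    dlam = lam[:, None] - lam[None, :]
    df = f[:, None] - f[None, :]
    close = np.abs(dlam) <= tol                # assumed coincidence threshold
    safe = np.where(close, 1.0, dlam)          # avoid dividing by ~0
    fprime_avg = 0.5 * (fprime[:, None] + fprime[None, :])
    gamma = np.where(close, fprime_avg, df / safe)
    G = 0.5 * (G + G.T)
    return Q @ (gamma * (Q.T @ G @ Q)) @ Q.T

def cg_solve(matvec, b, iters=200, tol=1e-12):
    """Conjugate gradients for K x = b using only products v -> K v (K assumed SPD)."""
    x = np.zeros_like(b)
    r = b.copy()
    p = r.copy()
    rs = r @ r
    for _ in range(iters):
        if np.sqrt(rs) <= tol:
            break
        Kp = matvec(p)
        alpha = rs / (p @ Kp)
        x += alpha * p
        r -= alpha * Kp
        rs_new = r @ r
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x

def linear_solve_vjp(matvec, g):
    """Backward rule for x = K^{-1} b with symmetric K: dL/db = K^{-1} g.

    K is never formed or factorized; only matrix-vector products are used.
    """
    return cg_solve(matvec, g)
```

In an autodiff framework, functions like these would typically be registered as custom backward rules for the projection and the linear solve, so that the unrolled iterations never differentiate through the eigendecomposition or a stored factorization directly.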
