Discrete optimization problems are $\mathcal{NP}$-hard in general and arise in fields such as mixed-integer programming and combinatorial optimization. A standard approach to solving convex discrete optimization problems is the use of cutting-plane algorithms, which reach optimal solutions by iteratively adding inequalities, known as \textit{cuts}, that refine the feasible set. Despite the existence of several general-purpose cut-generating algorithms, large-scale discrete optimization problems often remain intractable. In this work, we propose a method for accelerating cutting-plane algorithms via reinforcement learning. Our approach uses learned policies as surrogates for $\mathcal{NP}$-hard elements of the cut-generating procedure in a way that (i) accelerates convergence and (ii) retains guarantees of optimality. We apply our method to two types of problems where cutting-plane algorithms are commonly used: stochastic optimization and mixed-integer quadratic programming. We observe benefits when the method is applied to Benders decomposition (stochastic optimization) and iterative loss approximation (quadratic programming), achieving up to $45\%$ faster average convergence than modern alternative algorithms.
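To make the cutting-plane idea concrete, the following is a minimal sketch of a Kelley-style cutting-plane loop on a toy one-dimensional convex integer problem. It is not the method proposed in this work: the function name `cutting_plane_minimize`, its parameters, and the brute-force master step are all illustrative assumptions. The brute-force enumeration stands in for the expensive discrete step that a real solver would perform.

```python
from typing import Callable, List, Tuple


def cutting_plane_minimize(
    f: Callable[[int], float],
    grad: Callable[[int], float],
    candidates: range,
    tol: float = 1e-9,
    max_iters: int = 100,
) -> Tuple[int, float, int]:
    """Minimize a convex f over a finite set of integers using gradient cuts.

    Hypothetical helper for illustration only. Returns the best point found,
    its objective value, and the number of iterations performed.
    """
    # Each cut (a, b) encodes the inequality t >= a * z + b.
    cuts: List[Tuple[float, float]] = []

    def model(z: int) -> float:
        # Piecewise-linear lower bound on f built from all cuts so far.
        return max(a * z + b for a, b in cuts)

    x = candidates[0]
    best_x, upper = x, f(x)
    for it in range(1, max_iters + 1):
        fx, g = f(x), grad(x)
        if fx < upper:
            best_x, upper = x, fx  # tighten the upper bound
        # Add the cut f(z) >= f(x) + g * (z - x), valid because f is convex.
        cuts.append((g, fx - g * x))
        # Master problem: minimize the lower model over the integer candidates.
        # Brute force is an assumption that only works for tiny toy instances.
        x = min(candidates, key=model)
        lower = model(x)
        if upper - lower <= tol:
            return best_x, upper, it  # bounds meet: best_x is optimal
    return best_x, upper, max_iters


# Toy instance (assumed for illustration): minimize (x - 3.4)^2 over integers in [-10, 10].
best_x, best_val, iters = cutting_plane_minimize(
    f=lambda x: (x - 3.4) ** 2,
    grad=lambda x: 2.0 * (x - 3.4),
    candidates=range(-10, 11),
)
print(best_x, best_val, iters)
```

The loop stops only when the lower bound from the cuts meets the best objective value seen, which is what preserves the optimality guarantee. Any heuristic that changes how cuts or candidate points are chosen leaves that stopping test intact, so it can affect speed but not correctness.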