Sequentially solving similar optimization problems under strict runtime constraints is essential for many applications, such as robot control, autonomous driving, and portfolio management. The performance of local optimization methods in these settings is sensitive to the initial solution: poor initialization can lead to slow convergence or suboptimal solutions. To address this challenge, we propose learning to predict \emph{multiple} diverse initial solutions given parameters that define the problem instance. We introduce two strategies for utilizing multiple initial solutions: (i) a single-optimizer approach, where the most promising initial solution is chosen using a selection function, and (ii) a multiple-optimizers approach, where several optimizers, potentially run in parallel, are each initialized with a different solution, with the best solution chosen afterward. We validate our method on three optimal control benchmark tasks: cart-pole, reacher, and autonomous driving, using different optimizers: DDP, MPPI, and iLQR. We find significant and consistent improvement with our method across all evaluation settings and demonstrate that it efficiently scales with the number of initial solutions required. The code is available at $\href{https://github.com/EladSharony/miso}{\tt{https://github.com/EladSharony/miso}}$.
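The two strategies for using multiple initial solutions can be illustrated with a minimal sketch on a toy problem. Everything here is an assumption made for illustration and is not taken from the released code: the one-dimensional non-convex objective `cost`, plain gradient descent standing in for a local optimizer such as DDP or iLQR, the fixed spread of guesses in `predict_initial_solutions` standing in for the learned predictor, and the use of the objective itself as the selection function.

```python
import numpy as np

def cost(x, p):
    """Toy non-convex objective with two local minima; p tilts the landscape."""
    return (x**2 - 1.0) ** 2 + p * x

def grad(x, p):
    """Analytic derivative of cost with respect to x."""
    return 4.0 * x * (x**2 - 1.0) + p

def local_optimize(x0, p, steps=50, lr=0.05):
    """Plain gradient descent, a stand-in for a local optimizer."""
    x = float(x0)
    for _ in range(steps):
        x -= lr * grad(x, p)
    return x

def predict_initial_solutions(p, k=4):
    """Hypothetical stand-in for the learned predictor: k spread-out guesses.
    A trained model would condition these guesses on the problem parameter p."""
    return np.linspace(-1.5, 1.5, k)

def single_optimizer(p, k=4):
    """Strategy (i): select the most promising guess, then optimize once."""
    candidates = predict_initial_solutions(p, k)
    best_init = min(candidates, key=lambda x0: cost(x0, p))
    return local_optimize(best_init, p)

def multiple_optimizers(p, k=4):
    """Strategy (ii): optimize from every guess, then keep the best result.
    The loop is sequential here; the runs are independent and could be parallel."""
    candidates = predict_initial_solutions(p, k)
    solutions = [local_optimize(x0, p) for x0 in candidates]
    return min(solutions, key=lambda x: cost(x, p))

if __name__ == "__main__":
    p = 0.3
    for name, x in [("single initialization at 1.5", local_optimize(1.5, p)),
                    ("single-optimizer", single_optimizer(p)),
                    ("multiple-optimizers", multiple_optimizers(p))]:
        print(f"{name}: x = {x:.4f}, cost = {cost(x, p):.4f}")
```

In this toy setting, a single poor initialization settles in the worse of the two local minima, while both multi-initialization strategies reach the better one. The single-optimizer variant pays for one optimization run plus k selection-function evaluations, whereas the multiple-optimizers variant pays for k runs.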