Future spacecraft and surface robotic missions require increasingly capable autonomy stacks for exploring challenging and unstructured domains, and trajectory optimization will be a cornerstone of such stacks. However, the nonlinear optimization solvers these problems require remain too slow for use on resource-constrained, flight-grade computers. In this work, we turn to amortized optimization, a learning-based technique for accelerating optimization run times, and present TOAST: Trajectory Optimization with Merit Function Warm Starts. Offline, using data collected from simulation, we train a neural network to map the problem parameters to the full primal and dual solutions. Crucially, we build upon recent results from decision-focused learning and present a set of decision-focused loss functions based on the notion of merit functions for optimization problems. We show that networks trained with such constraint-informed losses better encode the structure of the trajectory optimization problem, jointly learning to reconstruct the primal-dual solution while also yielding improved constraint satisfaction. Through numerical experiments on a Lunar rover problem, we demonstrate that TOAST outperforms benchmark approaches in both computation time and the constraint satisfaction of the network predictions.
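To make the idea of a merit-function loss concrete, the following is a minimal sketch on a toy problem, not the formulation used in TOAST. It assumes a hypothetical equality-constrained quadratic program and uses the squared KKT residual as the merit function, which is one common choice; the problem, the helper names (`dot`, `kkt_merit`, `exact_solution`), and the choice of merit function are all illustrative assumptions. The point it shows is that such a loss scores a predicted primal-dual pair directly against the optimality and feasibility conditions, rather than only against a reference solution.

```python
# Hypothetical toy problem (an assumption, not taken from the paper):
#   minimize 0.5 * ||x||^2   subject to   a . x = b
# Problem parameters are (a, b); x is the primal variable and lam the
# dual variable (Lagrange multiplier) of the equality constraint.


def dot(u, v):
    """Inner product of two equal-length lists of floats."""
    return sum(ui * vi for ui, vi in zip(u, v))


def kkt_merit(x, lam, a, b):
    """Squared KKT residual; it is zero exactly at a primal-dual solution."""
    # Stationarity: gradient of the Lagrangian in x is x + lam * a.
    stationarity = [xi + lam * ai for xi, ai in zip(x, a)]
    # Primal feasibility: residual of the constraint a . x = b.
    feasibility = dot(a, x) - b
    return dot(stationarity, stationarity) + feasibility ** 2


def exact_solution(a, b):
    """Closed-form primal-dual solution of the toy problem."""
    lam = -b / dot(a, a)
    x = [-lam * ai for ai in a]
    return x, lam


if __name__ == "__main__":
    a, b = [1.0, 2.0], 5.0
    x_star, lam_star = exact_solution(a, b)
    # Merit of the exact solution versus a perturbed "prediction".
    print(kkt_merit(x_star, lam_star, a, b))
    print(kkt_merit([1.5, 2.0], -1.0, a, b))
```

In a learning setting, a network would output `x` and `lam` from the parameters `(a, b)`, and a term like `kkt_merit` could be added to or used in place of a plain regression loss; how the paper weights or combines such terms is not specified here.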