Transformers for Trajectory Optimization with Application to Spacecraft Rendezvous

Reliable and efficient trajectory optimization methods are a fundamental need for autonomous dynamical systems, enabling applications such as rocket landing, hypersonic reentry, spacecraft rendezvous, and docking. Within such safety-critical application areas, the complexity of the emerging trajectory optimization problems has motivated the application of AI-based techniques to enhance the performance of traditional approaches. However, current AI-based methods either attempt to fully replace traditional control algorithms, thus lacking constraint satisfaction guarantees and incurring expensive simulation, or aim solely to imitate the behavior of traditional methods via supervised learning. To address these limitations, this paper proposes the Autonomous Rendezvous Transformer (ART) and assesses the capability of modern generative models to solve complex trajectory optimization problems, from both a forecasting and a control standpoint. Specifically, this work assesses the capabilities of Transformers to (i) learn near-optimal policies from previously collected data, and (ii) warm-start a sequential optimizer for the solution of non-convex optimal control problems, thus guaranteeing hard constraint satisfaction. From a forecasting perspective, results highlight how ART outperforms other learning-based architectures at predicting known fuel-optimal trajectories. From a control perspective, empirical analyses show how policies learned through Transformers are able to generate near-optimal warm-starts, achieving trajectories that are (i) more fuel-efficient, (ii) obtained in fewer sequential optimizer iterations, and (iii) computed with an overall runtime comparable to benchmarks based on convex optimization.
