Real-time trajectory optimization for nonlinear constrained autonomous systems is critical, yet it is typically performed by sequential CPU-based solvers. Their reliance on global sparse linear algebra, or the inherently serial nature of dynamic programming, limits the use of massively parallel computing architectures such as GPUs. To bridge this gap, we introduce a fully GPU-native trajectory optimization framework that combines sequential convex programming with a consensus-based alternating direction method of multipliers (ADMM). By applying a temporal splitting strategy, our algorithm decouples the optimization horizon into independent per-node subproblems that execute in parallel. The entire process runs on the GPU, eliminating costly memory transfers and large-scale sparse factorizations. This architecture naturally scales to multi-trajectory optimization. We validate the solver on a quadrotor agile flight task and a Mars powered descent problem using an on-board edge computing platform. Benchmarks show a sustained 4x throughput speedup and a 51% reduction in energy consumption over a heavily optimized 12-core CPU baseline. Crucially, the framework saturates the hardware, maintaining over 96% active GPU utilization and achieving planning rates exceeding 100 Hz. Furthermore, we demonstrate the solver's extensibility to robust Model Predictive Control by jointly optimizing dynamically coupled scenarios under stochastic disturbances, enabling scalable and safe autonomy.
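To make the temporal splitting idea concrete, the following is a minimal sketch of consensus ADMM on a toy problem. It is not the authors' implementation. It assumes a scalar system with already-linear dynamics x[k+1] = a*x[k] + b*u[k] (so the sequential convex programming outer loop is omitted) and a quadratic cost. The function name `admm_temporal_split` and all parameters are hypothetical. Each node k keeps local copies `s[k]` and `t[k]` of the states x[k] and x[k+1], and the consensus variable `z` forces neighboring nodes to agree. The per-node loops have no cross-node dependencies, which is the property that would let each node map to its own GPU thread; plain Python loops stand in for that here.

```python
def admm_temporal_split(x0, a=1.0, b=0.5, q=1.0, r=0.1, N=10,
                        rho=1.0, iters=2000):
    """Toy consensus ADMM with the horizon split into N per-node subproblems.

    Minimizes sum_k q*x[k+1]**2 + r*u[k]**2
    subject to x[k+1] = a*x[k] + b*u[k] and x[0] = x0.
    All names and defaults are illustrative assumptions.
    """
    # Node k owns local copies (s, t) of (x[k], x[k+1]); the control is
    # eliminated through the dynamics: u = (t - a*s) / b.
    c = 2.0 * r / (b * b)
    # Entries of the 2x2 Hessian of each node's augmented subproblem
    # (identical for every node in this toy, so it is inverted once).
    h11 = c * a * a + rho
    h12 = -c * a
    h22 = 2.0 * q + c + rho
    det = h11 * h22 - h12 * h12

    z = [x0] * (N + 1)   # consensus states shared between neighbors
    s = [x0] * N         # local copy of x[k] held by node k
    t = [x0] * N         # local copy of x[k+1] held by node k
    lam = [0.0] * N      # scaled dual for the constraint s[k] = z[k]
    mu = [0.0] * N       # scaled dual for the constraint t[k] = z[k+1]

    for _ in range(iters):
        # 1) Local step: every node solves its own small QP in closed form.
        #    Iterations are independent (one GPU thread per node in spirit).
        for k in range(N):
            p = rho * (z[k] - lam[k])
            w = rho * (z[k + 1] - mu[k])
            s[k] = (h22 * p - h12 * w) / det
            t[k] = (h11 * w - h12 * p) / det

        # 2) Consensus step: average the two copies of each interior state.
        z[0] = x0  # the initial state is fixed
        for k in range(1, N):
            z[k] = 0.5 * (t[k - 1] + mu[k - 1] + s[k] + lam[k])
        z[N] = t[N - 1] + mu[N - 1]  # the final state has a single owner

        # 3) Dual step: accumulate the remaining disagreement.
        for k in range(N):
            lam[k] += s[k] - z[k]
            mu[k] += t[k] - z[k + 1]

    # Recover the controls from each node's local copies.
    u = [(t[k] - a * s[k]) / b for k in range(N)]
    return z, u


if __name__ == "__main__":
    states, controls = admm_temporal_split(x0=1.0)
    print("states:  ", [round(v, 4) for v in states])
    print("controls:", [round(v, 4) for v in controls])
```

Note that no step requires a factorization spanning the whole horizon: the only linear solve is a fixed 2x2 system per node, and the coupling between nodes is handled by the averaging and dual updates. In a nonlinear setting, one would presumably relinearize the dynamics around the current trajectory and repeat, which this sketch leaves out.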