While Model Predictive Control (MPC) delivers strong performance across robotics applications, solving the underlying nonlinear trajectory optimization (TO) problems online, whether singly or in batches, remains computationally demanding. Existing GPU-accelerated approaches parallelize single solves, handle large batches at sub-real-time rates, or sacrifice model generality for speed. This leaves a large gap in solver performance for the many state-of-the-art MPC applications that require real-time solution of batches of tens to low hundreds of problems. To close this gap, we present GATO, an open-source, GPU-accelerated, batched TO solver co-designed across algorithm, software, and computational hardware to deliver real-time throughput in this moderate-batch-size regime. Our approach combines block-, warp-, and thread-level parallelism within and across solves. We demonstrate its effectiveness through: simulated benchmarks showing speedups of 18-21x over CPU baselines and 1.4-16x over GPU baselines as batch size increases; case studies highlighting improved disturbance rejection and convergence behavior; and a validation on hardware using an industrial manipulator. We open-source GATO to support reproducibility and adoption.