We present Caspar, a library that makes the power of modern GPUs more accessible in robotics and provides a state-of-the-art nonlinear GPU solver applicable to a wide range of optimization problems. Caspar bridges the gap between expressive symbolic programming in Python and high-performance GPU runtimes in C++ by automatically generating optimized CUDA kernels from symbolic expressions. Building on the SymForce library, users can define and combine symbolic expressions, including Lie group operations, to generate custom CUDA kernels. To use Caspar as a solver, users need only define the symbolic residual functions; Caspar then uses symbolic differentiation to generate the GPU kernels and interfaces required for nonlinear optimization. In this paper, we present the core components of Caspar and evaluate its performance on bundle adjustment using the Bundle Adjustment in the Large (BAL) dataset. We benchmark Caspar against other state-of-the-art bundle adjusters and show that it is 5 to 20 times faster than the best alternative, requires less memory, and achieves similar accuracy. This illustrates the benefit of our symbolic GPU programming approach. Caspar is released as part of SymForce and is freely available at https://github.com/symforce-org/symforce
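To make the idea of generating kernels from symbolic residuals concrete, the following is a minimal, self-contained sketch in plain Python. It is not Caspar's or SymForce's API: the `Expr` type, the helpers `diff`, `simplify`, `emit`, and `generate_kernel`, the kernel name `quad_fit`, and the example residual are all hypothetical and chosen for illustration only. The sketch assumes a scalar residual built from additions and multiplications, and it assumes every symbol is stored as one array element per residual, which a real solver would replace with indexed access to shared parameters.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Expr:
    """Tiny expression node (hypothetical; stands in for a real symbolic library)."""
    op: str              # one of "var", "const", "add", "mul"
    args: tuple = ()
    name: str = ""
    value: float = 0.0


def var(name):
    return Expr("var", name=name)


def const(v):
    return Expr("const", value=float(v))


def add(a, b):
    return Expr("add", (a, b))


def mul(a, b):
    return Expr("mul", (a, b))


def diff(e, wrt):
    """Symbolic derivative of e with respect to the variable named wrt."""
    if e.op == "const":
        return const(0)
    if e.op == "var":
        return const(1 if e.name == wrt else 0)
    a, b = e.args
    if e.op == "add":
        return add(diff(a, wrt), diff(b, wrt))
    # Product rule for "mul".
    return add(mul(diff(a, wrt), b), mul(a, diff(b, wrt)))


def simplify(e):
    """Fold the trivial 0 and 1 terms that differentiation leaves behind."""
    if e.op in ("var", "const"):
        return e
    a, b = (simplify(x) for x in e.args)
    zero, one = const(0), const(1)
    if e.op == "add":
        if a == zero:
            return b
        if b == zero:
            return a
        return add(a, b)
    if zero in (a, b):
        return zero
    if a == one:
        return b
    if b == one:
        return a
    return mul(a, b)


def emit(e):
    """Render an expression as a CUDA C expression for thread index i."""
    if e.op == "const":
        return f"{e.value!r}f"
    if e.op == "var":
        return f"{e.name}[i]"
    a, b = (emit(x) for x in e.args)
    symbol = "+" if e.op == "add" else "*"
    return f"({a} {symbol} {b})"


def generate_kernel(name, residual, params, inputs):
    """Return CUDA source computing the residual and its Jacobian entries."""
    args = [f"const float* {s}" for s in inputs + params]
    args += ["float* r"] + [f"float* J_{p}" for p in params] + ["int n"]
    arg_list = ", ".join(args)
    body = [f"r[i] = {emit(residual)};"]
    body += [f"J_{p}[i] = {emit(simplify(diff(residual, p)))};" for p in params]
    lines = [
        f'extern "C" __global__ void {name}({arg_list}) {{',
        "  int i = blockIdx.x * blockDim.x + threadIdx.x;",
        "  if (i >= n) return;",
    ]
    lines += ["  " + stmt for stmt in body] + ["}"]
    return "\n".join(lines)


# Example residual (an assumption for illustration): r = a*x^2 + b*x - y.
a, b, x, y = var("a"), var("b"), var("x"), var("y")
residual = add(add(mul(a, mul(x, x)), mul(b, x)), mul(const(-1), y))
kernel_src = generate_kernel("quad_fit", residual, ["a", "b"], ["x", "y"])
print(kernel_src)
```

The user writes only the residual; the Jacobian entries `J_a` and `J_b` are derived symbolically and emitted alongside it in a single kernel, so no derivative code is written by hand. The resulting string is CUDA source that would still need to be compiled and launched by a separate runtime, which this sketch does not cover.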