Effective robot autonomy requires motion generation that is safe, feasible, and reactive. Current methods are fragmented: fast planners output physically unexecutable trajectories, reactive controllers struggle with high-fidelity perception, and existing solvers fail on high-DoF systems. We present cuRoboV2, a unified framework with three key innovations: (1) B-spline trajectory optimization that enforces smoothness and torque limits; (2) a GPU-native TSDF/ESDF perception pipeline that generates dense signed distance fields covering the full workspace, unlike existing methods that only provide distances within sparsely allocated blocks, running up to 10x faster with 8x less memory than the state of the art at manipulation scale, with up to 99% collision recall; and (3) scalable GPU-native whole-body computation, namely topology-aware kinematics, differentiable inverse dynamics, and map-reduce self-collision checking, that achieves up to 61x speedup while also extending to high-DoF humanoids (where previous GPU implementations fail). On benchmarks, cuRoboV2 achieves 99.7% success under a 3 kg payload (where baselines achieve only 72--77%), 99.6% collision-free IK on a 48-DoF humanoid (where prior methods fail entirely), and 89.5% retargeting constraint satisfaction (vs. 61% for PyRoki); these collision-free motions yield locomotion policies with 21% lower tracking error than PyRoki and 12x lower cross-seed variance than mink. A ground-up codebase redesign for discoverability enabled LLM coding assistants to author up to 73% of new modules, including hand-optimized CUDA kernels, demonstrating that well-structured robotics code can unlock productive human--LLM collaboration. Together, these advances provide a unified, dynamics-aware motion generation stack that scales from single-arm manipulators to full humanoids.
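To illustrate why B-spline parameterization makes smoothness and limit enforcement tractable (innovation 1 above), here is a minimal, hypothetical sketch for a uniform cubic B-spline in one joint dimension. It is not the cuRoboV2 implementation: the function names and the toy control points are invented for illustration, and actual torque limits would additionally require inverse dynamics. The key property shown is that the derivative of a uniform B-spline is itself a B-spline whose control points are scaled differences of the original ones, so velocity limits reduce to box constraints on those derived control points (via the convex-hull property).

```python
import numpy as np

# Standard basis matrix for a uniform cubic B-spline segment.
M = np.array([[-1.0,  3.0, -3.0, 1.0],
              [ 3.0, -6.0,  3.0, 0.0],
              [-3.0,  0.0,  3.0, 0.0],
              [ 1.0,  4.0,  1.0, 0.0]]) / 6.0

def bspline_eval(ctrl, i, u):
    """Position on segment i at local parameter u in [0, 1]."""
    U = np.array([u**3, u**2, u, 1.0])
    return U @ M @ ctrl[i:i + 4]

def derivative_ctrl(ctrl, dt):
    """Control points of the derivative spline (uniform knot spacing dt).

    By the convex-hull property, the peak joint velocity over the whole
    trajectory is bounded by max |(P_{k+1} - P_k) / dt|, so a velocity
    limit becomes a simple box constraint on these derived points.
    """
    return np.diff(ctrl, axis=0) / dt

# Toy 1-DoF trajectory with five control points.
ctrl = np.array([0.0, 0.2, 0.8, 1.0, 1.0])
v_ctrl = derivative_ctrl(ctrl, dt=0.1)
print(v_ctrl.max())  # conservative bound on peak joint velocity

# C2 continuity: segment 0 at u=1 meets segment 1 at u=0.
print(bspline_eval(ctrl, 0, 1.0), bspline_eval(ctrl, 1, 0.0))
```

The same structure extends to acceleration and jerk by differencing again, which is one reason spline parameterizations are attractive for dynamics-aware trajectory optimization.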