We introduce RoboBrain 2.5, a next-generation embodied AI foundation model that advances general perception, spatial reasoning, and temporal modeling through extensive training on high-quality spatiotemporal supervision. Building on its predecessor, RoboBrain 2.5 introduces two major capability upgrades. First, it unlocks Precise 3D Spatial Reasoning by shifting from 2D pixel-relative grounding to depth-aware coordinate prediction and comprehension of absolute metric constraints, generating complete 3D manipulation traces as ordered keypoint sequences under physical constraints. Second, complementing this spatial precision, the model establishes Dense Temporal Value Estimation, which provides step-aware progress prediction and execution-state understanding across varying viewpoints, producing stable feedback signals for downstream learning. Together, these upgrades extend the framework toward more physically grounded and execution-aware embodied intelligence for complex, fine-grained manipulation. The code and checkpoints are available at the project website: https://superrobobrain.github.io
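To make the shift from 2D pixel-relative grounding to depth-aware coordinates concrete, the following is a minimal sketch of how an ordered keypoint sequence could be lifted into metric 3D space. It is not the RoboBrain 2.5 implementation. It assumes, purely for illustration, that each keypoint is given as a pixel location plus a metric depth, and that a standard pinhole camera with known intrinsics (`fx`, `fy`, `cx`, `cy`) applies. The names `Keypoint`, `backproject`, `lift_trace`, and `path_length` are hypothetical.

```python
import math
from typing import List, Tuple

# Hypothetical keypoint format (an assumption, not the model's documented output):
# (u, v) pixel coordinates plus metric depth in meters.
Keypoint = Tuple[float, float, float]
Point3D = Tuple[float, float, float]


def backproject(kp: Keypoint, fx: float, fy: float, cx: float, cy: float) -> Point3D:
    """Lift one (u, v, depth) keypoint to camera-frame XYZ with the pinhole model."""
    u, v, depth = kp
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return (x, y, depth)


def lift_trace(trace: List[Keypoint], fx: float, fy: float,
               cx: float, cy: float) -> List[Point3D]:
    """Convert an ordered keypoint sequence into an ordered 3D trace."""
    return [backproject(kp, fx, fy, cx, cy) for kp in trace]


def path_length(points: List[Point3D]) -> float:
    """Total metric length of the trace, summed over consecutive keypoints."""
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))


if __name__ == "__main__":
    # Illustrative intrinsics and keypoints only; these are not values from the paper.
    trace_2d = [(320.0, 240.0, 1.0), (420.0, 240.0, 2.0)]
    trace_3d = lift_trace(trace_2d, fx=500.0, fy=500.0, cx=320.0, cy=240.0)
    print(trace_3d)
    print(path_length(trace_3d))
```

Once the trace is expressed in meters, absolute metric constraints (for example, a maximum travel distance) can be checked directly on the 3D points, which pixel-relative coordinates alone do not allow.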