Fiducial Exoskeletons: Image-Centric Robot State Estimation

We introduce Fiducial Exoskeletons, an image-based reformulation of 3D robot state estimation that replaces cumbersome procedures and motor-centric pipelines with single-image inference. Traditional approaches, especially for robot-camera extrinsic estimation, often rely on high-precision actuators and require time-consuming routines such as hand-eye calibration. In contrast, modern learning-based robot control is increasingly trained and deployed from RGB observations on lower-cost hardware. Our key insight is twofold. First, we cast robot state estimation as 6D pose estimation of each link from a single RGB image: the robot-camera base transform is obtained directly as the estimated base-link pose, and the joint state is recovered via a lightweight global optimization that enforces kinematic consistency with the observed link poses (optionally warm-started with encoder readings). Second, we make per-link 6D pose estimation robust and simple, even without learning, by introducing the fiducial exoskeleton: a lightweight 3D-printed mount that places a fiducial marker on each link with known marker-link geometry. This design yields camera-robot extrinsics, per-link SE(3) poses, and joint-angle state from a single image, enabling reliable state estimation even on unplugged robots. Demonstrated on a low-cost robot arm, fiducial exoskeletons substantially simplify setup while improving calibration, state accuracy, and downstream 3D control performance. We release code and printable hardware designs to enable further algorithm-hardware co-design.
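
The two steps described in the abstract can be illustrated with a minimal sketch. It is not the released implementation: the two-joint toy arm, its link lengths, the marker mount offset, and every function name below are hypothetical, marker poses are assumed to be already detected (here they are simulated), and a small local Gauss-Newton fit stands in for the paper's optimization, whose exact form the abstract does not specify.

```python
import numpy as np

def rot_z(theta):
    """4x4 homogeneous rotation about the z axis."""
    c, s = np.cos(theta), np.sin(theta)
    T = np.eye(4)
    T[:2, :2] = [[c, -s], [s, c]]
    return T

def translate(x, y, z):
    """4x4 homogeneous translation."""
    T = np.eye(4)
    T[:3, 3] = [x, y, z]
    return T

LINK_LENGTHS = [0.10, 0.08]  # hypothetical toy arm, in metres

def forward_kinematics(q):
    """Poses of [base, link 1, link 2] in the base frame (toy model, z-axis joints)."""
    poses = [np.eye(4)]
    T = np.eye(4)
    for angle, length in zip(q, LINK_LENGTHS):
        T = T @ rot_z(angle) @ translate(length, 0.0, 0.0)
        poses.append(T)
    return poses

def link_pose_from_marker(T_cam_marker, T_link_marker):
    """Camera-frame link pose from a detected marker pose and the known mount geometry."""
    return T_cam_marker @ np.linalg.inv(T_link_marker)

def residual(q, observed):
    """Stacked difference between predicted and observed base-frame link poses."""
    predicted = forward_kinematics(q)[1:]
    return np.concatenate([(P - O)[:3, :].ravel() for P, O in zip(predicted, observed)])

def estimate_joints(observed, q0, iters=20, eps=1e-6):
    """Gauss-Newton fit of joint angles; q0 could be an encoder reading (warm start)."""
    q = np.asarray(q0, dtype=float)
    for _ in range(iters):
        r = residual(q, observed)
        # Forward-difference Jacobian, one column per joint.
        J = np.column_stack([(residual(q + eps * e, observed) - r) / eps
                             for e in np.eye(len(q))])
        q = q + np.linalg.lstsq(J, -r, rcond=None)[0]
    return q

# Simulated detections: the ground truth below exists only to synthesize marker poses.
q_true = np.array([0.4, -0.7])
T_cam_base_true = translate(0.05, -0.02, 0.60) @ rot_z(0.3)
T_link_marker = translate(0.0, 0.0, 0.02)  # assumed identical mount offset on every link
T_cam_markers = [T_cam_base_true @ T @ T_link_marker for T in forward_kinematics(q_true)]

# Step 1: per-link camera-frame poses; the base-link pose is the camera-robot extrinsic.
T_cam_links = [link_pose_from_marker(T, T_link_marker) for T in T_cam_markers]
T_cam_base_est = T_cam_links[0]

# Step 2: express the other links in the base frame and fit the joint angles.
observed = [np.linalg.inv(T_cam_base_est) @ T for T in T_cam_links[1:]]
q_est = estimate_joints(observed, q0=[0.0, 0.0])
print(np.round(q_est, 4))
```

With real detections the observed poses are noisy, so the fit would return the angles that best agree with all links at once rather than an exact match; the matrix-difference residual used here is a simple choice for the sketch, and a proper SE(3) error with per-marker weighting may be preferable in practice.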
