RAPT: Model-Predictive Out-of-Distribution Detection and Failure Diagnosis for Sim-to-Real Humanoid Robots

Deploying learned control policies on humanoid robots is challenging: policies that appear robust in simulation can execute confidently in out-of-distribution (OOD) states after Sim-to-Real transfer, leading to silent failures that risk hardware damage. Although anomaly detection can mitigate these failures, prior methods are often incompatible with high-rate control, are poorly calibrated at the extremely low false-positive rates required for practical deployment, or operate as black boxes that provide a binary stop signal without explaining why the robot drifted from nominal behavior. We present RAPT, a lightweight, self-supervised deployment-time monitor for 50 Hz humanoid control. RAPT learns a probabilistic spatio-temporal manifold of nominal execution from simulation and evaluates execution-time predictive deviation as a calibrated, per-dimension signal. This yields (i) reliable online OOD detection under strict false-positive constraints and (ii) a continuous, interpretable measure of Sim-to-Real mismatch that can be tracked over time to quantify how far deployment has drifted from training. Beyond detection, we introduce an automated post-hoc root-cause analysis pipeline that combines gradient-based temporal saliency derived from RAPT's reconstruction objective with LLM-based reasoning conditioned on saliency and joint kinematics to produce semantic failure diagnoses in a zero-shot setting. We evaluate RAPT on a Unitree G1 humanoid across four complex tasks in simulation and on physical hardware. In large-scale simulation, RAPT improves True Positive Rate (TPR) by 37% over the strongest baseline at a fixed episode-level false positive rate of 0.5%. On real-world deployments, RAPT achieves a 12.5% TPR improvement and provides actionable interpretability, reaching 75% root-cause classification accuracy across 16 real-world failures using only proprioceptive data.
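
To make the evaluation criterion concrete, the following is a minimal sketch of how a detection threshold could be calibrated to a fixed episode-level false positive rate and then used to measure TPR. It is not the authors' implementation: the function names, the rule that an episode alarms when any per-step score exceeds the threshold, and the synthetic Gaussian scores are all assumptions made for illustration.

```python
import math
import random


def calibrate_episode_threshold(nominal_episodes, target_fpr=0.005):
    """Return a threshold so at most `target_fpr` of nominal episodes alarm.

    `nominal_episodes` is a list of per-step anomaly-score sequences, one per
    nominal (failure-free) episode. Assumption: an episode alarms if any step
    score is strictly greater than the threshold, so we calibrate on the
    per-episode maximum score.
    """
    episode_max = sorted(max(ep) for ep in nominal_episodes)
    n = len(episode_max)
    # Number of nominal episodes we can afford to flag.
    allowed = min(math.floor(target_fpr * n), n - 1)
    # Only the `allowed` largest maxima can lie strictly above this value.
    return episode_max[n - 1 - allowed]


def episode_alarm_rate(episodes, threshold):
    """Fraction of episodes whose maximum step score exceeds the threshold.

    On nominal episodes this is the episode-level FPR; on failure episodes
    it is the episode-level TPR.
    """
    alarms = sum(1 for ep in episodes if max(ep) > threshold)
    return alarms / len(episodes)


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic stand-ins for monitor scores (hypothetical data).
    nominal = [[rng.gauss(0.0, 1.0) for _ in range(50)] for _ in range(1000)]
    failures = [[rng.gauss(4.0, 1.0) for _ in range(50)] for _ in range(100)]

    threshold = calibrate_episode_threshold(nominal, target_fpr=0.005)
    print("threshold:", threshold)
    print("episode-level FPR:", episode_alarm_rate(nominal, threshold))
    print("episode-level TPR:", episode_alarm_rate(failures, threshold))
```

Calibrating on per-episode maxima rather than on individual steps matters at a 50 Hz control rate: a per-step quantile would let a long episode accumulate many chances to alarm, so its episode-level false positive rate would exceed the target.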
