Contact forces introduce discontinuities into robot dynamics that severely limit the use of simulators for gradient-based optimization. Penalty-based simulators such as MuJoCo soften contact resolution to enable gradient computation. However, realistically simulating hard contacts requires stiff solver settings, which lead to incorrect simulator gradients under automatic differentiation. Conversely, non-stiff settings substantially widen the sim-to-real gap. We analyze penalty-based simulators to pinpoint why gradients degrade under hard contacts. Building on these insights, we propose DiffMJX, which couples adaptive time integration with penalty-based simulation to substantially improve gradient accuracy. A second challenge is that contact gradients vanish when bodies are separated. To address this, we introduce contacts from distance (CFD), which combines penalty-based simulation with straight-through estimation. By applying CFD exclusively in the backward pass, we obtain informative pre-contact gradients while retaining physical realism.
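The straight-through idea behind CFD can be illustrated with a minimal JAX sketch. This is not the DiffMJX implementation or any MuJoCo/MJX API: the functions `hard_force`, `soft_force`, and `cfd_force`, the stiffness `k`, the smoothing `width`, and the softplus surrogate are all hypothetical choices made for illustration. The sketch assumes a one-dimensional signed distance `d` (positive when bodies are separated) and shows only the mechanism: the forward pass returns the unmodified penalty force, while the backward pass uses the gradient of a surrogate that is nonzero at a distance.

```python
import jax
import jax.numpy as jnp


def hard_force(d, k=1e4):
    # Hypothetical penalty force: active only under penetration (d < 0).
    # Its gradient with respect to d is exactly zero whenever d > 0.
    return k * jnp.maximum(-d, 0.0)


def soft_force(d, k=1e4, width=0.1):
    # Assumed smooth surrogate (softplus) that stays nonzero for d > 0,
    # so it carries gradient information before contact occurs.
    return k * width * jax.nn.softplus(-d / width)


def cfd_force(d):
    # Straight-through estimator: the forward value equals hard_force(d),
    # because the surrogate cancels inside stop_gradient; the backward
    # pass sees only the gradient of soft_force(d).
    hard = hard_force(d)
    soft = soft_force(d)
    return soft + jax.lax.stop_gradient(hard - soft)


d_separated = 0.05  # bodies 5 cm apart, no contact

force = cfd_force(d_separated)                    # forward: no force at a distance
grad_hard = jax.grad(hard_force)(d_separated)     # vanishing pre-contact gradient
grad_cfd = jax.grad(cfd_force)(d_separated)       # informative pre-contact gradient
```

In this sketch the simulated force at `d_separated` stays zero, so the forward dynamics are unchanged, while `grad_cfd` is negative, indicating that reducing the distance would increase the contact force. The plain penalty force gives a zero gradient at the same point.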