Human demonstrations provide strong priors for robot manipulation, but transferring them to real robots is non-trivial because of the kinematic gap. In dexterous manipulation, tracking long-horizon, contact-rich sequences remains challenging even in simulation: a reference-tracking policy must keep objects on their target trajectories while preserving the demonstrated joint motion and contact timing. Existing approaches often rely on hand-crafted rewards that require per-sequence tuning and break down under limited interaction budgets. We introduce ConTrack, a reinforcement learning (RL) framework that scales with tracking data. ConTrack treats object tracking as a constraint and allocates the remaining control authority to motion fidelity, adapting the trade-off between task and style online through a dual-variable update. ConTrack also stabilizes long-horizon learning with an adaptive mid-trajectory reset library that reuses policy-reachable simulator states. Qualitative and quantitative results on simulated tracking and on a real robot show that ConTrack significantly improves success rate and object pose accuracy over prior methods while preserving joint and contact fidelity. Website: https://www.lyt0112.com/projects/ConTrack.
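To make the constraint-based idea concrete, the following is a minimal sketch of a generic Lagrangian dual-variable update, in which a multiplier grows while the object-tracking error exceeds a tolerance and shrinks otherwise. It is an illustration under stated assumptions, not ConTrack's actual implementation: the function names, the tolerance, the step size, and the form of the combined reward are all hypothetical and do not come from the abstract.

```python
def dual_update(lam, tracking_error, tolerance, step_size=0.05, lam_max=100.0):
    """One projected dual-ascent step on the multiplier (hypothetical form).

    The multiplier rises when the tracking error violates the tolerance and
    falls when the constraint is satisfied; it is clipped to [0, lam_max].
    """
    violation = tracking_error - tolerance
    return min(max(lam + step_size * violation, 0.0), lam_max)


def combined_reward(style_reward, tracking_error, lam):
    """Style (motion-fidelity) reward penalized by the weighted tracking error.

    This Lagrangian-style combination is an assumption for illustration only.
    """
    return style_reward - lam * tracking_error


if __name__ == "__main__":
    lam = 0.0
    # Made-up per-iteration tracking errors, for demonstration only.
    for err in [0.30, 0.25, 0.12, 0.08, 0.05]:
        lam = dual_update(lam, err, tolerance=0.10)
        print(f"error={err:.2f}  lambda={lam:.4f}")
```

In this sketch the weight on tracking is adjusted automatically from the observed constraint violation, which is the general mechanism that lets a constrained formulation avoid a hand-tuned, fixed trade-off between task and style terms.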