Human hand-object demonstrations provide dense reference motions for training dexterous manipulation reinforcement learning (RL) policies through reference tracking. To be useful for RL policy learning, however, retargeting must preserve both hand pose and task-relevant hand-object contact structure; otherwise, contact and feasibility artifacts can degrade downstream policy performance. We introduce TopoRetarget, an interaction-preserving retargeting framework that adapts human demonstrations to dexterous robot hands while maintaining task-relevant hand-object interaction, using a single set of parameters across diverse retargeting conditions. The method constructs a sparse interaction graph over hand and object keypoints and optimizes a distance-weighted Laplacian deformation with directional consistency, kinematic constraints, and penetration handling. Evaluations show that the generated references improve both interaction fidelity and policy learning: TopoRetarget achieves the best contact precision and alignment among all baselines on the ContactPose dataset, improves Pen-Spin training success by 40.6 percentage points over existing baseline methods, and enables zero-shot transfer to Wuji Hand hardware on cube reorientation and pen spinning.
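To make the idea of a distance-weighted Laplacian term over an interaction graph concrete, the following is a minimal sketch and not the TopoRetarget implementation. It assumes a k-nearest-neighbour graph over the stacked hand and object keypoints and an exponential distance weighting; the function names, `k`, and `alpha` are all hypothetical choices made for illustration, and the directional-consistency, kinematic, and penetration terms named in the abstract are omitted.

```python
import numpy as np

def knn_graph(points, k=3):
    """Assumed graph construction: link each keypoint to its k nearest others."""
    d = np.linalg.norm(points[:, None] - points[None, :], axis=-1)  # (N, N)
    np.fill_diagonal(d, np.inf)                                     # no self-edges
    nbrs = np.argsort(d, axis=1)[:, :k]                             # (N, k) indices
    dist = np.take_along_axis(d, nbrs, axis=1)                      # (N, k) distances
    return nbrs, dist

def distance_weights(dist, alpha=10.0):
    """Assumed weighting: closer neighbours get larger weights; rows sum to 1."""
    shifted = dist - dist.min(axis=1, keepdims=True)  # shift for numerical stability
    w = np.exp(-alpha * shifted)
    return w / w.sum(axis=1, keepdims=True)

def laplacian_coords(points, nbrs, w):
    """Each keypoint minus the weighted mean of its graph neighbours."""
    return points - (w[..., None] * points[nbrs]).sum(axis=1)

def laplacian_energy(src, tgt, nbrs, w):
    """Squared change in Laplacian coordinates between source and target."""
    diff = laplacian_coords(tgt, nbrs, w) - laplacian_coords(src, nbrs, w)
    return float((diff ** 2).sum())

# Toy data: 5 hand keypoints and 4 object keypoints (units of metres assumed).
rng = np.random.default_rng(0)
hand = rng.uniform(-0.05, 0.05, size=(5, 3))
obj = rng.uniform(-0.05, 0.05, size=(4, 3))
src = np.vstack([hand, obj])

# Graph and weights are computed once on the source demonstration and held fixed.
nbrs, dist = knn_graph(src, k=3)
w = distance_weights(dist)

shifted = src + np.array([0.1, 0.0, 0.0])  # whole scene translated rigidly
perturbed = src.copy()
perturbed[0] += 0.02                       # one hand keypoint moved on its own

e_shift = laplacian_energy(src, shifted, nbrs, w)
e_pert = laplacian_energy(src, perturbed, nbrs, w)
print(e_shift, e_pert)
```

Because each row of weights sums to one, translating the whole scene leaves the Laplacian coordinates unchanged, whereas moving a single hand keypoint relative to its neighbours raises the energy. This is the sense in which such a term penalizes changes in local hand-object arrangement and not changes in absolute position.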