Humanoid robots have strong potential to transform daily service and caregiving applications. Although recent advances in general motion tracking (GMT) within physics engines have enabled virtual characters and humanoid robots to reproduce a broad range of human motions, these behaviors remain largely limited to contact-free social interactions or isolated movements. Assistive scenarios, by contrast, require continuous awareness of a human partner and rapid adaptation to their evolving posture and dynamics. In this paper, we formulate the imitation of closely interacting, force-exchanging human-human motion sequences as a multi-agent reinforcement learning problem. We jointly train partner-aware policies for both the supporter (assistant) agent and the recipient agent in a physics simulator to track assistive motion references. To make this problem tractable, we introduce a partner-policy initialization scheme that transfers priors from single-human motion-tracking controllers, greatly improving exploration. We further propose dynamic reference retargeting and a contact-promoting reward, which adapt the assistant's reference motion to the recipient's real-time pose and encourage physically meaningful support. We show that our method, AssistMimic, is the first to successfully track assistive interaction motions on established benchmarks, demonstrating the benefits of a multi-agent RL formulation for physically grounded and socially aware humanoid control.
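To make the two proposed mechanisms concrete, the following is a minimal illustrative sketch of what a contact-promoting reward and dynamic reference retargeting could look like; all function names, the Gaussian kernel, the root-relative offset scheme, and the parameter values are assumptions for illustration, not the paper's actual formulation.

```python
import numpy as np

def contact_promoting_reward(hand_pos, support_point, sigma=0.05):
    """Hypothetical contact-promoting reward: a Gaussian kernel on the
    distance between the supporter's hand and the desired support point
    on the recipient's body (kernel width `sigma` is illustrative)."""
    dist = np.linalg.norm(hand_pos - support_point)
    return float(np.exp(-dist ** 2 / sigma ** 2))

def retarget_reference(ref_hand_pos, ref_recipient_root, live_recipient_root):
    """Hypothetical dynamic reference retargeting: take the assistant's
    reference hand target expressed relative to the recipient's root in
    the mocap clip, and re-anchor that offset to the recipient's
    *current* root position in simulation."""
    offset = ref_hand_pos - ref_recipient_root   # offset in the reference clip
    return live_recipient_root + offset          # target re-anchored to live pose
```

For example, if the recipient has drifted 0.5 m from where the mocap clip places them, the retargeted hand target shifts by the same amount, so the contact reward still encourages the assistant to reach the recipient's actual body rather than the clip's stale world-frame position.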