Humanoids have the potential to be the ideal embodiment in environments designed for humans. Thanks to their structural similarity to the human body, they benefit from rich sources of demonstration data, e.g., data collected via teleoperation, motion capture, or even videos of humans performing tasks. However, distilling a policy from demonstrations is still a challenging problem. While Diffusion Policies (DPs) have shown impressive results in robotic manipulation, their applicability to locomotion and humanoid control remains underexplored. In this paper, we investigate how dataset diversity and size affect the performance of DPs for humanoid whole-body control. In a simulated IsaacGym environment, we generate synthetic demonstrations by training Adversarial Motion Prior (AMP) agents under various Domain Randomization (DR) conditions, and we compare DPs fitted to datasets of different sizes and degrees of diversity. Our findings show that, although DPs can achieve stable walking behavior, successfully training locomotion policies requires significantly larger and more diverse datasets than manipulation tasks do, even in simple scenarios.