Generating diverse and realistic human motion that can physically interact with an environment remains a challenging research area in character animation. Meanwhile, diffusion-based methods proposed by the robotics community have demonstrated the ability to capture highly diverse and multi-modal skills. However, naively training a diffusion policy often results in unstable motions for high-frequency, under-actuated control tasks such as bipedal locomotion, because compounding errors accumulate rapidly and push the agent away from the optimal training trajectories. Our method, Physics-Based Character Animation via Diffusion Policy (PDP), combines reinforcement learning (RL) and behavior cloning (BC) to create a robust diffusion policy for physics-based character animation. The key idea is to use RL policies not only to provide optimal trajectories but also to provide corrective actions in sub-optimal states, giving the policy a chance to correct errors caused by environmental stimuli, model errors, or numerical errors in simulation. We demonstrate PDP on perturbation recovery, universal motion tracking, and physics-based text-to-motion synthesis.
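The key idea above — querying an RL expert for corrective actions at perturbed, sub-optimal states and using those pairs as BC supervision — can be sketched as a small data-collection loop. This is a minimal illustration, not the paper's implementation: the proportional-controller "expert", the toy linear dynamics, and all function names are stand-ins for a trained RL policy and a physics simulator.

```python
import numpy as np

rng = np.random.default_rng(0)

def expert_action(state):
    # Stand-in for a trained RL expert: a proportional controller
    # that drives the state toward the origin.
    return -0.5 * state

def step(state, action):
    # Toy linear dynamics standing in for the physics simulator.
    return state + action

def collect_noisy_state_clean_action(num_steps=100, noise_std=0.1, dim=3):
    """Roll out the expert while perturbing visited states, and record
    (perturbed state, expert's corrective action at that state) pairs."""
    dataset = []
    state = rng.normal(size=dim)
    for _ in range(num_steps):
        # Perturb the state to simulate drift into sub-optimal regions.
        noisy_state = state + rng.normal(scale=noise_std, size=dim)
        # Label with the expert's action AT the perturbed state, so the
        # dataset teaches the cloned policy how to recover from errors.
        action = expert_action(noisy_state)
        dataset.append((noisy_state, action))
        state = step(noisy_state, action)
    return dataset

data = collect_noisy_state_clean_action()
```

A diffusion policy trained by BC on such pairs sees states off the optimal trajectory together with actions that steer back toward it, which is what mitigates the compounding-error instability described above.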