We present a deep learning method for composite and task-driven motion control of physically simulated characters. In contrast to existing data-driven reinforcement learning approaches that imitate full-body motions, we learn decoupled motions for specific body parts from multiple reference motions simultaneously and directly, using multiple discriminators in a GAN-like setup. This process requires no manual work to produce composite reference motions for learning. Instead, the control policy explores on its own how the motions can be combined. We further account for multiple task-specific rewards and train a single, multi-objective control policy. To this end, we propose a novel framework for multi-objective learning that adaptively balances the learning of disparate motions from multiple sources and multiple goal-directed control objectives. In addition, as composite motions are typically augmentations of simpler behaviors, we introduce a sample-efficient method for training composite control policies incrementally: we reuse a pre-trained policy as the meta policy and train a cooperative policy that adapts the meta policy for new composite tasks. We demonstrate the applicability of our approach on a variety of challenging multi-objective tasks involving both composite motion imitation and multiple goal-directed control.
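The abstract does not spell out how imitation signals from several body-part discriminators and several task rewards are balanced. As an illustration only, the following minimal Python sketch shows one common way to keep objectives with very different scales from dominating each other: standardize each objective's advantage estimates over the batch, then blend them with priority weights. The function names, the objective names, the weights, and the normalization scheme are all assumptions made for this example; it uses fixed weights and should not be read as the adaptive method proposed in the paper.

```python
from statistics import fmean, pstdev


def standardize(values, eps=1e-8):
    """Rescale a batch of values to zero mean and (approximately) unit variance."""
    mean = fmean(values)
    std = pstdev(values)
    return [(v - mean) / (std + eps) for v in values]


def combine_objectives(advantages, weights):
    """Blend per-objective advantage estimates into one training signal.

    advantages: dict mapping objective name -> list of per-sample advantages
    weights:    dict mapping objective name -> non-negative priority
    Both arguments are hypothetical inputs chosen for this sketch.
    """
    total = sum(weights[name] for name in advantages)
    batch_size = len(next(iter(advantages.values())))
    combined = [0.0] * batch_size
    for name, values in advantages.items():
        scaled = standardize(values)      # remove the objective's own scale
        share = weights[name] / total     # normalized priority of this objective
        for i, v in enumerate(scaled):
            combined[i] += share * v
    return combined


# Hypothetical objectives: two body-part imitation terms and one task term,
# deliberately given very different magnitudes.
advantages = {
    "upper_body_imitation": [0.1, 0.3, 0.2, 0.4],
    "lower_body_imitation": [10.0, 30.0, 20.0, 40.0],
    "target_heading": [-5.0, 5.0, 0.0, 10.0],
}
weights = {
    "upper_body_imitation": 1.0,
    "lower_body_imitation": 1.0,
    "target_heading": 2.0,
}
combined = combine_objectives(advantages, weights)
```

After standardization, the two imitation terms contribute equally even though their raw values differ by a factor of 100, so the priority weights alone determine how much each objective influences the policy update.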