In this study, we propose a multitask reinforcement learning algorithm for acquiring a foundational policy from which novel motor skills can be generated. Inspired by human sensorimotor adaptation mechanisms, we train encoder-decoder networks that can be shared when learning novel motor skills within a single movement category. To train the policy network, we develop a multitask reinforcement learning method in which the policy must cope with changes in goals or environments, expressed as different reward functions or different physical parameters of the environment, in dynamic movement generation tasks. As a concrete task, we evaluated the proposed method on a ball heading task using a monopod robot model. The results showed that the proposed method could adapt to novel target positions or to ball restitution coefficients not experienced during training. Furthermore, we demonstrated that the acquired foundational policy network, originally learned for the heading motion, can be used to generate an entirely new overhead kicking skill.