The metabolic energy consumption of a powered lower-limb exoskeleton user comes mainly from upper-body effort, since the lower body is considered passive. However, the literature largely ignores the user's upper-body effort when designing motion controllers. In this work, we use deep reinforcement learning (RL) to develop a locomotion controller that minimizes the ground reaction forces (GRF) on the crutches. The rationale for minimizing GRF is to reduce the upper-body effort of the user. Accordingly, we design a model and a learning framework for a human-exoskeleton system with crutches. We formulate a reward function that encourages forward displacement of the human-exoskeleton system while satisfying the predetermined constraints of a physical robot. We evaluate the framework using Proximal Policy Optimization (PPO), a state-of-the-art deep RL method, in the MuJoCo physics simulator with different hyperparameters and network architectures over multiple trials. We empirically show that the learned model generates joint torques based on the joint angles, velocities, and the GRF on the feet and crutch tips. The resulting exoskeleton model thus maps states directly to joint torques, in line with the RL framework. Finally, we empirically show that a policy trained with our method generates a gait with a 35% reduction in GRF relative to the baseline.
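As a rough illustration of the kind of reward described above, the sketch below combines a forward-progress term with a penalty on crutch GRF and a soft penalty for exceeding a torque limit. It is not the reward used in this work: the function name `step_reward`, the `RewardWeights` fields, the weight values, and the choice of a torque limit as the "predetermined constraint" are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class RewardWeights:
    # Hypothetical coefficients; the source does not report its weights.
    forward: float = 1.0
    crutch_grf: float = 0.001
    limit_violation: float = 0.1


def step_reward(
    x_before: float,
    x_after: float,
    dt: float,
    crutch_grf: Sequence[float],
    joint_torques: Sequence[float],
    torque_limit: float,
    w: RewardWeights = RewardWeights(),
) -> float:
    """Illustrative per-step reward: forward progress minus crutch load."""
    # Forward velocity of the system over one control step (m/s).
    forward_velocity = (x_after - x_before) / dt
    # Total load carried by the crutch tips (N), used as a proxy for
    # upper-body effort.
    crutch_load = sum(abs(f) for f in crutch_grf)
    # Soft constraint: penalize only the torque in excess of the limit (N*m).
    excess = sum(max(0.0, abs(t) - torque_limit) for t in joint_torques)
    return (
        w.forward * forward_velocity
        - w.crutch_grf * crutch_load
        - w.limit_violation * excess
    )
```

With these assumed weights, two steps that cover the same distance are ranked by crutch load, so a policy optimizing this signal is pushed toward gaits that lean less on the crutches.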