This paper aims to develop a new human-machine interface that improves rehabilitation performance from the perspective of both the user (patient) and the machine (robot) by introducing co-adaptation techniques via model-based reinforcement learning. Previous studies have focused mainly on robot assistance, i.e., on improving the control strategy to fulfill the objective of Assist-As-Needed. In this study, we treat the full process of robot-assisted rehabilitation as a co-adaptive, or mutual learning, process and emphasize the adaptation of the user to the machine. To this end, we propose a Co-adaptive MDPs (CaMDPs) model that quantifies the learning rates based on cooperative multi-agent reinforcement learning (MARL) at the high abstraction layer of the system. We also propose several approaches to cooperatively adjust the Policy Improvement step between the two agents within the framework of Policy Iteration. A simulation study based on the proposed CaMDPs indicates that the non-stationarity problem can be mitigated using the proposed Policy Improvement approaches.
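To make the idea of coordinating Policy Improvement between two cooperating agents concrete, the following is a minimal sketch under stated assumptions, not the CaMDPs model or any of the specific approaches proposed in the paper. It assumes a small tabular MDP with a shared reward, a joint transition tensor `P[s, a1, a2, s']`, and deterministic policies; the function names `evaluate` and `alternating_policy_iteration` are hypothetical. The coordination rule shown (only one agent updates per iteration while the other is held fixed) is one generic way to keep each agent's learning target stationary during its own update.

```python
import numpy as np

def evaluate(P, R, pi1, pi2, gamma):
    """Exact policy evaluation for a fixed joint deterministic policy."""
    n = P.shape[0]
    idx = np.arange(n)
    P_pi = P[idx, pi1, pi2]   # (n, n) transitions under the joint policy
    r_pi = R[idx, pi1, pi2]   # (n,) shared reward under the joint policy
    # Solve the Bellman equation V = r_pi + gamma * P_pi @ V.
    return np.linalg.solve(np.eye(n) - gamma * P_pi, r_pi)

def alternating_policy_iteration(P, R, gamma=0.9, max_iters=100):
    """Policy Iteration where the two agents take turns improving.

    Assumption for illustration: agent 1 (e.g. the user) improves on even
    iterations and agent 2 (e.g. the robot) on odd ones, so each agent
    always faces a fixed partner while it updates.
    """
    n = P.shape[0]
    idx = np.arange(n)
    pi1 = np.zeros(n, dtype=int)
    pi2 = np.zeros(n, dtype=int)
    unchanged = 0
    for it in range(max_iters):
        V = evaluate(P, R, pi1, pi2, gamma)
        Q = R + gamma * (P @ V)          # (n, A1, A2) joint action values
        if it % 2 == 0:
            q, current = Q[idx, :, pi2], pi1   # agent 1's values, partner fixed
        else:
            q, current = Q[idx, pi1, :], pi2   # agent 2's values, partner fixed
        best = q.argmax(axis=1)
        # Switch only on strict improvement so ties cannot cause cycling.
        better = q[idx, best] > q[idx, current] + 1e-12
        new = np.where(better, best, current)
        changed = bool(np.any(new != current))
        if it % 2 == 0:
            pi1 = new
        else:
            pi2 = new
        unchanged = 0 if changed else unchanged + 1
        if unchanged >= 2:               # neither agent wants to change
            break
    return pi1, pi2, evaluate(P, R, pi1, pi2, gamma)
```

Under these assumptions each update is an ordinary single-agent policy improvement in the MDP induced by the fixed partner, so the shared value never decreases. The loop stops at a policy pair that neither agent can improve alone, which is not necessarily the jointly optimal pair.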