Reinforcement learning (RL) has improved decision-making in several applications. However, traditional RL is difficult to apply in some domains, such as the rehabilitation of people with a spinal cord injury (SCI). Among other factors, RL is difficult to use in this domain because there are many possible treatments (i.e., a large action space) and few patients (i.e., limited training data). Because treatments for SCIs have natural groupings, we propose two approaches to grouping treatments so that an RL agent can learn effectively from limited data. One relies on domain knowledge of SCI rehabilitation, and the other learns similarities among treatments using an embedding technique. We then use Fitted Q Iteration to train an agent that learns optimal treatments. Through a simulation study designed to reflect the properties of SCI rehabilitation, we find that both methods can help improve the treatment decisions of physiotherapists, but the approach based on domain knowledge offers better performance. Our findings provide a "proof of concept" that RL can be used to help improve the treatment of those with an SCI and indicate that continued efforts to gather data and apply RL to this domain are worthwhile.
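
To make the idea of running Fitted Q Iteration over grouped treatments concrete, the following is a minimal sketch on a toy problem, not the paper's implementation. Everything in it is an assumption made for illustration: the function name `fitted_q_iteration`, the hand-written `action_to_group` mapping (standing in for a domain-knowledge grouping), the two-state toy dynamics, and the use of a per-cell mean as the regression step (the least-squares fit on one-hot state and group features). The abstract does not specify the regressor, the state representation, or the reward.

```python
from collections import defaultdict


def fitted_q_iteration(transitions, action_to_group, n_groups, gamma=0.9, n_iters=50):
    """Batch FQI over treatment groups (illustrative; regression step is a per-cell mean)."""
    q = defaultdict(float)  # (state, group) -> estimated value; unseen pairs default to 0
    for _ in range(n_iters):
        sums, counts = defaultdict(float), defaultdict(int)
        for s, a, r, s_next, done in transitions:
            # Bellman target computed from the previous iteration's Q estimate
            best_next = 0.0 if done else max(q[(s_next, g)] for g in range(n_groups))
            key = (s, action_to_group[a])  # pool all treatments that share a group
            sums[key] += r + gamma * best_next
            counts[key] += 1
        # Least-squares fit on one-hot (state, group) features = mean target per cell
        q = defaultdict(float, {k: sums[k] / counts[k] for k in sums})
    return q


# Hypothetical toy data: 6 treatments collapsed into 2 groups.
action_to_group = [0, 0, 0, 1, 1, 1]
transitions = []
for a, g in enumerate(action_to_group):
    # State 0: group 1 pays reward 1 and moves to state 1; group 0 pays 0 and stays.
    transitions.append((0, a, float(g), g, False))
    # State 1: group 0 pays reward 2, group 1 pays 0; the episode ends either way.
    transitions.append((1, a, 2.0 * (1 - g), 1, True))

q = fitted_q_iteration(transitions, action_to_group, n_groups=2)
# Greedy policy over groups: pick the group with the highest estimated value per state.
greedy_group = {s: max(range(2), key=lambda g: q[(s, g)]) for s in (0, 1)}
print(greedy_group)
```

Because the six treatments share two groups, each (state, group) cell is estimated from three transitions instead of one, which is the data-efficiency argument for grouping in a small-sample setting. A learned embedding could replace the hand-written mapping by clustering treatments with similar effects.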