Reliable real-world deployment of reinforcement learning (RL) methods requires a nuanced understanding of their strengths and weaknesses and of how these compare to those of humans. Human-machine systems are becoming more prevalent, and their design relies on a task-oriented understanding of both human learning (HL) and RL. Thus, an important line of research is characterizing how the structure of a learning task affects learning performance. While increasingly complex benchmark environments have led to improved RL capabilities, such environments are difficult to use for the dedicated study of task structure. To address this challenge, we present a learning environment built to support rigorous study of the impact of task structure on HL and RL. We demonstrate the environment's utility for such study through example experiments in task structure that show performance differences between humans and RL algorithms.