Assistive agents should make humans' lives easier. Classically, such assistance is studied through the lens of inverse reinforcement learning, where an assistive agent (e.g., a chatbot, a robot) infers a human's intended goal and then selects actions to help the human reach it. This approach requires inferring intentions, which can be difficult in high-dimensional settings. We build upon prior work that studies assistance through the lens of empowerment: an assistive agent aims to maximize the influence of the human's actions, so that the human exerts greater control over environmental outcomes and can solve tasks in fewer steps. We lift the major limitation of prior work in this area, namely scalability to high-dimensional settings, with contrastive successor representations. We formally prove that these representations estimate a notion of empowerment similar to that studied in prior work, and that they provide a ready-made mechanism for optimizing it. Empirically, our proposed method outperforms prior methods on synthetic benchmarks and scales to Overcooked, a cooperative game. Theoretically, our work connects ideas from information theory, neuroscience, and reinforcement learning, and charts a path for representations to play a critical role in solving assistive problems.
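As a rough illustration of the kind of contrastive objective underlying such representations (a minimal sketch only; the function names, the NumPy setting, and the specific symmetric InfoNCE-style loss are assumptions for exposition, not the paper's exact method), one can score (state, action) embeddings against embeddings of the future states they lead to, treating matching pairs within a batch as positives and all other pairings as negatives:

```python
import numpy as np

def infonce_loss(phi_sa, psi_g):
    """InfoNCE-style contrastive loss over a batch (illustrative sketch).

    phi_sa: (B, d) embeddings of (state, action) pairs.
    psi_g:  (B, d) embeddings of the future states reached from those pairs.
    Row i of phi_sa and row i of psi_g form a positive pair; every other
    pairing in the batch serves as a negative.
    """
    logits = phi_sa @ psi_g.T                      # (B, B) similarity matrix
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # Cross-entropy against the identity labels: each (s, a) should
    # "predict" its own observed future state.
    return -np.mean(np.diag(log_probs))
```

Minimizing such a loss drives the inner product between a (state, action) embedding and a future-state embedding toward the pointwise mutual information between the two, which is what licenses using the learned critic as an empowerment-style signal. When the embeddings are well aligned (e.g., `infonce_loss(5 * np.eye(3), np.eye(3))`), the loss falls well below the chance level of `log(B)`; when the critic is uninformative (all similarities equal), the loss sits exactly at `log(B)`.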