This paper introduces ARCLE, an environment designed to facilitate reinforcement learning research on the Abstraction and Reasoning Corpus (ARC). Addressing this inductive reasoning benchmark with reinforcement learning presents three challenges: a vast action space, a hard-to-reach goal, and a variety of tasks. We demonstrate that an agent trained with proximal policy optimization can learn individual tasks through ARCLE. Adopting non-factorial policies and auxiliary losses improved performance, effectively mitigating the issues arising from the action space and goal attainment. Based on these insights, we propose several research directions and motivations for using ARCLE, including MAML, GFlowNets, and World Models.