Visual Reinforcement Learning (Visual RL), coupled with high-dimensional observations, has long struggled with out-of-distribution generalization. Although much work has focused on algorithms for visual generalization, we argue that the devil is in the existing benchmarks: they are restricted to isolated tasks and generalization categories, which undermines a comprehensive evaluation of agents' visual generalization capabilities. To bridge this gap, we introduce RL-ViGen, a novel Reinforcement Learning Benchmark for Visual Generalization that contains diverse tasks and a wide spectrum of generalization types, thereby supporting more reliable conclusions. Furthermore, RL-ViGen incorporates the latest visual RL generalization algorithms into a unified framework, under which the experimental results indicate that no single existing algorithm prevails universally across tasks. We hope that RL-ViGen will serve as a catalyst in this area and lay a foundation for the future creation of universal visual generalization RL agents suitable for real-world scenarios. Our code and implemented algorithms are available at https://gemcollector.github.io/RL-ViGen/.