Visual reinforcement learning (RL) agents trained on a limited set of views struggle to generalize their learned skills to unseen views, a difficulty known as the problem of $\textit{view generalization}$. In this work, we systematically categorize this problem into four distinct and challenging scenarios that closely resemble real-world situations. We then propose a straightforward yet effective approach that adapts visual $\textbf{Mo}$del-based policies for $\textbf{Vie}$w generalization ($\textbf{MoVie}$) at test time, without requiring explicit reward signals or any modification to training. Our method yields substantial gains across all four scenarios, covering a total of $\textbf{18}$ tasks sourced from DMControl, xArm, and Adroit, with relative improvements of $\mathbf{33}$%, $\mathbf{86}$%, and $\mathbf{152}$%, respectively. These results highlight the potential of our approach for real-world robotics applications. Videos are available at https://yangsizhe.github.io/MoVie/ .