We study offline multitask representation learning in reinforcement learning (RL), where a learner is provided with offline datasets from multiple tasks that share a common representation and is asked to learn this shared representation. We theoretically investigate offline multitask low-rank RL and propose a new algorithm, MORL, for offline multitask representation learning. Furthermore, we examine downstream RL in reward-free, offline, and online scenarios, where the agent is given a new task that shares the same representation as the upstream offline tasks. Our theoretical results demonstrate the benefits of using the representation learned from the upstream offline tasks rather than learning the representation of the low-rank model directly on the downstream task.