Off-the-Grid MARL: Datasets with Baselines for Offline Multi-Agent Reinforcement Learning

Being able to harness the power of large datasets for developing cooperative multi-agent controllers promises to unlock enormous value for real-world applications. Many important industrial systems are multi-agent in nature and are difficult to model using bespoke simulators. However, in industry, distributed processes can often be recorded during operation, and large quantities of demonstrative data stored. Offline multi-agent reinforcement learning (MARL) provides a promising paradigm for building effective decentralised controllers from such datasets. However, offline MARL is still in its infancy and therefore lacks standardised benchmark datasets and baselines typically found in more mature subfields of reinforcement learning (RL). These deficiencies make it difficult for the community to sensibly measure progress. In this work, we aim to fill this gap by releasing off-the-grid MARL (OG-MARL): a growing repository of high-quality datasets with baselines for cooperative offline MARL research. Our datasets provide settings that are characteristic of real-world systems, including complex environment dynamics, heterogeneous agents, non-stationarity, many agents, partial observability, suboptimality, sparse rewards and demonstrated coordination. For each setting, we provide a range of different dataset types (e.g. Good, Medium, Poor, and Replay) and profile the composition of experiences for each dataset. We hope that OG-MARL will serve the community as a reliable source of datasets and help drive progress, while also providing an accessible entry point for researchers new to the field.
