Many challenging tasks, such as managing traffic systems, electricity grids, or supply chains, involve complex decision-making processes that must balance multiple conflicting objectives and coordinate the actions of various independent decision-makers (DMs). One perspective for formalising and addressing such tasks is multi-objective multi-agent reinforcement learning (MOMARL). MOMARL extends reinforcement learning (RL) to problems with multiple agents, each of which must consider multiple objectives in its learning process. In RL research, benchmarks are crucial for facilitating progress, evaluation, and reproducibility. Their significance is underscored by the numerous benchmark frameworks developed for various RL paradigms, including single-agent RL (e.g., Gymnasium), multi-agent RL (e.g., PettingZoo), and single-agent multi-objective RL (e.g., MO-Gymnasium). To support the advancement of the MOMARL field, we introduce MOMAland, the first collection of standardised environments for multi-objective multi-agent reinforcement learning. MOMAland addresses the need for comprehensive benchmarking in this emerging field, offering more than 10 diverse environments that vary in the number of agents, state representations, reward structures, and utility considerations. To provide strong baselines for future research, MOMAland also includes algorithms capable of learning policies in such settings.
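To make the setting concrete, the following is a minimal sketch of what "multiple agents, each with multiple objectives" can look like in code. It is a hypothetical toy game written in plain Python and does not use or reproduce the MOMAland API: the agent names, the `PAYOFFS` table, and the helpers `step`, `linear_utility`, and `best_joint_action` are all assumptions made for illustration. Each agent receives a reward vector rather than a scalar, and a weighted sum stands in for one possible utility function.

```python
from typing import Dict, Tuple

# Hypothetical toy setting, NOT a MOMAland environment: two agents each pick
# action 0 or 1, and each receives a 2-dimensional reward vector
# (objective 0, objective 1) for the resulting joint action.
Agent = str
RewardVector = Tuple[float, float]

PAYOFFS: Dict[Tuple[int, int], Dict[Agent, RewardVector]] = {
    (0, 0): {"agent_0": (4.0, 0.0), "agent_1": (4.0, 0.0)},
    (0, 1): {"agent_0": (1.0, 1.0), "agent_1": (2.0, 3.0)},
    (1, 0): {"agent_0": (2.0, 3.0), "agent_1": (1.0, 1.0)},
    (1, 1): {"agent_0": (0.0, 4.0), "agent_1": (0.0, 4.0)},
}


def step(actions: Dict[Agent, int]) -> Dict[Agent, RewardVector]:
    """Return one reward vector per agent for a joint action."""
    joint = (actions["agent_0"], actions["agent_1"])
    return PAYOFFS[joint]


def linear_utility(reward: RewardVector, weights: Tuple[float, float]) -> float:
    """Scalarise a reward vector with a weighted sum (one possible utility)."""
    return sum(w * r for w, r in zip(weights, reward))


def best_joint_action(
    weights: Dict[Agent, Tuple[float, float]]
) -> Tuple[int, int]:
    """Joint action maximising the sum of the agents' scalarised utilities."""

    def total(joint: Tuple[int, int]) -> float:
        rewards = PAYOFFS[joint]
        return sum(linear_utility(rewards[a], weights[a]) for a in rewards)

    return max(PAYOFFS, key=total)


if __name__ == "__main__":
    # Both agents care only about objective 0, then only about objective 1.
    print(best_joint_action({"agent_0": (1.0, 0.0), "agent_1": (1.0, 0.0)}))
    print(best_joint_action({"agent_0": (0.0, 1.0), "agent_1": (0.0, 1.0)}))
```

In this toy table, the preferred joint action changes from `(0, 0)` to `(1, 1)` when both agents shift their weight from the first objective to the second, which illustrates why utility considerations matter when comparing policies in such settings.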