Spatial task allocation in systems such as multi-robot delivery or ride-sharing requires balancing efficiency with fair service across tasks. Greedy assignment policies that match each agent to its highest-preference or lowest-cost task can maximize efficiency but often create inequities: some tasks receive disproportionately favorable service (e.g., shorter delays or better matches), while others face long waits or poor allocations. We study fairness in heterogeneous multi-agent systems where tasks vary in preference alignment and urgency. Most existing approaches either assume centralized coordination or largely ignore fairness under partial observability. Distinct from this prior work, we establish a connection between the Eisenberg-Gale (EG) equilibrium convex program and decentralized, partially observable multi-agent learning. Building on this connection, we develop two equilibrium-informed algorithms that integrate fairness and efficiency: (i) a multi-agent reinforcement learning (MARL) framework, EG-MARL, whose training is guided by a centralized EG equilibrium assignment algorithm; and (ii) a stochastic online optimization mechanism that performs guided exploration and subset-based fair assignment as tasks are discovered. We evaluate on Multi-Agent Particle Environment (MPE) simulations across varying team sizes against centralized EG, Hungarian, and Min-Max distance baselines, and also present a Webots-based warehouse proof-of-concept with heterogeneous robots. Both methods preserve the fairness-efficiency balance of the EG solution under partial observability, with EG-MARL achieving near-centralized coordination and reduced travel distances, and the online mechanism enabling real-time allocation with competitive fairness.
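The Eisenberg-Gale equilibrium referenced above is the solution of a convex program that maximizes the sum of (budget-weighted) logarithms of agents' utilities subject to supply constraints. A minimal sketch of that program for linear utilities, with illustrative 2-agent valuations and a general-purpose SLSQP solver standing in for a dedicated EG solver (the matrix `V` and unit budgets are assumptions, not data from the paper):

```python
import numpy as np
from scipy.optimize import minimize

# Illustrative valuations: V[i, j] = value agent i assigns to task j.
V = np.array([[2.0, 1.0],
              [1.0, 2.0]])
n, m = V.shape

def neg_log_nash_welfare(x_flat):
    """Negated EG objective: -sum_i log u_i(x), with u_i linear in x."""
    x = x_flat.reshape(n, m)
    u = (V * x).sum(axis=1)
    return -np.sum(np.log(np.maximum(u, 1e-12)))  # clip to keep log finite

# Supply constraints: each task is allocated at most once (column sums <= 1).
cons = [{"type": "ineq",
         "fun": lambda x_flat, j=j: 1.0 - x_flat.reshape(n, m)[:, j].sum()}
        for j in range(m)]
bounds = [(0.0, 1.0)] * (n * m)
x0 = np.full(n * m, 1.0 / n)  # start from a uniform fractional allocation

res = minimize(neg_log_nash_welfare, x0, method="SLSQP",
               bounds=bounds, constraints=cons)
alloc = res.x.reshape(n, m)
utilities = (V * alloc).sum(axis=1)
```

With these symmetric valuations the EG solution assigns each agent its preferred task in full, so both utilities come out equal; the log objective is what enforces the fairness-efficiency balance that the two proposed algorithms carry over to the decentralized, partially observable setting.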