Cooperative multi-agent reinforcement learning (MARL) under sparse rewards presents a fundamental challenge due to limited exploration and insufficient coordinated attention among agents. In this work, we propose the Focusing Influence Mechanism (FIM), a novel framework that enhances cooperation by directing agent influence toward task-critical elements, referred to as Center of Gravity (CoG) state dimensions, inspired by Clausewitz's military theory. FIM consists of three core components: (1) identifying CoG state dimensions based on their stability under agent behavior, (2) designing counterfactual intrinsic rewards to promote meaningful influence on these dimensions, and (3) encouraging persistent and synchronized focus through eligibility-trace-based credit accumulation. These mechanisms enable agents to induce more targeted and effective state transitions, facilitating robust cooperation even in extremely sparse reward settings. Empirical evaluations across diverse MARL benchmarks demonstrate that the proposed FIM significantly improves cooperative performance compared to baselines.
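The second and third components above (counterfactual intrinsic rewards and eligibility-trace-based credit accumulation) can be sketched in a few lines. This is a minimal illustrative sketch under stated assumptions: the absolute-difference influence measure, the function and class names, and the decay constant are all hypothetical choices for illustration, not the paper's exact formulation.

```python
import numpy as np

def counterfactual_influence(next_state, cf_next_state, cog_dims):
    """Hypothetical influence measure: how much the actual next state
    differs from a counterfactual next state (the agent's action
    replaced by a default) along the identified CoG dimensions."""
    diff = np.abs(np.asarray(next_state) - np.asarray(cf_next_state))
    return float(diff[list(cog_dims)].sum())

class EligibilityCredit:
    """Accumulates each agent's influence with a decaying eligibility
    trace, so persistent, synchronized focus on CoG dimensions
    compounds into larger intrinsic credit."""
    def __init__(self, n_agents, decay=0.9):
        self.trace = np.zeros(n_agents)
        self.decay = decay

    def update(self, influences):
        # trace <- decay * trace + current per-agent influence
        self.trace = self.decay * self.trace + np.asarray(influences, dtype=float)
        return self.trace.copy()  # per-agent intrinsic reward signal
```

Usage illustrates the compounding effect: an agent that influences a CoG dimension on consecutive steps receives growing credit, while a one-off influence decays away.

```python
credit = EligibilityCredit(n_agents=2, decay=0.9)
credit.update([1.0, 0.0])  # -> [1.0, 0.0]
credit.update([1.0, 0.0])  # -> [1.9, 0.0]: persistent focus compounds
```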