Factorised Active Inference for Strategic Multi-Agent Interactions

Understanding how individual agents make strategic decisions within collectives is important for advancing fields as diverse as economics, neuroscience, and multi-agent systems. Two complementary approaches can be integrated to this end. The Active Inference framework (AIF) describes how agents employ a generative model to adapt both their beliefs about their environment and their behaviour within it. Game theory formalises strategic interactions between agents with potentially competing objectives. To bridge the gap between the two, we propose a factorisation of the generative model whereby each agent maintains explicit, individual-level beliefs about the internal states of other agents, and uses them for strategic planning in a joint context. We apply our model to iterated general-sum games with two and three players, and study the ensemble effects of game transitions, where the agents' preferences (game payoffs) change over time. This non-stationarity, beyond that caused by reciprocal adaptation, reflects a more naturalistic environment in which agents need to adapt to changing social contexts. Finally, we present a dynamical analysis of two key AIF quantities, the variational free energy (VFE) and the expected free energy (EFE), based on numerical simulation data. The ensemble-level EFE allows us to characterise the basins of attraction of games with multiple Nash equilibria under different conditions, and we find that the EFE is not necessarily minimised at the aggregate level. By integrating AIF and game theory, we can gain deeper insights into how intelligent collectives emerge, learn, and optimise their actions in dynamic environments, both cooperative and non-cooperative.
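As a small illustration of what "games with multiple Nash equilibria" means, the following Python sketch enumerates the pure-strategy Nash equilibria of a two-player general-sum game. It is not the model proposed in this paper: the Stag Hunt payoff values, the names `ROW_PAYOFF` and `COL_PAYOFF`, and the helper `pure_nash_equilibria` are all assumptions chosen for illustration only.

```python
from itertools import product

# Hypothetical Stag Hunt payoffs (illustrative values, not taken from the paper).
# Entry [a][b] is the payoff when the row player picks action a and the
# column player picks action b; action 0 = "stag", action 1 = "hare".
ROW_PAYOFF = [[4, 0],
              [3, 2]]
COL_PAYOFF = [[4, 3],
              [0, 2]]


def pure_nash_equilibria(row_payoff, col_payoff):
    """Return every pure-strategy Nash equilibrium as a (row, col) action pair."""
    n_row, n_col = len(row_payoff), len(row_payoff[0])
    equilibria = []
    for a, b in product(range(n_row), range(n_col)):
        # Best payoff each player could reach by deviating unilaterally.
        best_row = max(row_payoff[i][b] for i in range(n_row))
        best_col = max(col_payoff[a][j] for j in range(n_col))
        # A profile is an equilibrium when neither player gains by deviating.
        if row_payoff[a][b] == best_row and col_payoff[a][b] == best_col:
            equilibria.append((a, b))
    return equilibria


EQUILIBRIA = pure_nash_equilibria(ROW_PAYOFF, COL_PAYOFF)
print(EQUILIBRIA)
```

With these assumed payoffs the game has two pure equilibria, (stag, stag) and (hare, hare). Which of them a population of adaptive agents settles into depends on initial conditions, and that dependence is what a basin-of-attraction analysis characterises.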
