Is there a canonical way to think of agency beyond reward maximisation? In this paper, we show that any behaviour complying with physically sound assumptions about how macroscopic biological agents interact with the world canonically integrates exploration and exploitation, in the sense of minimising risk and ambiguity about states of the world. This description, known as active inference, refines the free energy principle, a popular descriptive framework for action and perception originating in neuroscience. Active inference provides a normative Bayesian framework for simulating and modelling agency that is widely used in behavioural neuroscience, reinforcement learning (RL) and robotics. The usefulness of active inference for RL is threefold. \emph{a}) Active inference provides a principled solution to the exploration-exploitation dilemma that usefully simulates biological agency. \emph{b}) It provides an explainable recipe for simulating behaviour: behaviour follows as an explainable mixture of exploration and exploitation under a generative world model, and all differences in behaviour are explicit in differences in the world model. \emph{c}) The framework is universal in the sense that any RL algorithm conforming to the descriptive assumptions of active inference can, in theory, be rewritten as an active inference algorithm. Thus, active inference can be used as a tool to uncover and compare the commitments and assumptions of more specific models of agency.
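As an illustration of what ``minimising risk and ambiguity'' means, the objective is commonly written in the active inference literature as an expected free energy that splits into two terms. The notation below is an assumption borrowed from that literature and is not taken from this abstract: $\pi$ denotes a policy (a sequence of actions), $s$ hidden states, $o$ observations, $Q(s \mid \pi)$ the states predicted under the policy, and $P(s \mid C)$ the agent's preferred distribution over states.

\[
G(\pi) \;=\; \underbrace{D_{\mathrm{KL}}\big[\, Q(s \mid \pi) \,\big\|\, P(s \mid C) \,\big]}_{\text{risk}} \;+\; \underbrace{\mathbb{E}_{Q(s \mid \pi)}\big[\, \mathrm{H}[P(o \mid s)] \,\big]}_{\text{ambiguity}}
\]

Under this reading, the risk term penalises policies whose predicted states diverge from preferred states (exploitation), while the ambiguity term penalises policies that lead to states with uninformative observations (exploration).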