Active Inference Tree Search in Large POMDPs

The ability to plan ahead efficiently is key for both living organisms and artificial systems. Model-based planning and prospection are widely studied in cognitive neuroscience and artificial intelligence (AI), but from different perspectives and with different desiderata in mind (biological realism versus scalability) that are difficult to reconcile. Here, we introduce a novel method for planning in partially observable Markov decision processes (POMDPs), Active Inference Tree Search (AcT), that combines the normative character and biological realism of a leading planning theory in neuroscience (Active Inference) with the scalability of tree search methods in AI. This unification enhances both approaches. On the one hand, tree search enables the biologically grounded, first-principles method of active inference to be applied to large-scale problems. On the other hand, active inference provides a principled solution to the exploration-exploitation dilemma, which is often addressed heuristically in tree search methods. Our simulations show that AcT successfully navigates binary trees that are challenging for sampling-based methods, solves problems that require adaptive exploration, and handles the large POMDP problem 'RockSample', in which AcT reproduces state-of-the-art POMDP solutions. Furthermore, we illustrate how AcT can be used to simulate neurophysiological responses (e.g., in the hippocampus and prefrontal cortex) of humans and other animals that solve large planning problems. These numerical analyses show that Active Inference Tree Search is a principled realisation of neuroscientific and AI planning theories that offers both biological realism and scalability.
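As a rough illustration of how active inference can score actions without a separate exploration bonus, the sketch below computes a one-step expected free energy as the sum of risk (divergence of predicted observations from preferred ones) and ambiguity (expected observation entropy). This is the generic textbook decomposition for a small discrete model, not the AcT algorithm itself; the function names, the two toy actions, and all numbers are assumptions made for this example.

```python
import math


def expected_free_energy(belief, transition, likelihood, preferred_obs):
    """One-step expected free energy G = risk + ambiguity for a single action.

    belief:        current belief over hidden states, Q(s)
    transition:    transition[s][s2] = P(s2 | s, action)
    likelihood:    likelihood[s2][o] = P(o | s2)
    preferred_obs: prior preferences over observations, P(o), assumed > 0
    """
    n_states = len(belief)
    n_next = len(transition[0])
    n_obs = len(preferred_obs)

    # Predicted hidden states after the action: Q(s2) = sum_s Q(s) P(s2 | s, a)
    next_belief = [
        sum(belief[s] * transition[s][s2] for s in range(n_states))
        for s2 in range(n_next)
    ]
    # Predicted observations: Q(o) = sum_s2 Q(s2) P(o | s2)
    pred_obs = [
        sum(next_belief[s2] * likelihood[s2][o] for s2 in range(n_next))
        for o in range(n_obs)
    ]
    # Risk: KL divergence between predicted and preferred observations
    risk = sum(
        q * math.log(q / p) for q, p in zip(pred_obs, preferred_obs) if q > 0
    )
    # Ambiguity: expected entropy of the observation model under Q(s2)
    ambiguity = sum(
        next_belief[s2] * -sum(p * math.log(p) for p in likelihood[s2] if p > 0)
        for s2 in range(n_next)
    )
    return risk + ambiguity


def action_probabilities(g_values, precision=1.0):
    """Softmax over negative expected free energy (lower G is more probable)."""
    weights = [math.exp(-precision * g) for g in g_values]
    total = sum(weights)
    return [w / total for w in weights]


# Toy model (hypothetical): state 0 gives reliable observations, state 1 does not.
belief = [0.5, 0.5]
likelihood = [[0.9, 0.1], [0.5, 0.5]]
preferred_obs = [0.8, 0.2]
go_clear = [[1.0, 0.0], [1.0, 0.0]]  # always moves to state 0
go_foggy = [[0.0, 1.0], [0.0, 1.0]]  # always moves to state 1

g_clear = expected_free_energy(belief, go_clear, likelihood, preferred_obs)
g_foggy = expected_free_energy(belief, go_foggy, likelihood, preferred_obs)
probs = action_probabilities([g_clear, g_foggy])
```

In this toy setting the action leading to the reliable state receives the lower expected free energy and therefore the higher selection probability. A tree search built on this idea would evaluate such a quantity along multi-step branches instead of a single step.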
