L\'evy walks and other theoretical models of optimal foraging have been successfully used to describe real-world scenarios, attracting attention in fields such as economics, physics, ecology, and evolutionary biology. However, in most cases it remains unclear which strategies maximize foraging efficiency and whether living organisms can learn such strategies. To address these questions, we model foragers as reinforcement learning agents. We first prove theoretically that maximizing rewards in our reinforcement learning model is equivalent to optimizing foraging efficiency. We then show with numerical experiments that our agents learn foraging strategies that are more efficient than known strategies such as L\'evy walks.