The foraging behavior of animals is a paradigm of target search in nature. Understanding which foraging strategies are optimal and how animals learn them are central challenges in modeling animal foraging. While the question of optimality has wide-ranging implications across fields such as economics, physics, and ecology, the question of learnability is a topic of ongoing debate in evolutionary biology. Recognizing that these challenges are interconnected, this work addresses them simultaneously by exploring optimal foraging strategies within a reinforcement learning framework. To this end, we model foragers as learning agents. We first prove theoretically that maximizing rewards in our reinforcement learning model is equivalent to optimizing foraging efficiency. We then show with numerical experiments that, in the paradigmatic model of non-destructive search, our agents learn foraging strategies that are more efficient than some of the best-known strategies, such as Lévy walks. These findings highlight the potential of reinforcement learning as a versatile framework not only for optimizing search strategies but also for modeling the learning process, thus shedding light on the role of learning in natural optimization processes.