How to behave efficiently and flexibly is a central problem both for understanding biological agents and for creating intelligent embodied AI. It is well known that behavior can be classified into two types: reward-maximizing habitual behavior, which is fast but inflexible, and goal-directed behavior, which is flexible but slow. Conventionally, habitual and goal-directed behaviors are considered to be handled by two distinct systems in the brain. Here, we propose to bridge the gap between the two behaviors, drawing on the principles of variational Bayesian theory. We incorporate both behaviors in one framework by introducing a Bayesian latent variable called "intention". Habitual behavior is generated from the prior distribution of intention, which does not depend on a goal; goal-directed behavior is generated from the posterior distribution of intention, which is conditioned on the goal. Building on this idea, we present a novel Bayesian framework for modeling behaviors. The framework enables skill sharing between the two kinds of behavior and, by leveraging the idea of predictive coding, enables an agent to generalize seamlessly from habitual to goal-directed behavior without additional training. It suggests a fresh perspective for cognitive science and embodied AI, highlighting the potential for greater integration between habitual and goal-directed behaviors.
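The split between a goal-less prior and a goal-conditioned posterior over the latent intention can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not taken from the paper: the one-dimensional Gaussian form, the names `prior_intention`, `posterior_intention`, and `policy`, and all numeric constants are hypothetical. The only points it is meant to show are that both behaviors pass through the same action decoder (the skill sharing described in the abstract), and that the difference between them can be measured as a KL divergence between posterior and prior, the quantity that variational Bayesian objectives typically penalize.

```python
import math
import random
from dataclasses import dataclass


@dataclass
class Gaussian:
    """One-dimensional Gaussian over the latent intention z (illustrative only)."""
    mean: float
    std: float

    def sample(self, rng: random.Random) -> float:
        return rng.gauss(self.mean, self.std)


def kl_divergence(q: Gaussian, p: Gaussian) -> float:
    """Closed-form KL(q || p) between two 1-D Gaussians."""
    return (
        math.log(p.std / q.std)
        + (q.std ** 2 + (q.mean - p.mean) ** 2) / (2.0 * p.std ** 2)
        - 0.5
    )


def prior_intention(observation: float) -> Gaussian:
    """Hypothetical goal-less prior p(z | observation): the habitual pathway."""
    return Gaussian(mean=0.5 * observation, std=1.0)


def posterior_intention(observation: float, goal: float) -> Gaussian:
    """Hypothetical goal-conditioned posterior q(z | observation, goal)."""
    prior = prior_intention(observation)
    # Assumption: the goal pulls the mean toward itself and sharpens the belief.
    return Gaussian(mean=0.5 * (prior.mean + goal), std=0.5)


def policy(observation: float, intention: float) -> float:
    """Shared action decoder: both behaviors reuse this mapping from z to an action."""
    return math.tanh(intention - 0.1 * observation)


rng = random.Random(0)
obs, goal = 1.0, 2.0

habitual = prior_intention(obs)                 # no goal involved
goal_directed = posterior_intention(obs, goal)  # conditioned on the goal

habitual_action = policy(obs, habitual.sample(rng))
goal_directed_action = policy(obs, goal_directed.sample(rng))

# How far the goal-directed intention departs from the habitual one.
gap = kl_divergence(goal_directed, habitual)
print(f"habitual action:      {habitual_action:.3f}")
print(f"goal-directed action: {goal_directed_action:.3f}")
print(f"KL(posterior || prior): {gap:.3f}")
```

In this sketch, a small KL value means the goal-directed intention stays close to the habit, while a large value means the goal demands a departure from it. How the actual framework parameterizes and trains these distributions is not specified in the abstract.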