Our goals fundamentally shape how we experience the world. For example, when we are hungry, we tend to view objects in our environment according to whether or not they are edible (or tasty). Alternatively, when we are cold, we may view the very same objects according to their ability to produce heat. Computational theories of learning in cognitive systems, such as reinforcement learning, use the notion of "state representation" to describe how agents decide which features of their environment are behaviorally relevant and which can be ignored. However, these approaches typically assume "ground-truth" state representations that are known by the agent, and reward functions that need to be learned. Here we suggest an alternative approach in which state representations are not assumed to be veridical, or even pre-defined, but rather emerge from the agent's goals through interaction with its environment. We illustrate this novel perspective by inferring the goals driving rat behavior in an odor-guided choice task and discuss its implications for developing, from first principles, an information-theoretic account of goal-directed state representation learning and behavior.