Human behavior is motivated by needs, whose influence depends on their intensity and on context. We also form preferences tied to the perceived pleasure of each action, and these preferences can change over time. This makes decision-making more complex, since an agent must learn to balance needs and preferences according to context. To understand how this process works and to enable the development of robots with a motivation-based learning model, we computationally model the motivation theory proposed by Hull. In this model, the agent (an abstraction of a mobile robot) is motivated to keep itself in a state of homeostasis. We add hedonic dimensions to examine how preferences affect decision-making, and we use reinforcement learning to train our motivation-based agents. We run three agents, whose energy decay rates represent different metabolisms, in two different environments to observe the impact on their strategy, movement, and behavior. The results show that the agents learned better strategies in the environment that offers choices better suited to their metabolism. Including pleasure in the motivational mechanism significantly affected behavior learning, mainly for slow-metabolism agents. When survival is at risk, the agent ignores pleasure and equilibrium, which hints at how to behave in harsh scenarios.