This work introduces a novel training paradigm that draws on affective neuroscience. Inspired by the interplay of emotion and cognition in the human brain, and specifically by the SEEKING motivational state, we design a dual-model framework in which a smaller base model is trained continuously while a larger motivated model is activated intermittently under predefined "motivation conditions". The framework mimics the emotional state of high curiosity and anticipation of reward, in which broader brain regions are recruited to enhance cognitive performance. Exploiting scalable architectures in which larger models extend smaller ones, our method enables shared weight updates and selective expansion of network capacity at noteworthy training steps. Empirical evaluation on image classification demonstrates that the alternating training scheme not only enhances the base model more efficiently and effectively than a traditional scheme; in some cases, the motivated model also surpasses its standalone counterpart despite seeing less data per epoch. This opens the possibility of simultaneously training two models tailored to different deployment constraints, with competitive or superior performance, while keeping training cost below that of training the larger model alone.
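To make the alternating scheme concrete, here is a minimal toy sketch, not the paper's actual implementation: all sizes, the linear models, and the trigger rule (activating the motivated model when the base model's batch loss exceeds a threshold, as a crude "surprise" proxy) are illustrative assumptions. The base model is a single shared weight matrix trained every step; the motivated model reuses those shared weights and adds an expansion layer, so its updates also flow back into the base.

```python
import numpy as np

# Hypothetical sketch of the dual-model alternating scheme; the trigger
# condition, architecture, and hyperparameters are assumptions for
# illustration, not the paper's method.
rng = np.random.default_rng(0)
dim, n_classes = 8, 3

W_base = rng.normal(0.0, 0.1, (dim, n_classes))         # shared weights (base model)
W_extra = rng.normal(0.0, 0.1, (n_classes, n_classes))  # motivated-only extra capacity

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def xent(p, y):
    # mean cross-entropy of predicted probabilities p against labels y
    return -np.log(p[np.arange(len(y)), y]).mean()

def train_step(x, y, lr=0.5, threshold=1.0):
    """One step: the base model always updates; the motivated model
    (shared base weights + expansion layer) fires only when the base
    loss on the batch exceeds the threshold ("motivation condition")."""
    global W_base, W_extra
    onehot = np.eye(n_classes)[y]

    # --- base model: trained on every step ---
    p = softmax(x @ W_base)
    loss = xent(p, y)
    W_base -= lr * x.T @ (p - onehot) / len(y)

    # --- motivation condition: recruit the larger model ---
    if loss > threshold:
        h = x @ W_base
        q = softmax(h @ W_extra)
        d = (q - onehot) / len(y)
        grad_extra = h.T @ d
        W_base -= lr * x.T @ (d @ W_extra.T)  # shared-weight update flows back
        W_extra -= lr * grad_extra
    return loss

# Toy separable data: class is the argmax of the first 3 features.
X = rng.normal(size=(256, dim))
Y = X[:, :n_classes].argmax(axis=1)

losses = [train_step(X, Y) for _ in range(200)]
```

Because `W_base` is shared, the motivated model's occasional activations refine the base model's weights without requiring the larger model on every step, which is the source of the claimed training-cost savings.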