Training resource-constrained autonomous agents on multiple tasks simultaneously is crucial for adapting to diverse real-world environments. Recent works employ reinforcement learning (RL) approaches, but they still suffer from sub-optimal multi-task performance due to task interference. State-of-the-art works employ Spiking Neural Networks (SNNs) to improve RL-based multi-task learning and to enable low-power, energy-efficient operation through network enhancements and spike-driven data stream processing. However, they rely on fixed task-switching intervals during training, which limits their performance and scalability. To address this, we propose SwitchMT, a novel methodology that employs adaptive task-switching for effective, scalable, and simultaneous multi-task learning. SwitchMT employs the following key ideas: (1) leveraging a Deep Spiking Q-Network with active dendrites and a dueling structure that utilizes task-specific context signals to create specialized sub-networks; and (2) devising an adaptive task-switching policy that leverages both rewards and the internal dynamics of the network parameters. Experimental results demonstrate that SwitchMT achieves competitive scores in multiple Atari games (i.e., Pong: -8.8, Breakout: 5.6, and Enduro: 355.2) and longer game episodes compared to the state of the art. These results also highlight the effectiveness of the SwitchMT methodology in addressing task interference without increasing network complexity, enabling intelligent autonomous agents with scalable multi-task learning capabilities.
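To make the two key ideas more concrete, the following minimal sketch illustrates them under stated assumptions. The function `dueling_q_values` implements the standard dueling aggregation, Q(s, a) = V(s) + A(s, a) - mean(A), which is the usual meaning of a "dueling structure" in Q-networks. The function `should_switch_task` is purely hypothetical: the abstract states only that the switching policy uses rewards and the internal dynamics of the network parameters, so the plateau test, the use of update norms, the window size, and both thresholds are illustrative assumptions and not the SwitchMT rule itself.

```python
from statistics import mean


def dueling_q_values(state_value, advantages):
    """Standard dueling aggregation: Q(s, a) = V(s) + A(s, a) - mean(A)."""
    baseline = mean(advantages)
    return [state_value + a - baseline for a in advantages]


def should_switch_task(recent_rewards, recent_update_norms,
                       reward_tol=0.05, update_tol=0.01, window=3):
    """Hypothetical adaptive switching rule (an assumption, not the paper's).

    Switch to the next task only when both signals suggest that learning on
    the current task has stalled:
      * the mean episode reward over the last `window` episodes differs from
        the mean over the preceding `window` episodes by at most `reward_tol`
        (relative to the older mean, with a floor of 1.0), and
      * the mean parameter-update norm over the last `window` updates is at
        most `update_tol`.
    """
    # Not enough history yet: keep training on the current task.
    if len(recent_rewards) < 2 * window or len(recent_update_norms) < window:
        return False

    older = mean(recent_rewards[-2 * window:-window])
    newer = mean(recent_rewards[-window:])
    reward_plateaued = abs(newer - older) <= reward_tol * max(1.0, abs(older))

    params_settled = mean(recent_update_norms[-window:]) <= update_tol
    return reward_plateaued and params_settled
```

In contrast to a fixed switching interval, a rule of this kind lets the time spent on each task vary with how quickly that task is being learned; the specific signals and thresholds would need to be tuned for each setting.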