Distributed online learning in Internet of Things (IoT)-enabled multi-agent systems (MASs) is highly vulnerable to persistent adversarial interactions, particularly when malicious agents cannot be fully isolated during the transient learning stage. Existing resilient learning methods focus mainly on preserving convergence or suppressing malicious behavior, while the evolution inefficiency caused by repeated corrective adaptation remains largely unexplored. To address this issue, this paper develops a cost-aware distributed online learning framework with a strict rejection behavior against adversarial agents. The proposed mechanism suppresses harmful assimilation of suspicious neighboring information. It also reveals a previously overlooked side effect: strict rejection may induce heterogeneous transient evolution among neighboring normal agents, leading to evolution desynchronization across the network. To mitigate this effect, a two-time-scale adaptive evolution regulation architecture is further developed, in which the outer layer dynamically adjusts the long-term evolution-rate schedule while the inner layer preserves robust online learning. Theoretical analysis establishes the dynamic tracking property of the outer-layer update and proves that the proposed regulation mechanism attenuates the propagation of strict-rejection-induced evolution desynchronization. Numerical simulations and a satellite-assisted IoT monitoring scenario demonstrate that the proposed method achieves robust and low-cost distributed online learning under persistent malicious interference.