This paper presents a novel online transfer learning approach in state-based potential games (TL-SbPGs) for distributed self-optimization in manufacturing systems. The approach targets practical industrial scenarios where knowledge sharing among similar players enhances learning in large-scale and decentralized environments. TL-SbPGs enable players to reuse learned policies from others, which improves learning outcomes and accelerates convergence. To accomplish this goal, we develop transfer learning concepts and similarity criteria for players, which offer two distinct settings: (a) predefined similarities between players and (b) dynamically inferred similarities between players during training. The applicability of the SbPG framework to transfer learning is formally established. Furthermore, we present a method to optimize the timing and weighting of knowledge transfer. Experimental results from a laboratory-scale testbed show that TL-SbPGs improve production efficiency and reduce power consumption compared to vanilla SbPGs.