In the rapid evolution of Cyber-Physical Systems (CPSs) within intelligent environments, especially in the industrial domain shaped by Industry 4.0, this surge in development brings unprecedented security challenges. This paper investigates the security issues of Industrial CPSs (ICPSs), focusing on the distinctive threats posed by intelligent attackers capable of directly compromising a controller and thereby endangering physical safety. Within the framework of hierarchical control and incentive feedback Stackelberg games, we design a resilient leading controller (leader) that adapts to a compromised following controller (follower) so that the compromised follower acts cooperatively with the leader, aligning its strategy with the leader's objective to achieve a team-optimal solution. First, we establish sufficient conditions for the existence of an incentive Stackelberg solution when the system dynamics are known. We then propose a Q-learning-based Approximate Dynamic Programming (ADP) approach, with corresponding algorithms, for computing the incentive Stackelberg solution online without prior knowledge of the system dynamics. Finally, we prove that our approach converges to the optimum.
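The abstract does not specify the game matrices or algorithmic details. As a rough, hypothetical illustration of the model-free Q-learning ADP idea it refers to, the sketch below runs Q-learning policy iteration for a single-controller discrete-time linear-quadratic regulator: the dynamics matrices `A` and `B` are used only to generate data, never by the learner, and all numerical values are made up for illustration (this is not the paper's two-controller incentive Stackelberg formulation).

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stable 2-state, 1-input system; used ONLY to generate samples.
A = np.array([[0.9, 0.1], [0.0, 0.8]])
B = np.array([[0.0], [0.1]])
Qc = np.eye(2)          # state cost weight
Rc = np.array([[1.0]])  # input cost weight
n, m = 2, 1

def features(x, u):
    """Quadratic basis for Q(x,u) = z'Hz, z = [x; u]: upper triangle of zz',
    off-diagonal terms doubled so theta maps directly to symmetric H."""
    z = np.concatenate([x, u])
    zz = np.outer(z, z)
    i, j = np.triu_indices(n + m)
    return zz[i, j] * np.where(i == j, 1.0, 2.0)

K = np.zeros((m, n))  # initial stabilizing gain (A itself is stable here)

for _ in range(20):
    # Policy evaluation by least squares on the Bellman identity
    #   Q(x,u) = x'Qc x + u'Rc u + Q(x', -K x'),  with exploratory inputs.
    Phi, y = [], []
    for _ in range(200):
        x = rng.standard_normal(n)
        u = -K @ x + 0.5 * rng.standard_normal(m)  # exploration noise
        xn = A @ x + B @ u                         # data from the "unknown" plant
        Phi.append(features(x, u) - features(xn, -K @ xn))
        y.append(x @ Qc @ x + u @ Rc @ u)
    theta, *_ = np.linalg.lstsq(np.array(Phi), np.array(y), rcond=None)

    # Reassemble the symmetric Q-function matrix H from theta.
    Hu = np.zeros((n + m, n + m))
    Hu[np.triu_indices(n + m)] = theta
    H = Hu + Hu.T - np.diag(np.diag(Hu))

    # Policy improvement: u = -H_uu^{-1} H_ux x.
    K_new = np.linalg.solve(H[n:, n:], H[n:, :n])
    if np.linalg.norm(K_new - K) < 1e-6:
        K = K_new
        break
    K = K_new
```

Since the transitions here are deterministic, each least-squares step recovers the current policy's Q-function exactly, so the iteration behaves like exact policy iteration and the learned gain `K` matches the Riccati-optimal feedback without the learner ever reading `A` or `B`.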