This paper studies the performance of equalizer zero-determinant (ZD) strategies in discounted repeated Stackelberg asymmetric games. In the leader-follower adversarial scenario, the strong Stackelberg equilibrium (SSE), which derives from the opponents' best response (BR), is technically the optimal strategy for the leader. However, computing an SSE strategy may be difficult, since it requires solving a mixed-integer program and has complexity exponential in the number of states. To this end, we propose adopting an equalizer ZD strategy, which can unilaterally restrict the opponent's expected utility. We first study the existence of an equalizer ZD strategy in one-to-one settings and analyze an upper bound on its performance relative to the baseline SSE strategy. We then turn to multi-player models in which one player adopts an equalizer ZD strategy. We give bounds on the sum of the opponents' utilities and compare it with that under the SSE strategy. Finally, we present simulations on unmanned aerial vehicles (UAVs) and moving target defense (MTD) to verify the effectiveness of our approach.