This paper studies the optimal allocation of detection resources (sensors) to mitigate multi-stage attacks when the defender is uncertain about the attacker's intention. We model the attack planning problem as a Markov decision process and capture the uncertainty in the attacker's intention with a finite set of reward functions, each representing one attacker type. Building on this model, we adopt worst-case absolute regret minimization from robust game theory and develop mixed-integer linear program (MILP) formulations that compute worst-case-regret-minimizing sensor allocation strategies for two classes of attacker-defender interactions: one in which the defender and attacker play a zero-sum game, and one in which they play a non-zero-sum game. We demonstrate the effectiveness of the framework on a stochastic gridworld example.
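To make the worst-case absolute regret criterion concrete, the following minimal sketch enumerates a handful of candidate sensor allocations and picks the one whose regret, maximized over attacker types, is smallest. It is an illustration only and not the MILP formulation developed in the paper: the function name `minimax_regret`, the allocation labels, and the utility numbers are all hypothetical, and the defender's utility for each allocation and type is assumed to be already known (in the paper's setting it would come from solving the attacker's Markov decision process).

```python
from typing import Dict, List, Tuple


def minimax_regret(utility: Dict[str, List[float]]) -> Tuple[str, float]:
    """Pick the allocation with the smallest worst-case absolute regret.

    utility[a][t] is the (assumed, precomputed) defender utility when the
    defender uses allocation a and the attacker is of type t.
    """
    allocations = list(utility)
    num_types = len(next(iter(utility.values())))

    # Best defender utility attainable against each type if the type were known.
    best_per_type = [
        max(utility[a][t] for a in allocations) for t in range(num_types)
    ]

    # Absolute regret of allocation a against type t is the gap to that best
    # value; the worst case takes the maximum gap over all types.
    worst_regret = {
        a: max(best_per_type[t] - utility[a][t] for t in range(num_types))
        for a in allocations
    }

    best = min(allocations, key=lambda a: worst_regret[a])
    return best, worst_regret[best]


# Hypothetical utilities for three allocations against two attacker types.
utility = {
    "sensors_at_A": [0.9, 0.2],
    "sensors_at_B": [0.3, 0.8],
    "split_A_B": [0.7, 0.6],
}
choice, regret = minimax_regret(utility)
print(choice, round(regret, 3))
```

With these made-up numbers, each single-site allocation is optimal against one type but costly against the other, so the split allocation has the smallest worst-case regret. Brute-force enumeration like this only works for a small, explicit set of allocations; a MILP is one way to search a combinatorial allocation space without listing every candidate.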