Increasing connectivity and intricate remote-access environments have left traditional perimeter-based network defenses vulnerable. Zero trust has emerged as a promising approach that derives defense policies from agent-centric trust evaluation. However, limited observations of an agent's trace introduce information asymmetry into the decision-making. To facilitate human understanding of the policy and adoption of the technology, a zero-trust defense must be explainable to humans and adaptable to different attack scenarios. To this end, we propose a scenario-agnostic zero-trust defense based on Partially Observable Markov Decision Processes (POMDPs) and first-order meta-learning that uses only a handful of sample scenarios. The framework yields an explainable and generalizable trust-threshold defense policy. To address the distribution shift between empirical security datasets and reality, we extend the model to a robust zero-trust defense that minimizes the worst-case loss. We corroborate the results with case studies and real-world attacks.
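To make the notion of a trust-threshold policy concrete, the following is a minimal sketch of how a belief-based trust score and a threshold rule could interact. It is an illustration under simplifying assumptions, not the model proposed here: the hidden agent type is taken to be static (legitimate or malicious), observations are binary, and the function names, likelihood values, and threshold are all hypothetical.

```python
from typing import List, Sequence, Tuple


def update_trust(
    trust: float,
    obs: int,
    p_obs_legit: Sequence[float],
    p_obs_malicious: Sequence[float],
) -> float:
    """Bayesian update of trust = P(agent is legitimate | observations so far).

    Assumes a static hidden type; a full POMDP would also apply a
    state-transition step before conditioning on the observation.
    """
    numerator = trust * p_obs_legit[obs]
    denominator = numerator + (1.0 - trust) * p_obs_malicious[obs]
    # If the observation has zero probability under both types, keep the prior.
    return numerator / denominator if denominator > 0 else trust


def threshold_policy(trust: float, threshold: float) -> str:
    """Allow access only while trust stays at or above the threshold."""
    return "allow" if trust >= threshold else "deny"


def run_episode(
    observations: Sequence[int],
    prior: float,
    threshold: float,
    p_obs_legit: Sequence[float],
    p_obs_malicious: Sequence[float],
) -> List[Tuple[float, str]]:
    """Return the (trust, action) pair after each observation."""
    trust = prior
    history = []
    for obs in observations:
        trust = update_trust(trust, obs, p_obs_legit, p_obs_malicious)
        history.append((trust, threshold_policy(trust, threshold)))
    return history


if __name__ == "__main__":
    # Hypothetical likelihoods: index 0 = benign-looking, 1 = suspicious.
    P_LEGIT = [0.9, 0.1]
    P_MALICIOUS = [0.4, 0.6]
    for trust, action in run_episode([0, 1, 1], 0.5, 0.3, P_LEGIT, P_MALICIOUS):
        print(f"trust={trust:.3f} action={action}")
```

Such a policy is easy to explain because the decision depends on a single scalar compared against a single cutoff; in this sketch the threshold is fixed by hand, whereas a learned policy would adapt it to the scenario.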