Configuring network access control policies in large, complex networks is error-prone and requires significant expert effort. LLMs offer a promising interface for expressing such policies in natural language, but how well they translate user requests into access policies, and which system architectures best exploit them, remain underexplored. We present an architecture for natural-language access control (NLAC) that uses LLMs to translate user requests into access policies, and introduce NLACBench, a benchmark for evaluating LLM-based intent translation systems in large-scale networks. Our evaluation across multiple state-of-the-art models shows that top-performing LLMs achieve up to 96.9% accuracy in small-network settings, but accuracy degrades substantially as network size increases, falling below 20% for some models. To address this limitation, we identify relevant network components via embedding similarity and construct compact subgraphs from them that are passed to the LLM. This approach scales to larger networks with up to 98.7% accuracy, while also reducing inference time, hardware requirements, and operating costs to a constant resource budget. Finally, a case study indicates that top-performing models exhibit largely complementary error patterns, suggesting that intent translation accuracy may be further improved through multi-LLM architectures.
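The subgraph-selection step described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the embedding model, the ranking rule, or how the subgraph is built, so everything below is an assumption made for illustration: `embed` is a toy bag-of-words stand-in for a real embedding model, `select_subgraph`, `top_k`, and `hops` are hypothetical names, and the network is invented example data.

```python
import math
from collections import Counter


def embed(text):
    """Toy bag-of-words embedding (assumption: stand-in for a real embedding model)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def select_subgraph(request, nodes, edges, top_k=2, hops=1):
    """Pick the top_k nodes most similar to the request, then expand by `hops`.

    nodes: dict mapping node name -> free-text description
    edges: list of undirected (u, v) pairs
    Returns (selected_node_names, induced_edges).
    """
    query = embed(request)
    ranked = sorted(nodes, key=lambda n: cosine(query, embed(nodes[n])), reverse=True)
    selected = set(ranked[:top_k])

    # Expand the seed set so that connecting devices are included.
    for _ in range(hops):
        frontier = set()
        for u, v in edges:
            if u in selected:
                frontier.add(v)
            if v in selected:
                frontier.add(u)
        selected |= frontier

    # Keep only edges whose endpoints both survive (the induced subgraph).
    induced = [(u, v) for u, v in edges if u in selected and v in selected]
    return selected, induced


# Invented example network (not taken from NLACBench).
nodes = {
    "hr-laptop": "hr department laptop",
    "payroll-db": "payroll database server",
    "core-switch": "core switch",
    "guest-wifi": "guest wifi access point",
    "printer": "office printer",
}
edges = [
    ("hr-laptop", "core-switch"),
    ("payroll-db", "core-switch"),
    ("guest-wifi", "core-switch"),
    ("printer", "guest-wifi"),
]

sub_nodes, sub_edges = select_subgraph(
    "allow the hr laptop to reach the payroll database", nodes, edges
)
print(sorted(sub_nodes))
print(sub_edges)
```

In this sketch the size of the subgraph handed to the LLM depends on `top_k`, `hops`, and local connectivity rather than on the total number of nodes, which is one plausible reading of how a constant resource budget could be maintained as the network grows.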