Artificial Intelligence (AI)-based Intrusion Detection Systems (IDS) deployed in energy infrastructure are vulnerable to model theft attacks, which allow adversaries to craft evasive traffic offline. Current defences against model extraction rely either on identity-bound query monitoring, which is ineffective against distributed (Sybil) attackers, or on prediction poisoning through soft-label perturbation, which is inapplicable to hard-label IDS deployments. We therefore propose FlowGuard, an identity-independent defence based on flow matching that flags incoming queries as out-of-distribution (OOD) before they reach the IDS. The approach exploits the fact that queries generated synthetically for data-free model stealing attacks occupy a lower-dimensional manifold than real network traffic, and therefore receive measurably lower log-likelihoods under a Continuous Normalizing Flow trained on legitimate data. We evaluate our method against PRADA and FDINet using MAZE and DisGUIDE attacks in single-client and distributed (100-client Sybil) settings. While PRADA's detection rate dropped to 0% when the distribution changed, our defence maintained a stable detection rate across both settings without relying on identity information. We discuss the scope and limitations of the approach and outline potential applications to data-dependent attacks.
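
The likelihood gate summarised above can be illustrated with a minimal sketch. It is not the authors' implementation: a diagonal Gaussian stands in for the Continuous Normalizing Flow, and the function names (`fit_diag_gaussian`, `calibrate_threshold`, `is_ood`), the 5% false-positive target, and the toy data are all assumptions made for illustration. The only idea carried over from the text is that each query is scored by its log-likelihood under a density model trained on legitimate traffic and rejected below a threshold, with no reference to client identity.

```python
import math
import random


def fit_diag_gaussian(samples):
    """Fit per-feature mean and variance (hypothetical stand-in for a trained flow)."""
    n, dim = len(samples), len(samples[0])
    mean = [sum(s[j] for s in samples) / n for j in range(dim)]
    var = [max(sum((s[j] - mean[j]) ** 2 for s in samples) / n, 1e-9)
           for j in range(dim)]
    return mean, var


def log_likelihood(x, model):
    """Log-density of x under the fitted diagonal Gaussian."""
    mean, var = model
    return sum(-0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))


def calibrate_threshold(benign_scores, target_fpr=0.05):
    """Pick tau so that roughly target_fpr of held-out benign scores fall below it."""
    ordered = sorted(benign_scores)
    return ordered[int(target_fpr * len(ordered))]


def is_ood(x, model, tau):
    """Flag a query as out-of-distribution; no client identity is consulted."""
    return log_likelihood(x, model) < tau


rng = random.Random(0)

# Toy "legitimate traffic": 4 features with independent variation.
benign = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(2000)]
model = fit_diag_gaussian(benign[:1000])

# Calibrate the threshold on held-out benign samples.
calib_scores = [log_likelihood(x, model) for x in benign[1000:]]
tau = calibrate_threshold(calib_scores, target_fpr=0.05)

# Toy "synthetic queries": a 1-D curve embedded in the 4-D feature space,
# placed away from the benign bulk (an assumption of this toy setup).
synthetic = [[3.0 + t, 3.0 - t, 3.0 + t, 3.0 - t]
             for t in (rng.uniform(-1, 1) for _ in range(200))]

flagged = sum(is_ood(q, model, tau) for q in synthetic)
print(f"threshold tau = {tau:.2f}, flagged {flagged} of {len(synthetic)} synthetic queries")
```

Because the decision depends only on the query itself, splitting the same queries across many clients does not change which ones are flagged in this sketch. A real deployment would replace the Gaussian with the trained flow's log-likelihood estimate.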