AMDS: Attack-Aware Multi-Stage Defense System for Network Intrusion Detection with Two-Stage Adaptive Weight Learning

Machine learning-based network intrusion detection systems are vulnerable to adversarial attacks that degrade classification performance under both gradient-based and distribution-shift threat models. Existing defenses typically apply uniform detection strategies, which may not account for heterogeneous attack characteristics. This paper proposes an attack-aware multi-stage defense framework that learns attack-specific detection strategies through a weighted combination of ensemble disagreement, predictive uncertainty, and distributional anomaly signals. Empirical analysis across seven adversarial attack types reveals distinct detection signatures, enabling a two-stage adaptive detection mechanism. Experimental evaluation on a benchmark intrusion detection dataset indicates that the proposed system attains an area under the receiver operating characteristic curve of 94.2% and improves classification accuracy by 4.5 percentage points and F1-score by 9.0 points over adversarially trained ensembles. Under adaptive white-box attacks with full architectural knowledge, the system appears to maintain 94.4% accuracy with a 4.2% attack success rate, though this evaluation is limited to two adaptive variants and does not constitute a formal robustness guarantee. Cross-dataset validation further suggests that defense effectiveness depends on baseline classifier competence and may vary with feature dimensionality. Taken together, these results indicate that attack-specific optimization combined with multi-signal integration can provide a practical approach to improving adversarial robustness in machine learning-based intrusion detection systems.
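
The weighted multi-signal, two-stage idea described above can be illustrated with a minimal sketch. It is not the paper's implementation: the profile names, signature values, weights, and thresholds are hypothetical placeholders, and selecting attack-specific weights by nearest signature is an assumption made only for illustration. Each sample is assumed to carry three signals scaled to [0, 1]: ensemble disagreement, predictive uncertainty, and distributional anomaly.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Assumed signal order: (ensemble disagreement, predictive uncertainty,
# distributional anomaly), each scaled to [0, 1].
Signals = Tuple[float, float, float]


@dataclass(frozen=True)
class AttackProfile:
    """Hypothetical per-attack detection profile (all values illustrative)."""
    name: str
    signature: Signals  # assumed typical signal values for this attack family
    weights: Signals    # assumed learned signal weights, summing to 1


def weighted_score(signals: Signals, weights: Signals) -> float:
    """Weighted combination of the three detection signals."""
    return sum(s * w for s, w in zip(signals, weights))


def nearest_profile(signals: Signals, profiles: List[AttackProfile]) -> AttackProfile:
    """Pick the profile whose signature is closest in squared Euclidean distance."""
    return min(
        profiles,
        key=lambda p: sum((s - c) ** 2 for s, c in zip(signals, p.signature)),
    )


def detect(
    signals: Signals,
    profiles: List[AttackProfile],
    coarse_threshold: float = 0.3,  # assumed value
    final_threshold: float = 0.5,   # assumed value
) -> Tuple[bool, Optional[str], float]:
    """Return (flagged, matched profile name or None, score used for the decision)."""
    # Stage 1: cheap screen with uniform weights.
    coarse = weighted_score(signals, (1 / 3, 1 / 3, 1 / 3))
    if coarse < coarse_threshold:
        return False, None, coarse
    # Stage 2: re-score with the weights of the closest attack profile.
    profile = nearest_profile(signals, profiles)
    score = weighted_score(signals, profile.weights)
    return score >= final_threshold, profile.name, score


# Illustrative profiles only; not values reported in the paper.
PROFILES = [
    AttackProfile("gradient_like", (0.8, 0.6, 0.2), (0.6, 0.3, 0.1)),
    AttackProfile("shift_like", (0.2, 0.3, 0.9), (0.1, 0.2, 0.7)),
]

if __name__ == "__main__":
    for sample in [(0.1, 0.1, 0.1), (0.8, 0.6, 0.2), (0.2, 0.3, 0.9)]:
        print(sample, detect(sample, PROFILES))
```

In this sketch the first stage filters out samples whose combined signal is low, and only the remainder are re-scored with attack-specific weights, which is one plausible way to let different attack families emphasize different signals.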
