Federated Learning (FL) has recently received increasing attention from the cybersecurity community as a way to collaboratively train deep learning models on distributed profiles of cyber threats without disclosing the training data. Nevertheless, the adoption of FL in cybersecurity is still in its infancy, and a range of practical aspects have not yet been properly addressed. In particular, the Federated Averaging algorithm at the core of the FL concept requires the availability of test data to control the FL process. Although this might be feasible in some domains, test network traffic of newly discovered attacks cannot always be shared without disclosing sensitive information. In this paper, we address the convergence of the FL process in dynamic cybersecurity scenarios, where the trained model must be frequently updated with recent attack profiles to empower all members of the federation with the latest detection features. To this aim, we propose FLAD (adaptive Federated Learning Approach to DDoS attack detection), an FL solution for cybersecurity applications based on an adaptive mechanism that orchestrates the FL process by dynamically assigning more computation to those members whose attack profiles are harder to learn, without the need to share any test data to monitor the performance of the trained model. Using a recent dataset of DDoS attacks, we demonstrate that FLAD outperforms state-of-the-art FL algorithms in terms of convergence time and accuracy across a range of unbalanced datasets of heterogeneous DDoS attacks. We also show the robustness of our approach in a realistic scenario, where we retrain the deep learning model multiple times to introduce the profiles of new attacks into a pre-trained model.
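The abstract does not spell out how computation is assigned to federation members, so the following is only a minimal sketch of the general idea, not FLAD's actual rule. It assumes that each member evaluates the global model on its own private validation data and reports a single scalar score to the server, so that no test traffic leaves the member. The function name `assign_local_epochs`, the parameters `min_epochs` and `max_epochs`, and the linear scaling are all hypothetical choices made for illustration.

```python
from typing import Dict


def assign_local_epochs(
    client_scores: Dict[str, float],
    min_epochs: int = 1,
    max_epochs: int = 5,
) -> Dict[str, int]:
    """Give more local training epochs to clients with lower scores.

    ASSUMPTION: `client_scores` holds one scalar per client (for example
    an F1 score measured on that client's private validation set).
    ASSUMPTION: linear scaling between `min_epochs` and `max_epochs`;
    this is an illustrative rule, not the one defined in the paper.
    """
    if min_epochs > max_epochs:
        raise ValueError("min_epochs must not exceed max_epochs")
    if not client_scores:
        return {}

    best = max(client_scores.values())
    worst = min(client_scores.values())
    spread = best - worst

    epochs: Dict[str, int] = {}
    for client, score in client_scores.items():
        if spread == 0:
            # All clients perform equally: nobody needs extra computation.
            epochs[client] = min_epochs
        else:
            # difficulty is 0.0 for the best client and 1.0 for the worst.
            difficulty = (best - score) / spread
            extra = round(difficulty * (max_epochs - min_epochs))
            epochs[client] = min_epochs + extra
    return epochs


if __name__ == "__main__":
    # Hypothetical scores for three members holding different attack types.
    scores = {"member_a": 0.9, "member_b": 0.5, "member_c": 0.7}
    print(assign_local_epochs(scores))
```

In this sketch the member with the lowest score receives `max_epochs` and the member with the highest score receives `min_epochs`; a real orchestration loop would recompute the allocation after every aggregation round.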