Detecting cyberattacks in critical infrastructure and supply chains has become challenging in Industry 4.0. Intrusion Detection Systems (IDS) are deployed to counter these attacks. However, an IDS detects attacks effectively only when they match known signatures and patterns, so zero-day attacks go undetected. To overcome this drawback, the integration of a Dense Neural Network (DNN) with data augmentation is proposed. This makes the IDS intelligent and enables it to self-learn with high accuracy when it encounters a novel attack. Network flow capture datasets are highly imbalanced, like the real network itself, so data augmentation plays a crucial role in balancing the data. Balancing is challenging because the minority class accounts for as little as 0.000004\% of the dataset, while the majority class accounts for more than 80\%. The Synthetic Minority Oversampling Technique (SMOTE) is used to balance the data. Although higher accuracies are achieved on balanced test data, noticeably lower accuracies are obtained on the original imbalanced test data, which suggests overfitting. A comparison with state-of-the-art research that uses SMOTE with Edited Nearest Neighbor shows that classification remains poor on the original dataset. This suggests that highly imbalanced network flow datasets require a different method of data augmentation.
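The oversampling step named in the abstract, the Synthetic Minority Oversampling Technique, can be illustrated with a minimal sketch. The code below is not the authors' pipeline: the function name `smote_sketch`, the toy two-feature data, and the choice of `k` are assumptions made purely for illustration, and only the Python standard library is used. It creates each synthetic sample by interpolating between a minority sample and one of its nearest minority neighbours.

```python
import random


def smote_sketch(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points, each interpolated between a minority
    sample and one of its k nearest minority neighbours (SMOTE-style)."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by squared Euclidean distance to base.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


# Hypothetical toy data: 100 majority flows and only 3 minority flows.
majority = [[float(i), float(i % 7)] for i in range(100)]
minority = [[200.0, 1.0], [202.0, 3.0], [204.0, 2.0]]

# Oversample the minority class until both classes have the same size.
extra = smote_sketch(minority, n_new=len(majority) - len(minority), k=2)
balanced_minority = minority + extra
```

Every synthetic point lies on a line segment between two existing minority samples, so when a class has only a handful of real samples the augmented data adds little new variation. That is one plausible reading of why results on balanced test data can look better than results on the original imbalanced data, although the abstract itself only reports the observation.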