With rapid technological growth, security attacks are increasing drastically. In many critical Internet-of-Things (IoT) applications, such as healthcare and defense, early detection of security attacks plays a significant role in protecting large-scale resources. Intrusion detection systems are used to address this problem. Signature-based approaches fail to detect zero-day attacks, so anomaly-based detection, particularly with AI tools, is becoming popular. In addition, imbalanced datasets lead to biased results. In Machine Learning (ML) models, the F1 score is an important metric for measuring how well each class is predicted. If the F1 score is considerably low, the model may fail to detect the target samples, which can have unrecoverable consequences in sensitive applications such as healthcare and defense. Any improvement in the F1 score therefore has a significant impact on resource protection. In this paper, we present a framework for an ML-based intrusion detection system for imbalanced datasets. The study uses the recent CICIoT2023 dataset, and the proposed framework uses the random forest (RF) algorithm. Compared with the existing method, the proposed approach improves precision, recall, and F1 score by 3.72%, 3.75%, and 4.69%, respectively. Additionally, for unsaturated classes (i.e., classes with F1 score < 0.99), the F1 score improves by 7.9%. As a result, the proposed approach is better suited to IoT security applications that require efficient intrusion detection, and it is useful for further studies.
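To illustrate why the per-class F1 score matters on imbalanced data, the following is a minimal sketch in plain Python. The labels, class sizes, and the helper names `per_class_scores` and `macro_f1` are hypothetical and chosen only for illustration; they are not taken from the paper or from CICIoT2023. The sketch shows a classifier that reaches high overall accuracy while its F1 score on the rare attack class stays low.

```python
from collections import Counter


def per_class_scores(y_true, y_pred):
    """Return {label: (precision, recall, f1)} for every label seen."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1          # correct prediction for class t
        else:
            fp[p] += 1          # p was predicted but is wrong
            fn[t] += 1          # t was the true class but was missed
    scores = {}
    for c in labels:
        predicted = tp[c] + fp[c]
        actual = tp[c] + fn[c]
        precision = tp[c] / predicted if predicted else 0.0
        recall = tp[c] / actual if actual else 0.0
        total = precision + recall
        f1 = 2 * precision * recall / total if total else 0.0
        scores[c] = (precision, recall, f1)
    return scores


def macro_f1(scores):
    """Unweighted mean of the per-class F1 scores."""
    return sum(f1 for _, _, f1 in scores.values()) / len(scores)


# Hypothetical toy data: 95 benign samples and 5 attack samples.
# The assumed classifier detects only 1 of the 5 attacks.
y_true = ["benign"] * 95 + ["attack"] * 5
y_pred = ["benign"] * 95 + ["attack"] * 1 + ["benign"] * 4

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
scores = per_class_scores(y_true, y_pred)

print(f"accuracy = {accuracy:.2f}")
for label, (precision, recall, f1) in scores.items():
    print(f"{label}: precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
print(f"macro F1 = {macro_f1(scores):.3f}")
```

In this toy case the accuracy is 0.96, yet the attack class has a recall of 0.2 and an F1 score of about 0.33, which is the kind of gap that accuracy alone hides on imbalanced data.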