An intrusion detection system (IDS) is essential for defending against malicious activity. IDSs are commonly enhanced with machine learning approaches, but model efficiency degrades as the number of headers (or features) in each packet (each record) grows. The proposed model extracts useful features using non-negative matrix factorization (NMF) and chi-square analysis. A larger number of features increases computation time exponentially and raises the risk of overfitting. By combining both techniques, the proposed model forms a hierarchical approach that reduces the quadratic error and noise in the features. The proposed model is evaluated on three publicly available datasets and shows significant improvement. Relative to recent work, the proposed model improves performance by 4.66% and 0.39% on NSL-KDD and CIC-IDS2017, respectively.
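The two-stage idea of combining chi-square analysis with NMF can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not state the order of the two stages, the number of features kept, or the NMF rank, so the ordering (chi-square selection first, then NMF), the helper names `chi2_scores` and `nmf`, the values `10` and `4`, and the synthetic data standing in for a real dataset are all assumptions made for illustration.

```python
import numpy as np


def chi2_scores(X, y):
    """Chi-square statistic of each non-negative feature against the class labels."""
    classes = np.unique(y)
    Y = (y[:, None] == classes[None, :]).astype(float)  # one-hot labels, shape (n, c)
    observed = Y.T @ X                                   # per-class feature sums, (c, d)
    class_prob = Y.mean(axis=0)                          # class frequencies, (c,)
    feature_total = X.sum(axis=0)                        # total mass per feature, (d,)
    expected = np.outer(class_prob, feature_total)       # sums expected under independence
    with np.errstate(divide="ignore", invalid="ignore"):
        terms = (observed - expected) ** 2 / expected
    return np.nan_to_num(terms).sum(axis=0)


def nmf(X, n_components, n_iter=200, seed=0, eps=1e-9):
    """Factor X ~ W @ H with W, H >= 0 using multiplicative updates (squared error)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W = rng.random((n, n_components)) + eps
    H = rng.random((n_components, d)) + eps
    for _ in range(n_iter):
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H


# Synthetic stand-in for traffic records (hypothetical data, not NSL-KDD).
rng = np.random.default_rng(42)
n_samples, n_features = 300, 20
y = rng.integers(0, 2, size=n_samples)
X = rng.random((n_samples, n_features))
X[:, :5] += y[:, None] * 1.5  # only the first five features carry class signal

# Both stages need non-negative input, so scale every feature to [0, 1].
X = (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0))

# Stage 1 (assumed order): keep the 10 features with the highest chi-square score.
scores = chi2_scores(X, y)
top_k = np.argsort(scores)[::-1][:10]
X_selected = X[:, top_k]

# Stage 2: compress the selected features into 4 non-negative components.
W, H = nmf(X_selected, n_components=4)
rel_error = np.linalg.norm(X_selected - W @ H) / np.linalg.norm(X_selected)
print("selected feature indices:", sorted(top_k.tolist()))
print("reduced shape:", W.shape, "relative reconstruction error:", round(rel_error, 3))
```

In this sketch the rows of `W` would serve as the reduced feature vectors passed to a downstream classifier. In a real evaluation, the scaling bounds, the chi-square ranking, and the factor `H` would be fitted on the training split only and then applied to the test split.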