Machine learning, statistical, and knowledge-based methods are often used to implement anomaly-based Intrusion Detection Systems, software that helps detect malicious and undesired activity in a network, primarily over the Internet. Machine learning comprises supervised, semi-supervised, and unsupervised learning algorithms. Supervised machine learning trains on a labelled dataset. This paper evaluates four supervised learning algorithms, Random Forest, XGBoost, K-Nearest Neighbours, and Artificial Neural Network, on a public dataset. Measured by prediction accuracy, Random Forest performs best on multi-class intrusion detection, followed by XGBoost and K-Nearest Neighbours, respectively. Measured by training time, K-Nearest Neighbours is the best performer. The paper concludes that Random Forest is the best supervised machine learning algorithm for Intrusion Detection Systems.
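The two criteria named in the abstract, prediction accuracy and training time, can be measured with a small harness like the one below. This is only a minimal sketch and not the paper's code: `TinyKNN`, `evaluate`, and `make_blobs` are hypothetical names introduced here, the data is synthetic rather than the public dataset the paper uses, and the pure-Python k-nearest-neighbours class stands in for the four algorithms actually compared.

```python
import random
import time
from collections import Counter


class TinyKNN:
    """Minimal k-nearest-neighbours classifier (illustrative stand-in only)."""

    def __init__(self, k=3):
        self.k = k

    def fit(self, X, y):
        # k-NN is a lazy learner: "training" only stores the data,
        # so its training time is typically very small.
        self.X = list(X)
        self.y = list(y)
        return self

    def predict(self, X):
        preds = []
        for q in X:
            # Rank training points by squared Euclidean distance to the query.
            nearest = sorted(
                range(len(self.X)),
                key=lambda i: sum((a - b) ** 2 for a, b in zip(self.X[i], q)),
            )[: self.k]
            # Majority vote among the k nearest labels.
            votes = Counter(self.y[i] for i in nearest)
            preds.append(votes.most_common(1)[0][0])
        return preds


def evaluate(model, X_train, y_train, X_test, y_test):
    """Return (accuracy, training_seconds) for any model with fit/predict."""
    start = time.perf_counter()
    model.fit(X_train, y_train)
    train_seconds = time.perf_counter() - start
    preds = model.predict(X_test)
    correct = sum(p == t for p, t in zip(preds, y_test))
    return correct / len(y_test), train_seconds


def make_blobs(n_per_class, centers, seed=0):
    """Synthetic multi-class data: one Gaussian cluster per class (assumption)."""
    rng = random.Random(seed)
    X, y = [], []
    for label, (cx, cy) in enumerate(centers):
        for _ in range(n_per_class):
            X.append((rng.gauss(cx, 0.5), rng.gauss(cy, 0.5)))
            y.append(label)
    return X, y


# Three well-separated classes stand in for a multi-class intrusion dataset.
centers = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
X_train, y_train = make_blobs(50, centers, seed=1)
X_test, y_test = make_blobs(20, centers, seed=2)
accuracy, train_seconds = evaluate(TinyKNN(k=3), X_train, y_train, X_test, y_test)
print(f"accuracy={accuracy:.3f} train_time={train_seconds:.6f}s")
```

Because `evaluate` relies only on `fit` and `predict`, any classifier exposing those two methods could be passed in the same way, so several models can be ranked on both criteria with one loop. Timing only the `fit` call keeps training time separate from prediction time, which matters for k-NN because most of its work happens at prediction.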