Flow-based Detection of Botnets through Bio-inspired Optimisation of Machine Learning

Botnets can autonomously infect, propagate, communicate and coordinate with other members of the botnet, enabling cybercriminals to exploit the cumulative computing power and bandwidth of their bots to facilitate cybercrime. Traditional detection methods are becoming increasingly unsuitable against various network-based detection evasion methods. These evasion techniques ultimately render signature-based fingerprinting detection infeasible. This research therefore explores the application of network flow-based behavioural modelling to the binary classification of bot network activity, so that detection is independent of the underlying communications architectures, ports, protocols and payload-based detection evasion mechanisms. A comparative evaluation of various machine learning classification methods is conducted to determine the average accuracy of each classifier on botnet datasets such as CTU-13, ISOT 2010 and ISCX 2014. Additionally, hyperparameter tuning with a Genetic Algorithm (GA) was performed, aiming to converge efficiently to the fittest hyperparameter set for each dataset. The bio-inspired optimisation of Random Forest (RF) with GA achieved an average accuracy of 99.85% when tested against the three datasets. The model was then developed into a software product. A video of the project and a demo of the software are available on YouTube: https://youtu.be/gNQjC91VtOI

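The abstract does not describe how the GA search is set up, so the following is only a minimal sketch of the general idea: a population of candidate hyperparameter sets is scored by a fitness function, and the fitter sets are recombined and mutated over several generations. The search space, the operator choices (tournament selection, uniform crossover, elitism) and all function names are illustrative assumptions, not the authors' configuration. To keep the sketch runnable with the standard library alone, `toy_fitness` is a stand-in for what would, in the paper's setting, be the classification accuracy of a Random Forest on flow features.

```python
import random

# Hypothetical search space: the names mirror common Random Forest
# hyperparameters, but the value ranges are illustrative assumptions.
SEARCH_SPACE = {
    "n_estimators": [50, 100, 200, 400],
    "max_depth": [4, 8, 16, 32],
    "min_samples_split": [2, 4, 8],
}


def random_individual(rng):
    """Draw one hyperparameter set uniformly from the search space."""
    return {name: rng.choice(values) for name, values in SEARCH_SPACE.items()}


def crossover(parent_a, parent_b, rng):
    """Uniform crossover: each gene is taken from either parent with equal chance."""
    return {name: rng.choice([parent_a[name], parent_b[name]]) for name in SEARCH_SPACE}


def mutate(individual, rng, rate=0.2):
    """Resample each gene from the search space with probability `rate`."""
    return {
        name: (rng.choice(SEARCH_SPACE[name]) if rng.random() < rate else value)
        for name, value in individual.items()
    }


def tournament(population, scores, rng, k=3):
    """Return the fittest of k randomly chosen individuals."""
    contenders = rng.sample(range(len(population)), k)
    return population[max(contenders, key=lambda i: scores[i])]


def genetic_search(fitness, generations=15, pop_size=12, seed=0):
    """Return (best_individual, best_score) among all evaluated candidates."""
    rng = random.Random(seed)
    population = [random_individual(rng) for _ in range(pop_size)]
    best, best_score = None, float("-inf")
    for _ in range(generations):
        scores = [fitness(ind) for ind in population]
        for ind, score in zip(population, scores):
            if score > best_score:
                best, best_score = ind, score
        # Elitism: the best individual seen so far survives unchanged.
        next_population = [best]
        while len(next_population) < pop_size:
            child = crossover(
                tournament(population, scores, rng),
                tournament(population, scores, rng),
                rng,
            )
            next_population.append(mutate(child, rng))
        population = next_population
    return best, best_score


def toy_fitness(params):
    """Stand-in for validation accuracy: fraction of genes matching an arbitrary target."""
    target = {"n_estimators": 200, "max_depth": 16, "min_samples_split": 2}
    return sum(params[name] == value for name, value in target.items()) / len(target)


if __name__ == "__main__":
    best_params, best_score = genetic_search(toy_fitness)
    print(best_params, best_score)
```

To apply this to real flow data, `toy_fitness` would be replaced by a function that trains a classifier with the given parameters and returns its held-out or cross-validated accuracy. With scikit-learn, for example, that could be the mean of `cross_val_score` over a `RandomForestClassifier(**params)`. The abstract does not say which library or validation scheme was used, so that pairing is an assumption as well.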
