The Internet of Things (IoT) connects billions of intelligent devices across the globe, each capable of communicating with other connected devices with little to no human intervention. IoT enables data aggregation and analysis on a large scale to improve quality of life in many domains. In particular, data collected by IoT devices contain a tremendous amount of information that is useful for anomaly detection. The heterogeneous nature of IoT is both a challenge and an opportunity for cybersecurity. Traditional approaches to cybersecurity monitoring often require different pre-processing and handling for each data type, which can be problematic for datasets that contain heterogeneous features. However, heterogeneous types of network devices can often capture a more diverse set of signals than readings from a single type of device, which is particularly useful for anomaly detection. In this paper, we present a comprehensive study on using ensemble machine learning methods to enhance IoT cybersecurity via anomaly detection. Rather than relying on a single machine learning model, ensemble learning combines the predictive power of multiple models, which improves predictive accuracy on heterogeneous datasets. We propose a unified ensemble learning framework that utilises Bayesian hyperparameter optimisation to adapt to a network environment containing readings from multiple IoT sensors. Experimentally, we illustrate the high predictive power of the framework when compared to traditional methods.
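To make the idea of combining several models over heterogeneous sensor readings concrete, the following is a minimal sketch and not the framework proposed in the paper. It assumes one simple z-score detector per sensor type and a majority vote across them. The sensor names, the training values, and the fixed threshold of 3.0 are hypothetical, and the threshold is set by hand rather than tuned with Bayesian optimisation.

```python
from statistics import mean, stdev


def fit_zscore_detector(train, threshold=3.0):
    """Fit a per-sensor detector on normal readings.

    The returned function flags a value as anomalous when it lies more
    than `threshold` standard deviations from the training mean.
    The threshold is a hand-picked assumption, not a tuned value.
    """
    mu, sigma = mean(train), stdev(train)

    def detect(value):
        return abs(value - mu) > threshold * sigma

    return detect


def ensemble_predict(detectors, reading):
    """Majority vote over the per-sensor detectors.

    `reading` maps a sensor name to its current value. The reading is
    reported as anomalous when more than half of the detectors agree.
    """
    votes = [detectors[name](value) for name, value in reading.items()]
    return sum(votes) * 2 > len(votes)


# Hypothetical normal readings for three heterogeneous sensor types.
normal = {
    "temperature": [20.1, 19.8, 20.4, 20.0, 19.9, 20.2],
    "humidity": [45.0, 44.5, 46.1, 45.3, 44.8, 45.6],
    "packets_per_s": [100, 104, 98, 101, 97, 103],
}
detectors = {name: fit_zscore_detector(vals) for name, vals in normal.items()}

typical = {"temperature": 20.0, "humidity": 45.2, "packets_per_s": 100}
suspicious = {"temperature": 35.0, "humidity": 45.2, "packets_per_s": 900}

print(ensemble_predict(detectors, typical))
print(ensemble_predict(detectors, suspicious))
```

In this sketch each detector only sees its own sensor type, so no shared pre-processing across data types is needed, and the vote is what merges the heterogeneous signals. A hyperparameter search, Bayesian or otherwise, would replace the fixed threshold with a value selected on validation data.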