Rapid advances in network technologies and industrial intelligence have increased the connectivity and intelligence of vehicles. In the Internet of Vehicles (IoV), the Controller Area Network (CAN) is crucial for communication between electronic control units (ECUs) but lacks built-in security measures, which leaves it highly vulnerable to cyberattacks. Meanwhile, the effectiveness of Intrusion Detection Systems (IDS) is limited by the scarcity of attack data for robust model training. To overcome this limitation, we introduce a novel method that uses a Restricted Boltzmann Machine (RBM) to generate synthetic CAN attack data, producing training datasets with a more balanced sample distribution. Specifically, we design a CAN Data Processing Module that transforms raw CAN data into an RBM-trainable format, and a Negative Sample Generation Module that generates data following the distribution of CAN data frames associated with network intrusions. Experimental results show that the generated data improves IDS performance, with CANet accuracy rising from 0.6477 to 0.9725 and EfficientNet accuracy rising from 0.1067 to 0.1555. Code is available at https://github.com/wangkai-tech23/CANDataSynthetic.
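To make the two modules described in the abstract more concrete, the following is a minimal sketch of the general idea, not the authors' implementation. It assumes a simple binary encoding (an 11-bit CAN identifier followed by 64 payload bits), a small Bernoulli RBM trained with one-step contrastive divergence (CD-1), and Gibbs sampling to draw synthetic frames. The function names, hidden-layer size, learning rate, and the example frames are all hypothetical; the linked repository may encode and train differently.

```python
import math
import random

ID_BITS, DATA_BYTES = 11, 8            # assumed layout: 11-bit ID, 8 data bytes
N_VISIBLE = ID_BITS + 8 * DATA_BYTES   # 75 binary visible units


def encode_frame(can_id, data):
    """Flatten one CAN frame into a list of 0/1 values (ID bits, then payload bits)."""
    payload = bytes(data)[:DATA_BYTES].ljust(DATA_BYTES, b"\x00")  # zero-pad short payloads
    bits = [(can_id >> i) & 1 for i in reversed(range(ID_BITS))]
    for byte in payload:
        bits.extend((byte >> i) & 1 for i in reversed(range(8)))
    return bits


def decode_frame(bits):
    """Inverse of encode_frame: returns (can_id, 8-byte payload)."""
    can_id = int("".join(map(str, bits[:ID_BITS])), 2)
    payload = bytes(
        int("".join(map(str, bits[k:k + 8])), 2)
        for k in range(ID_BITS, N_VISIBLE, 8)
    )
    return can_id, payload


def sigmoid(x):
    x = max(-30.0, min(30.0, x))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-x))


class RBM:
    """Tiny Bernoulli RBM; W[i][j] links visible unit i to hidden unit j."""

    def __init__(self, n_visible, n_hidden, rng):
        self.rng = rng
        self.W = [[rng.gauss(0.0, 0.01) for _ in range(n_hidden)]
                  for _ in range(n_visible)]
        self.a = [0.0] * n_visible  # visible biases
        self.b = [0.0] * n_hidden   # hidden biases

    def hidden_probs(self, v):
        return [sigmoid(self.b[j] + sum(v[i] * self.W[i][j] for i in range(len(v))))
                for j in range(len(self.b))]

    def visible_probs(self, h):
        return [sigmoid(self.a[i] + sum(h[j] * self.W[i][j] for j in range(len(h))))
                for i in range(len(self.a))]

    def sample(self, probs):
        return [1 if self.rng.random() < p else 0 for p in probs]

    def cd1_update(self, v0, lr):
        """One contrastive-divergence (CD-1) step on a single training vector."""
        ph0 = self.hidden_probs(v0)
        v1 = self.sample(self.visible_probs(self.sample(ph0)))
        ph1 = self.hidden_probs(v1)
        for i in range(len(v0)):
            for j in range(len(ph0)):
                self.W[i][j] += lr * (v0[i] * ph0[j] - v1[i] * ph1[j])
            self.a[i] += lr * (v0[i] - v1[i])
        for j in range(len(ph0)):
            self.b[j] += lr * (ph0[j] - ph1[j])

    def generate(self, seed_bits, gibbs_steps):
        """Run a Gibbs chain from a seed vector and return the final visible sample."""
        v = list(seed_bits)
        for _ in range(gibbs_steps):
            v = self.sample(self.visible_probs(self.sample(self.hidden_probs(v))))
        return v


rng = random.Random(0)
# Made-up "attack" frames for illustration only (not taken from the paper's data).
attack_frames = [(0x000, [0] * 8), (0x000, [0] * 8),
                 (0x2C0, [0xFF, 0, 0, 0, 0, 0, 0, 0])]
train = [encode_frame(cid, d) for cid, d in attack_frames]

rbm = RBM(N_VISIBLE, n_hidden=16, rng=rng)
for _ in range(50):              # a few passes over the tiny training set
    for v in train:
        rbm.cd1_update(v, lr=0.1)

synthetic = [decode_frame(rbm.generate(rng.choice(train), gibbs_steps=5))
             for _ in range(4)]
for can_id, payload in synthetic:
    print(f"{can_id:03X} {payload.hex()}")
```

In this sketch, `encode_frame` plays the role of the data processing step and `RBM.generate` the role of negative sample generation. The decoded synthetic frames could then be mixed into an IDS training set to rebalance the attack class, which is the use the abstract describes.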