This paper introduces a dataset and an experimental study on Decentralized Federated Learning (DFL) for malware detection in Internet of Things (IoT) crowdsensing. The dataset comprises behavioral records of benign activity and of eight malware attacks. A total of 21,582,484 raw records were collected from six sources: system calls, file system activities, resource usage, kernel events, input/output events, and network records. These records were aggregated into 30-second windows, yielding 342,106 data records used for model training and evaluation. Experiments on the DFL platform compare traditional Machine Learning (ML), Centralized Federated Learning (CFL), and DFL across different node counts, topologies, and data distributions. Results show that DFL maintains competitive performance while preserving data locality, and that it outperforms CFL in most settings. The dataset provides a foundation for studying the security of IoT crowdsensing environments.
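The 30-second aggregation step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the `(timestamp, source)` event layout, the `aggregate_windows` function, and the per-source event counts used as features are all assumptions made for illustration. Only the 30-second window length comes from the text. For scale, the reported totals imply about 63 raw records per aggregated record on average (21,582,484 / 342,106).

```python
from collections import Counter, defaultdict

WINDOW_SECONDS = 30  # window length stated in the abstract


def aggregate_windows(events, window_seconds=WINDOW_SECONDS):
    """Group events into fixed-length windows and count events per source.

    `events` is an iterable of (timestamp_in_seconds, source_name) pairs.
    This record layout is hypothetical, not the dataset's actual schema.
    """
    windows = defaultdict(Counter)
    for timestamp, source in events:
        # Events in [k * window, (k + 1) * window) share window index k.
        window_index = int(timestamp // window_seconds)
        windows[window_index][source] += 1
    # Emit one record per non-empty window, ordered by time.
    return [
        {"window_start": index * window_seconds, **counts}
        for index, counts in sorted(windows.items())
    ]


# Hypothetical sample events spanning three windows.
sample_events = [
    (0.5, "syscall"),
    (12.0, "network"),
    (29.9, "syscall"),
    (30.0, "file_system"),
    (61.2, "syscall"),
]
records = aggregate_windows(sample_events)
for record in records:
    print(record)
```

Counting events is only one possible per-window feature; a real pipeline would more likely compute source-specific statistics (for example, resource usage averages) for each window.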