Continuous authentication and comprehensive access management are essential throughout a user's interaction with a device. Split Learning (SL) and Federated Learning (FL) have recently emerged as promising techniques for training Machine Learning (ML) models in a decentralized manner. With the growing use of smartphones and Internet of Things (IoT) devices, these distributed techniques let resource-constrained users complete neural network training with server assistance and combine knowledge collaboratively across nodes. In this study, we propose combining SL and FL to address the continuous authentication challenge while protecting user privacy and limiting device resource usage. However, training is slowed by SL's sequential training and by resource differences among IoT devices with different specifications. We therefore use a cluster-based approach that groups devices with similar capabilities, which mitigates the impact of slow devices, and filters out devices incapable of training the model. In addition, we improve the efficiency and robustness of training by using SL and FL together so that clients train simultaneously, and we analyze the overhead of the process. Following clustering, we select the best set of clients to participate in training through a Genetic Algorithm (GA) that optimizes a carefully designed set of objectives. We compare the performance of the proposed framework against baseline methods and demonstrate its advantages on the real-world UMDAA-02-FD face detection dataset. The results show that CRSFL, our proposed approach, maintains high accuracy and reduces overhead in continuous authentication scenarios while preserving user privacy.
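The pipeline described above (filter incapable devices, cluster the rest by capability, then pick clients with a GA) can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration and is not taken from the paper: the `Device` fields, the memory threshold, the sort-and-split grouping that stands in for a real clustering algorithm, and the toy `fitness` objective.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Device:
    # Hypothetical capability features; the abstract does not list them.
    name: str
    cpu_ghz: float
    memory_gb: float
    data_size: int  # number of local training samples


def filter_capable(devices, min_memory_gb=1.0):
    """Drop devices assumed unable to hold the client-side model segment."""
    return [d for d in devices if d.memory_gb >= min_memory_gb]


def cluster_by_capability(devices, n_clusters=2):
    """Group devices into tiers by sorting on a simple capability score.

    This is a stand-in for a real clustering algorithm such as k-means.
    """
    if not devices:
        return []
    ranked = sorted(devices, key=lambda d: d.cpu_ghz * d.memory_gb)
    size = -(-len(ranked) // n_clusters)  # ceiling division
    return [ranked[i:i + size] for i in range(0, len(ranked), size)]


def fitness(selection):
    """Toy objective: reward data volume, penalise the slowest device."""
    slowest = min(d.cpu_ghz for d in selection)
    return sum(d.data_size for d in selection) * slowest


def select_clients(cluster, k, generations=30, pop_size=20, seed=0):
    """Pick k clients from a non-empty cluster with a small GA (k >= 1)."""
    rng = random.Random(seed)
    k = min(k, len(cluster))
    population = [rng.sample(cluster, k) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]  # elitism: keep the best half
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            pool = list(dict.fromkeys(a + b))  # crossover: deduplicated union
            child = rng.sample(pool, k)
            if rng.random() < 0.2:  # mutation: swap in an unused device
                unused = [d for d in cluster if d not in child]
                if unused:
                    child[rng.randrange(k)] = rng.choice(unused)
            children.append(child)
        population = parents + children
    return max(population, key=fitness)
```

A call such as `select_clients(cluster_by_capability(filter_capable(devices))[0], k=2)` would return two devices from the lowest-capability tier. A real system would replace `fitness` with the multi-objective criteria the paper designs and would cluster on measured device resources.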