Machine Learning (ML) algorithms are generally designed for scenarios in which all data is stored in a single data center, where the training is performed. However, in many applications, e.g., in the healthcare domain, the training data is distributed among several entities, such as different hospitals or patients' mobile devices and sensors. Transferring this data to a central location for learning is not an option because of privacy concerns and legal issues and, in certain cases, because of the communication and computation overheads. Federated Learning (FL) is the state-of-the-art collaborative ML approach for training a model across multiple parties that hold local data samples, without sharing those samples. However, enabling privacy-preserving learning from distributed data over edge Internet of Things (IoT) systems (e.g., mobile-health and wearable technologies, which involve sensitive personal and medical data) remains a major challenge, mainly because of their stringent resource constraints: limited computing capacity, communication bandwidth, memory storage, and battery lifetime. In this paper, we propose a privacy-preserving edge FL framework for resource-constrained mobile-health and wearable technologies over the IoT infrastructure. We evaluate the proposed framework extensively and provide an implementation on the Amazon Web Services (AWS) cloud platform, using seizure detection for epilepsy monitoring with wearable technologies as the target application.