Federated learning (FL) is a promising privacy-preserving approach for continually updating a central machine learning (ML) model (e.g., an object detector on an edge server) by aggregating gradients computed from local observation data on distributed connected and automated vehicles (CAVs). An incentive mechanism is needed to motivate individual, self-interested CAVs to participate in FL and thereby improve overall model accuracy. Designing such a mechanism is challenging, however, because of the complex correlation between overall model accuracy and the unknown incentive sensitivity of CAVs, especially when the data of individual CAVs are non-independent and identically distributed (Non-IID). In this paper, we propose a new learn-to-incentivize algorithm that adaptively allocates rewards to individual CAVs under unknown sensitivity functions. First, we gradually learn the unknown sensitivity function of each CAV from accumulated observations, using compute-efficient Gaussian process regression (GPR). Second, we iteratively update the reward allocation to individual CAVs with newly sampled gradients derived from the GPR. Third, we project the updated reward allocations so that they comply with the total budget. We evaluate performance through extensive simulations, with simulation parameters obtained from realistic profiling of the CIFAR-10 dataset and an NVIDIA RTX 3080 GPU. The results show that the proposed algorithm substantially outperforms existing solutions in terms of accuracy, scalability, and adaptability.
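The three steps named in the abstract (learn sensitivity with GPR, update rewards along a GPR-derived gradient, project onto the budget) can be illustrated with a minimal NumPy sketch. This is not the paper's algorithm. It assumes a one-dimensional sensitivity function per CAV, an RBF kernel with fixed hyperparameters, the gradient of the GP posterior mean in place of the sampled gradients the abstract mentions, and a sum of per-CAV responses as a stand-in for overall model accuracy. The names `gp_mean_and_grad`, `project_to_budget`, and `incentive_step` are hypothetical.

```python
import numpy as np

def rbf(a, b, length=1.0):
    """Squared-exponential kernel between 1-D arrays a and b."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length) ** 2)

def gp_mean_and_grad(x_obs, y_obs, x, length=1.0, noise=1e-2):
    """Zero-mean GP posterior mean and its derivative at a scalar reward x."""
    K = rbf(x_obs, x_obs, length) + noise * np.eye(len(x_obs))
    alpha = np.linalg.solve(K, y_obs)
    k = rbf(np.array([x]), x_obs, length)[0]      # k(x, x_i), shape (n,)
    dk = -(x - x_obs) / length**2 * k             # d k(x, x_i) / dx
    return k @ alpha, dk @ alpha

def project_to_budget(r, budget):
    """Euclidean projection of r onto {r >= 0, sum(r) = budget}."""
    u = np.sort(r)[::-1]                          # sort in descending order
    css = np.cumsum(u) - budget
    idx = np.arange(1, len(r) + 1)
    rho = np.nonzero(u - css / idx > 0)[0][-1]    # last index kept positive
    theta = css[rho] / (rho + 1)                  # uniform shift
    return np.maximum(r - theta, 0.0)

def incentive_step(rewards, history, budget, step=0.1):
    """One iteration: estimate per-CAV gradients with the GP, ascend, project.

    history[i] is a pair (past rewards, observed responses) for CAV i;
    both the data layout and the additive objective are assumptions.
    """
    grads = np.array([
        gp_mean_and_grad(x_obs, y_obs, r)[1]
        for (x_obs, y_obs), r in zip(history, rewards)
    ])
    return project_to_budget(rewards + step * grads, budget)
```

Calling `incentive_step` repeatedly, appending each round's reward and observed response to `history`, mirrors the learn, update, project loop. The projection keeps every allocation non-negative and makes the rewards sum exactly to the budget.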