We investigate the control and optimization of vertical federated learning (VFL), a class of distributed machine learning (ML) methods in which edge/fog devices hold separate data features, in dynamic edge/fog networks. Owing to heterogeneous data features and hardware across edge/fog networks, devices' contributions to VFL vary substantially; moreover, network dynamics can cause select data features to permanently exit or enter. In this setting, our proposed methodology, server-controlled VFL in dynamic networks (SC-DN), first establishes the existence of a global first-order stationary point for every global round, and then leverages this result to jointly optimize ML model training and resource consumption over four key control variables: (i) server placement, (ii) device-to-server transmit power, (iii) local device processor frequency, and (iv) local training iterations per global round. The resulting optimization formulation contains coupled variables as well as numerous forms of logarithmic constraints; we show that it is a mixed-integer signomial program, an NP-hard problem, and develop a general solver for it. Finally, via experiments on both image and multi-modal datasets, we show that our methodology achieves better classification/regression performance and greater resource consumption savings than even greedy methodologies.