Vertical Federated Learning (VFL) is attracting increasing attention because it enables multiple parties to jointly train a privacy-preserving model over vertically partitioned data. Recent research has shown that applying zeroth-order optimization (ZOO) has many advantages in building a practical VFL algorithm. However, a key problem with ZOO-based VFL is its slow convergence rate, which limits its use with modern large models. To address this problem, we propose a cascaded hybrid optimization method for VFL. In this method, the downstream models (clients) are trained with ZOO to protect privacy and ensure that no internal information is shared. Meanwhile, the upstream model (server) is updated locally with first-order optimization (FOO), which significantly improves the convergence rate and makes it feasible to train large models without compromising privacy and security. We theoretically prove that our VFL framework converges faster than ZOO-based VFL, because its convergence is not limited by the size of the server model; this makes it effective for training large models whose major part resides on the server. Extensive experiments demonstrate that our method achieves faster convergence than the ZOO-based VFL framework while maintaining an equivalent level of privacy protection. Moreover, we show that the convergence of our framework is comparable to that of the unsafe FOO-based VFL baseline. Finally, we demonstrate that our method makes the training of a large model feasible.
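The cascaded hybrid scheme described above can be illustrated with a small toy sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the linear client encoders, the linear server head, the squared loss, the Gaussian two-point zeroth-order estimator, and all names (`train`, `server_loss`, `server_grad`, `mu`, `lr_client`, `lr_server`). In this sketch, each client sends only embeddings and receives only scalar loss values, from which it estimates its own gradient, while the server updates its head with an exact gradient.

```python
import numpy as np


def server_loss(w, h, y):
    """Mean squared error of the server's linear head on concatenated embeddings."""
    return float(np.mean((h @ w - y) ** 2))


def server_grad(w, h, y):
    """Analytic (first-order) gradient of the loss w.r.t. the server weights."""
    return 2.0 * h.T @ (h @ w - y) / len(y)


def train(X_parts, y, k=4, steps=300, mu=1e-3,
          lr_client=0.005, lr_server=0.05, seed=0):
    """Toy cascaded hybrid loop: clients use ZOO, the server uses FOO.

    X_parts is a list of per-client feature blocks (vertical partition).
    Returns the client weights, the server weights, and the loss history.
    """
    rng = np.random.default_rng(seed)
    # Hypothetical models: one linear encoder per client, one linear server head.
    clients = [rng.normal(scale=0.5, size=(X.shape[1], k)) for X in X_parts]
    w = rng.normal(scale=0.5, size=k * len(X_parts))
    history = []

    for _ in range(steps):
        # Forward pass: each client sends only its embedding to the server.
        embeds = [X @ W for X, W in zip(X_parts, clients)]
        h = np.concatenate(embeds, axis=1)
        base = server_loss(w, h, y)
        history.append(base)

        # Client step (ZOO): perturb the local weights, query the server for
        # the perturbed loss, and build a two-point gradient estimate from
        # scalar loss values only.
        new_clients = []
        for m, (X, W) in enumerate(zip(X_parts, clients)):
            u = rng.normal(size=W.shape)      # random Gaussian direction
            perturbed = list(embeds)
            perturbed[m] = X @ (W + mu * u)   # embedding under perturbation
            loss_pert = server_loss(w, np.concatenate(perturbed, axis=1), y)
            g_hat = (loss_pert - base) / mu * u
            new_clients.append(W - lr_client * g_hat)

        # Server step (FOO): exact gradient, computed locally on the server.
        w = w - lr_server * server_grad(w, h, y)
        clients = new_clients

    return clients, w, history


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.normal(size=(200, 6))
    y = X @ rng.normal(size=6)
    # Two clients, each holding three of the six feature columns.
    _, _, hist = train([X[:, :3], X[:, 3:]], y)
    print(f"initial loss {hist[0]:.3f}, final loss {hist[-1]:.3f}")
```

The variance of the zeroth-order estimate grows with the number of perturbed parameters, which is why this sketch applies it only to the small client encoders and gives the server an exact gradient.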