Vertical federated learning (VFL) has recently emerged as an appealing distributed paradigm that enables multiple parties to collaboratively train high-quality models over vertically partitioned datasets. Gradient boosting, which builds an ensemble of weak learners (typically decision trees) to achieve strong prediction performance, has been widely adopted in VFL. Recently, there has been growing interest in using decision tables as an alternative weak learner in gradient boosting, owing to their simpler structure, good interpretability, and promising performance. Prior work has studied privacy-preserving VFL for gradient boosted decision trees, but none has addressed the emerging case of decision tables. Training and inference on decision tables differ from the case of generic decision trees, and all the more so for gradient boosting with decision tables in VFL. In light of this, we design, implement, and evaluate Privet, the first system framework enabling privacy-preserving VFL service for gradient boosted decision tables. Privet builds on lightweight cryptography and allows an arbitrary number of participants holding vertically partitioned datasets to securely train gradient boosted decision tables. Extensive experiments over several real-world and synthetic datasets demonstrate that Privet achieves promising performance, with utility comparable to plaintext centralized learning.
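To make the structural difference between decision tables and generic decision trees concrete, the following is a minimal plaintext sketch, not Privet's protocol. It assumes the common definition of a decision table (also known as an oblivious tree), in which every node at a given depth shares one feature and one threshold, so a sample's leaf is simply the bit string of its per-level comparison results. The abstract does not spell out this definition, so it is an assumption here, as are all names (`DecisionTable`, `local_bits`, `combine_bits`, `boosted_predict`) and the toy features. No cryptography is involved; in a privacy-preserving system the per-party bits would not be exchanged in the clear.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class DecisionTable:
    """Hypothetical decision table: one (feature, threshold) pair per level."""
    features: List[str]      # splitting feature used by every node on level i
    thresholds: List[float]  # splitting threshold used by every node on level i
    leaves: List[float]      # 2**depth leaf values, indexed by the comparison bits

    def leaf_index(self, sample: Dict[str, float]) -> int:
        # Each level contributes one bit; the first level is the most significant.
        index = 0
        for feature, threshold in zip(self.features, self.thresholds):
            bit = 1 if sample[feature] > threshold else 0
            index = (index << 1) | bit
        return index

    def predict(self, sample: Dict[str, float]) -> float:
        return self.leaves[self.leaf_index(sample)]


def local_bits(table: DecisionTable, local_sample: Dict[str, float]) -> Dict[int, int]:
    """Bits a single party can compute from the features it holds (plaintext)."""
    return {
        level: (1 if local_sample[feature] > threshold else 0)
        for level, (feature, threshold) in enumerate(zip(table.features, table.thresholds))
        if feature in local_sample
    }


def combine_bits(depth: int, *per_party_bits: Dict[int, int]) -> int:
    """Merge per-party bits into a leaf index; every level must be covered."""
    merged: Dict[int, int] = {}
    for bits in per_party_bits:
        merged.update(bits)
    index = 0
    for level in range(depth):
        index = (index << 1) | merged[level]
    return index


def boosted_predict(tables: List[DecisionTable], sample: Dict[str, float],
                    learning_rate: float = 1.0) -> float:
    """Gradient-boosting style prediction: scaled sum of the tables' outputs."""
    return sum(learning_rate * table.predict(sample) for table in tables)


if __name__ == "__main__":
    table = DecisionTable(["age", "income"], [30.0, 50.0], [0.1, 0.2, 0.3, 0.4])
    party_a = {"age": 35.0}      # vertically partitioned: party A holds "age"
    party_b = {"income": 40.0}   # party B holds "income"
    index = combine_bits(2, local_bits(table, party_a), local_bits(table, party_b))
    print(index, table.leaves[index])
```

Because each level depends on a single feature, each party in a vertically partitioned setting can evaluate the levels it owns independently of the others, which is one reason the table structure is simpler to handle than a generic tree whose path depends on earlier branches.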