Vertical federated learning (VFL) has recently emerged as an appealing distributed paradigm that enables multiple parties to collaboratively train high-quality models over vertically partitioned datasets. Gradient boosting, which builds an ensemble of weak learners (typically decision trees) to achieve strong prediction performance, has been widely adopted in VFL. Recently, there has been growing interest in using decision tables as an alternative weak learner in gradient boosting, owing to their simpler structure, good interpretability, and promising performance. Prior work has studied privacy-preserving VFL for gradient boosted decision trees, but none has addressed the emerging case of decision tables. Training and inference on decision tables differ from those on generic decision trees, and the gap is wider still for gradient boosting with decision tables in VFL. In light of this, we design, implement, and evaluate Privet, the first system framework enabling a privacy-preserving VFL service for gradient boosted decision tables. Privet builds on lightweight cryptography and allows an arbitrary number of participants holding vertically partitioned datasets to securely train gradient boosted decision tables. Extensive experiments over several real-world and synthetic datasets demonstrate that Privet achieves promising performance, with utility comparable to plaintext centralized learning.