SecureBoost is a tree-boosting algorithm that uses homomorphic encryption to protect data privacy in the vertical federated learning (FL) setting. It is widely used in fields such as finance and healthcare because of its interpretability, effectiveness, and privacy-preserving capability. However, SecureBoost suffers from high computational complexity and a risk of label leakage. To harness its full potential, the hyperparameters of SecureBoost must be chosen carefully to balance utility, efficiency, and privacy. Existing methods set hyperparameters either empirically or heuristically, which is far from optimal. To fill this gap, we propose a Constrained Multi-Objective SecureBoost (CMOSB) algorithm that finds Pareto-optimal solutions, each of which is a set of hyperparameters achieving an optimal tradeoff among utility loss, training cost, and privacy leakage. We design measurements for all three objectives. In particular, privacy leakage is measured using our proposed instance clustering attack. Experimental results demonstrate that CMOSB yields not only hyperparameters superior to the baseline but also optimal sets of hyperparameters that can support the flexible requirements of FL participants.
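To make the notion of a Pareto-optimal set of hyperparameters concrete, the following is a minimal sketch of the dominance test and constraint filter that a constrained multi-objective search relies on. It is not the CMOSB search procedure itself. The `Candidate` type, the helper names, the candidate values, and the upper bounds are all hypothetical and chosen only for illustration; all three objectives are assumed to be minimized.

```python
from typing import List, NamedTuple, Tuple


class Candidate(NamedTuple):
    """A hypothetical hyperparameter setting with its three measured objectives."""
    name: str
    utility_loss: float      # lower is better
    training_cost: float     # lower is better
    privacy_leakage: float   # lower is better


def objectives(c: Candidate) -> Tuple[float, float, float]:
    return (c.utility_loss, c.training_cost, c.privacy_leakage)


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if a is no worse than b on every objective and strictly better on one."""
    fa, fb = objectives(a), objectives(b)
    no_worse = all(x <= y for x, y in zip(fa, fb))
    strictly_better = any(x < y for x, y in zip(fa, fb))
    return no_worse and strictly_better


def feasible(c: Candidate, bounds: Tuple[float, float, float]) -> bool:
    """True if every objective is within its (assumed) upper bound."""
    return all(v <= ub for v, ub in zip(objectives(c), bounds))


def constrained_pareto_front(
    cands: List[Candidate], bounds: Tuple[float, float, float]
) -> List[Candidate]:
    """Keep feasible candidates that no other feasible candidate dominates."""
    ok = [c for c in cands if feasible(c, bounds)]
    return [c for c in ok if not any(dominates(o, c) for o in ok)]


# Illustrative values only; these are not results from the paper.
candidates = [
    Candidate("A", 0.10, 50.0, 0.30),
    Candidate("B", 0.08, 80.0, 0.35),
    Candidate("C", 0.12, 60.0, 0.40),   # dominated by A on all three objectives
    Candidate("D", 0.05, 200.0, 0.20),  # violates the training-cost bound
    Candidate("E", 0.15, 30.0, 0.25),
]
bounds = (0.20, 100.0, 0.50)  # assumed upper bounds on the three objectives

front = constrained_pareto_front(candidates, bounds)
print([c.name for c in front])  # prints ['A', 'B', 'E']
```

Each surviving candidate represents a different tradeoff: none can be improved on one objective without worsening another, which is what lets participants with different requirements pick different points from the same set.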