SecureBoost is a tree-boosting algorithm that leverages homomorphic encryption to protect data privacy in the vertical federated learning setting. It is widely used in fields such as finance and healthcare because of its interpretability, effectiveness, and privacy-preserving capability. However, SecureBoost suffers from high computational complexity and a risk of label leakage. To harness its full potential, the hyperparameters of SecureBoost should be chosen carefully to balance utility, efficiency, and privacy. Existing methods set hyperparameters either empirically or heuristically, which is far from optimal. To fill this gap, we propose a Constrained Multi-Objective SecureBoost (CMOSB) algorithm that finds Pareto-optimal solutions, each of which is a set of hyperparameters achieving an optimal tradeoff among utility loss, training cost, and privacy leakage. We design measurements for all three objectives. In particular, privacy leakage is measured using our proposed instance clustering attack. Experimental results demonstrate that CMOSB yields not only hyperparameters superior to the baseline but also optimal sets of hyperparameters that can support the flexible requirements of FL participants.
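To make the notion of a constrained Pareto-optimal set of hyperparameters concrete, the following is a minimal sketch and not the CMOSB algorithm itself. It assumes every candidate hyperparameter set has already been scored on the three objectives (utility loss, training cost, privacy leakage), all of which are minimized, and that each objective has an upper bound acting as a constraint. The hyperparameter names (`n_trees`, `max_depth`), the objective values, and the bounds are hypothetical placeholders chosen for illustration; they are not results from the paper.

```python
from typing import Dict, List, Tuple

# (utility_loss, training_cost, privacy_leakage); all three are minimized.
Objectives = Tuple[float, float, float]
Candidate = Tuple[Dict[str, int], Objectives]


def dominates(a: Objectives, b: Objectives) -> bool:
    """True if `a` is no worse than `b` everywhere and strictly better somewhere."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def feasible(obj: Objectives, bounds: Objectives) -> bool:
    """True if every objective stays within its upper bound (the constraints)."""
    return all(x <= ub for x, ub in zip(obj, bounds))


def pareto_front(candidates: List[Candidate], bounds: Objectives) -> List[Candidate]:
    """Keep feasible candidates that no other feasible candidate dominates."""
    kept = [(hp, obj) for hp, obj in candidates if feasible(obj, bounds)]
    return [
        (hp, obj)
        for hp, obj in kept
        if not any(dominates(other, obj) for _, other in kept)
    ]


# Hypothetical scores, for illustration only.
candidates: List[Candidate] = [
    ({"n_trees": 5, "max_depth": 3}, (0.30, 1.0, 0.20)),
    ({"n_trees": 10, "max_depth": 3}, (0.20, 2.0, 0.30)),
    ({"n_trees": 10, "max_depth": 5}, (0.25, 2.5, 0.35)),  # dominated by the entry above
    ({"n_trees": 50, "max_depth": 5}, (0.10, 9.0, 0.60)),  # violates the cost and leakage bounds
]
bounds: Objectives = (0.5, 5.0, 0.5)  # assumed upper bounds per objective

front = pareto_front(candidates, bounds)
for hp, obj in front:
    print(hp, obj)
```

Each surviving entry represents a different tradeoff: neither is better than the other on all three objectives, so a participant can pick whichever one matches its own priorities. A full multi-objective optimizer would also generate and refine candidates rather than only filter a fixed list.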