Federated Learning (FL) trains a model over a dataset distributed among clients, under the constraint that each client's data stays local and may be heterogeneous. Small and noisy datasets are common in FL, which highlights the need for well-calibrated models that represent the uncertainty of their predictions. The FL techniques closest to achieving this goal are Bayesian FL methods, which collect parameter samples from local posteriors and aggregate them to approximate the global posterior. To improve scalability for larger models, one common Bayesian approach approximates the global predictive posterior by multiplying the local predictive posteriors. In this work, we demonstrate that this method gives systematically overconfident predictions, and we remedy this by proposing $\beta$-Predictive Bayes, a Bayesian FL algorithm that interpolates between a mixture and a product of the predictive posteriors using a tunable parameter $\beta$. This parameter is tuned to improve the global ensemble's calibration before the ensemble is distilled into a single model. We evaluate our method on a variety of regression and classification datasets and show that it is better calibrated than the baseline methods, even as data heterogeneity increases. Code is available at https://github.com/hasanmohsin/betaPredBayesFL
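To make the mixture-versus-product distinction concrete, the following is a minimal sketch for the classification case. The abstract does not state the exact interpolation formula, so the log-linear form used here (proportional to mixture^(1-β) · product^β) is an assumption, as are the function names `mixture`, `product`, `beta_interpolate`, and `tune_beta`, and the choice of held-out negative log-likelihood as the tuning criterion. It is not taken from the linked repository.

```python
import math

EPS = 1e-12  # clamp to avoid log(0); an illustrative choice


def mixture(local_probs):
    """Arithmetic mean of the clients' predictive distributions."""
    n = len(local_probs)
    k = len(local_probs[0])
    return [sum(p[c] for p in local_probs) / n for c in range(k)]


def product(local_probs):
    """Normalized product of the clients' predictive distributions."""
    k = len(local_probs[0])
    log_p = [sum(math.log(max(p[c], EPS)) for p in local_probs) for c in range(k)]
    top = max(log_p)  # subtract the max for numerical stability
    unnorm = [math.exp(v - top) for v in log_p]
    z = sum(unnorm)
    return [u / z for u in unnorm]


def beta_interpolate(local_probs, beta):
    """ASSUMED form: p_beta proportional to mixture^(1-beta) * product^beta.

    beta = 0 recovers the mixture, beta = 1 recovers the product.
    """
    mix = mixture(local_probs)
    prod = product(local_probs)
    unnorm = [max(m, EPS) ** (1.0 - beta) * max(q, EPS) ** beta
              for m, q in zip(mix, prod)]
    z = sum(unnorm)
    return [u / z for u in unnorm]


def tune_beta(val_preds, val_labels, grid):
    """Pick the beta in `grid` with the lowest mean held-out NLL (assumed criterion).

    val_preds[i] is the list of client distributions for validation example i.
    """
    def nll(beta):
        total = 0.0
        for preds, y in zip(val_preds, val_labels):
            total -= math.log(beta_interpolate(preds, beta)[y])
        return total / len(val_labels)

    return min(grid, key=nll)


if __name__ == "__main__":
    # Three clients that each predict class 0 with probability 0.7.
    clients = [[0.7, 0.3], [0.7, 0.3], [0.7, 0.3]]
    for beta in (0.0, 0.5, 1.0):
        print(beta, [round(v, 3) for v in beta_interpolate(clients, beta)])
```

In this toy case every client assigns probability 0.7 to class 0, yet the pure product (β = 1) assigns about 0.927, which is more confident than any individual client. The mixture (β = 0) stays at 0.7, and intermediate values of β fall in between, which is the degree of freedom that calibration tuning can exploit.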