In this work, we study the problem of aggregation in the context of Bayesian Federated Learning (BFL). Adopting an information-geometric perspective, we interpret the BFL aggregation step as finding the barycenter of the trained posteriors under a pre-specified divergence. We study the barycenter problem for the parametric family of $\alpha$-divergences and, focusing on the standard case of independent, Gaussian-distributed parameters, we recover the closed-form solution of the reverse Kullback-Leibler barycenter and derive the analytical form of the squared Wasserstein-2 barycenter. Considering a non-IID setup, in which clients hold heterogeneous data, we compare the developed algorithms against state-of-the-art (SOTA) Bayesian aggregation methods in terms of accuracy, uncertainty quantification (UQ), model calibration (MC), and fairness. Finally, we extend our analysis to the framework of Hybrid Bayesian Deep Learning (HBDL), where we study how the number of Bayesian layers in the architecture impacts the considered performance metrics. Our experimental results show that the proposed methodology performs comparably to the SOTA while offering a geometric interpretation of the aggregation phase.
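For concreteness, a minimal sketch of the two closed forms referenced above, stated for the per-parameter (diagonal-covariance) case of $K$ one-dimensional Gaussians $p_k = \mathcal{N}(\mu_k, \sigma_k^2)$ with aggregation weights $w_k$ summing to one; the conventions chosen here, in particular reading the reverse Kullback-Leibler barycenter as $q^\star = \arg\min_q \sum_k w_k \,\mathrm{KL}(q \,\|\, p_k)$, are illustrative assumptions rather than quotations from the text:
\begin{align*}
  % Reverse-KL barycenter: precision-weighted mean and harmonic combination of variances
  \mu_{\mathrm{KL}} &= \frac{\sum_{k} w_k \mu_k / \sigma_k^2}{\sum_{k} w_k / \sigma_k^2},
  &
  \sigma_{\mathrm{KL}}^2 &= \Bigg(\sum_{k} \frac{w_k}{\sigma_k^2}\Bigg)^{-1}, \\
  % Squared Wasserstein-2 barycenter: weighted means and weighted standard deviations
  \mu_{W_2} &= \sum_{k} w_k \mu_k,
  &
  \sigma_{W_2} &= \sum_{k} w_k \sigma_k.
\end{align*}
Under these assumptions, the reverse-KL barycenter reduces to the familiar precision-weighted aggregation rule, while the Wasserstein-2 barycenter averages standard deviations rather than precisions.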