The Rashomon set captures the collection of models that achieve near-identical empirical performance yet may differ substantially in their decision boundaries. Understanding the differences among these models, i.e., their multiplicity, is recognized as a crucial step toward model transparency, fairness, and robustness, as it reveals decision-boundary instabilities that standard metrics obscure. However, existing definitions of the Rashomon set and of multiplicity metrics assume centralized learning and do not extend naturally to decentralized, multi-party settings such as Federated Learning (FL). In FL, multiple clients collaboratively train models under a central server's coordination without sharing raw data, which preserves privacy but introduces challenges from heterogeneous client data distributions and communication constraints. In this setting, the choice of a single best model may homogenize predictive behavior across diverse clients, amplify biases, or undermine fairness guarantees. In this work, we provide the first formalization of Rashomon sets in FL. First, we adapt the Rashomon set definition to FL, distinguishing among three perspectives: (I) a global Rashomon set defined over aggregated statistics across all clients, (II) a t-agreement Rashomon set representing the intersection of local Rashomon sets across a fraction t of clients, and (III) individual Rashomon sets specific to each client's local distribution. Second, we show how standard multiplicity metrics can be estimated under FL's privacy constraints. Finally, we introduce a multiplicity-aware FL pipeline and conduct an empirical study on standard FL benchmark datasets. Our results demonstrate that all three proposed federated Rashomon set definitions offer valuable insights, enabling clients to deploy models that better align with their local data, fairness considerations, and practical requirements.
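The three perspectives can be illustrated with a minimal sketch over a finite pool of candidate models. The abstract does not state the formal definitions, so everything below is an assumption made for illustration: a Rashomon set is taken to be the candidates whose empirical loss is within a tolerance `epsilon` of the best loss, the global loss is taken to be the sample-weighted average of client losses, and the t-agreement set is read as the models that belong to the local Rashomon sets of at least a fraction `t` of clients. The names `rashomon_set`, `federated_rashomon_sets`, `client_losses`, and `client_sizes` are hypothetical.

```python
from math import ceil


def rashomon_set(losses, epsilon):
    """Indices of candidates whose loss is within epsilon of the best loss."""
    best = min(losses)
    return {m for m, loss in enumerate(losses) if loss <= best + epsilon}


def federated_rashomon_sets(client_losses, client_sizes, epsilon, t):
    """Return (global_set, t_agreement_set, individual_sets).

    client_losses[k][m] is the empirical loss of candidate m on client k.
    client_sizes[k] is the number of samples held by client k.
    All three definitions are illustrative assumptions, not the paper's.
    """
    n_models = len(client_losses[0])
    total = sum(client_sizes)

    # (I) Global: Rashomon set of the sample-weighted average loss.
    global_losses = [
        sum(n * losses[m] for n, losses in zip(client_sizes, client_losses)) / total
        for m in range(n_models)
    ]
    global_set = rashomon_set(global_losses, epsilon)

    # (III) Individual: one Rashomon set per client, from its local losses.
    individual_sets = [rashomon_set(losses, epsilon) for losses in client_losses]

    # (II) t-agreement: models found in at least ceil(t * K) local sets.
    needed = ceil(t * len(client_losses))
    t_agreement_set = {
        m for m in range(n_models)
        if sum(m in local for local in individual_sets) >= needed
    }
    return global_set, t_agreement_set, individual_sets


# Made-up losses for 3 clients and 4 candidate models.
example_losses = [
    [0.10, 0.12, 0.30, 0.11],
    [0.20, 0.50, 0.21, 0.22],
    [0.40, 0.15, 0.16, 0.17],
]
example_sizes = [100, 100, 200]
global_set, agreement_set, individual_sets = federated_rashomon_sets(
    example_losses, example_sizes, epsilon=0.05, t=1.0
)
```

In this toy example, candidate 2 falls in the global set but not in the first client's individual set, which is the kind of disagreement between perspectives that the three definitions are meant to expose. A real FL deployment would also need to compute these quantities without revealing raw data, which this sketch does not attempt.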