In many real-world situations, data is distributed across multiple self-interested agents. These agents can collaborate to build a machine learning model based on data from multiple agents, potentially reducing the error each experiences. However, sharing models in this way raises questions of fairness: to what extent can the error experienced by one agent be significantly lower than the error experienced by another agent in the same coalition? In this work, we consider two notions of fairness, each of which may be appropriate in different circumstances: "egalitarian fairness" (which aims to bound how dissimilar error rates can be) and "proportional fairness" (which aims to reward players for contributing more data). We similarly consider two common methods of model aggregation: one where a single model is created for all agents (uniform), and one where an individualized model is created for each agent (individualized). For egalitarian fairness, we obtain a tight multiplicative bound on how widely error rates can diverge between collaborating agents (which holds for both aggregation methods). For proportional fairness, we show that the individualized aggregation method always gives a small player an error that is upper bounded by proportionality. For uniform aggregation, we show that this upper bound is guaranteed for any individually rational coalition (where no player wishes to leave to do local learning).
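The two fairness notions named in the abstract can be made concrete with a small check. The sketch below is illustrative only and rests on assumptions not stated in the abstract: "egalitarian fairness" is measured as the largest ratio between any two players' errors, and "upper bounded by proportionality" is read as err_j / err_k <= n_k / n_j whenever player j holds no more samples than player k. The function names and the sample numbers are hypothetical.

```python
from itertools import permutations


def max_error_ratio(errors):
    """Largest ratio between any two players' errors (always >= 1).

    Assumed reading of egalitarian fairness: a smaller value means
    error rates within the coalition are more similar.
    """
    return max(errors) / min(errors)


def is_sub_proportional(errors, samples):
    """Check err_j / err_k <= n_k / n_j for every pair with n_j <= n_k.

    Assumed reading of proportional fairness. The inequality is tested
    in cross-multiplied form to avoid dividing by small numbers.
    """
    for j, k in permutations(range(len(errors)), 2):
        if samples[j] <= samples[k]:
            if errors[j] * samples[j] > errors[k] * samples[k]:
                return False
    return True


# Hypothetical coalition: three players with 10, 20, and 50 samples.
samples = [10, 20, 50]
errors = [0.30, 0.20, 0.10]

ratio = max_error_ratio(errors)             # roughly 3
fair = is_sub_proportional(errors, samples)  # True for these numbers
```

In this made-up coalition the smallest player has about three times the error of the largest, yet the gap stays within the five-to-one ratio of their sample counts, so the proportionality check passes. Changing the inputs to `errors = [0.5, 0.1]` and `samples = [10, 20]` makes it fail, since a fivefold error gap exceeds a twofold data gap.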