Modern challenges of robustness, fairness, and decision-making in machine learning have led to the formulation of multi-distribution learning (MDL) frameworks, in which a predictor is optimized across multiple distributions. We study the calibration properties of MDL to better understand how the predictor performs uniformly across these distributions. Using classical results on the decomposition of proper scoring losses, we first derive the Bayes optimal rule for MDL, demonstrating that it maximizes the generalized entropy of the associated loss function. Our analysis reveals that while this approach ensures minimal worst-case loss, it can lead to non-uniform calibration errors across the multiple distributions, and that there is an inherent calibration-refinement trade-off, even at Bayes optimality. Our results highlight a critical limitation: despite the promise of MDL, one must use caution when designing predictors tailored to multiple distributions so as to minimize disparity.
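The calibration-refinement decomposition referenced above can be made concrete for a single distribution with the Brier score, where the expected loss splits into a calibration term, E[(p - E[y|p])^2], and a refinement term, E[E[y|p](1 - E[y|p])]. The following sketch is illustrative only and not part of the paper's method: it estimates both terms empirically by binning predictions, and the function name and binning scheme are assumptions chosen for the example.

```python
import numpy as np

def brier_decomposition(probs, labels, n_bins=10):
    """Illustrative empirical calibration-refinement decomposition
    of the Brier score, estimated by binning predictions:
      calibration ~ sum_b P(bin b) * (mean conf in b - mean label in b)^2
      refinement  ~ sum_b P(bin b) * mean_label_b * (1 - mean_label_b)
    (Binning is a hypothetical estimator chosen for this sketch.)
    """
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels, dtype=float)
    # Assign each prediction to one of n_bins equal-width bins on [0, 1].
    bins = np.clip((probs * n_bins).astype(int), 0, n_bins - 1)
    calibration, refinement = 0.0, 0.0
    for b in range(n_bins):
        mask = bins == b
        if not mask.any():
            continue
        w = mask.mean()              # empirical probability of the bin
        p_bar = probs[mask].mean()   # average confidence in the bin
        y_bar = labels[mask].mean()  # empirical accuracy, estimates E[y | p]
        calibration += w * (p_bar - y_bar) ** 2
        refinement += w * y_bar * (1.0 - y_bar)
    return calibration, refinement

# A perfectly calibrated but uninformative predictor: zero calibration
# error, but maximal refinement loss.
c, r = brier_decomposition([0.5, 0.5, 0.5, 0.5], [0, 1, 0, 1])
```

In an MDL setting one would compute this pair per distribution; the paper's point is that the worst-case-optimal predictor need not equalize the calibration term across distributions.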