Multiaccuracy and multicalibration are multigroup fairness notions for prediction that have found numerous applications in learning and computational complexity. Both can be achieved from a single learning primitive: weak agnostic learning. Here we investigate the power of multiaccuracy as a learning primitive, both with and without the additional assumption of calibration. We find that multiaccuracy on its own is rather weak, but that adding global calibration (the resulting notion is called calibrated multiaccuracy) boosts its power substantially: enough to recover implications that were previously known only under the stronger notion of multicalibration. We give evidence that multiaccuracy might not be as powerful as standard weak agnostic learning by showing that there is no way to post-process a multiaccurate predictor into a weak learner, even when the best hypothesis has correlation $1/2$ with the labels. Rather, we show that multiaccuracy yields a restricted form of weak agnostic learning, one that requires some concept in the class to have correlation greater than $1/2$ with the labels. However, by additionally requiring the predictor to be calibrated, we recover not just weak but strong agnostic learning. A similar picture emerges when we consider deriving hardcore measures from predictors satisfying multigroup fairness notions. Whereas multiaccuracy yields hardcore measures of only half the optimal density, we show that (a weighted version of) calibrated multiaccuracy achieves the optimal density. Our results give new insight into the complementary roles played by multiaccuracy and calibration in each setting, and shed light on why multiaccuracy and global calibration, although not particularly powerful on their own, together yield considerably stronger notions.
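For concreteness, the two notions at play admit the following standard formalization from the multigroup fairness literature; the auditing class $\mathcal{C}$, tolerance $\alpha$, and distribution $\mathcal{D}$ are notation assumed here for illustration, not taken from the abstract itself. A predictor $p : \mathcal{X} \to [0,1]$ is $\alpha$-multiaccurate with respect to a class $\mathcal{C}$ of functions $c : \mathcal{X} \to [-1,1]$ if

```latex
% alpha-multiaccuracy: no function in the auditing class C detects
% correlated bias in the residuals y - p(x).
\[
  \Bigl|\, \mathbb{E}_{(x,y)\sim\mathcal{D}}\bigl[\, c(x)\,\bigl(y - p(x)\bigr) \,\bigr] \Bigr| \;\le\; \alpha
  \qquad \text{for all } c \in \mathcal{C},
\]
% global calibration: conditioned on each predicted value v, the
% outcomes average (approximately) to v.
\[
  \mathbb{E}_{(x,y)\sim\mathcal{D}}\bigl[\, y - p(x) \;\big|\; p(x) = v \,\bigr] \;\approx\; 0
  \qquad \text{for all } v \in \operatorname{supp}\bigl(p(x)\bigr).
\]
```

Calibrated multiaccuracy imposes both conditions simultaneously, which is still weaker than multicalibration, where the multiaccuracy condition must hold conditioned on each level set $\{x : p(x) = v\}$.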