Disaggregated evaluation is a central task in AI fairness assessment: its goal is to measure an AI system's performance across subgroups defined by combinations of demographic or other sensitive attributes. The standard approach is to stratify the evaluation data by subgroup and compute performance metrics separately for each group. However, even for moderately sized evaluation datasets, sample sizes quickly become small once intersectional subgroups are considered, which greatly limits the extent to which many disaggregated evaluations consider intersectional groups. In this work, we introduce a structured regression approach to disaggregated evaluation and demonstrate that it can yield reliable system performance estimates even for very small subgroups. We also provide corresponding inference strategies for constructing confidence intervals and explore how goodness-of-fit testing can yield insight into the structure of fairness-related harms experienced by intersectional groups. We evaluate our approach on two publicly available datasets and on several variants of semi-synthetic data. The results show that our method is considerably more accurate than the standard approach, especially for small subgroups, and that goodness-of-fit testing helps identify the key factors that drive differences in performance.
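To make the contrast between stratification and a regression-based estimate concrete, the following is a minimal sketch and not the estimator proposed in this work. It assumes a per-example 0/1 correctness metric and uses a plain main-effects (additive) least-squares model as a stand-in for "structured regression". The function names, the toy data, and the choice to clip predictions to [0, 1] are all illustrative assumptions.

```python
from itertools import product

import numpy as np


def stratified_estimates(groups, correct):
    """Standard approach: per-subgroup mean of the metric, with its sample size."""
    out = {}
    for g in sorted(set(groups)):
        vals = [c for gg, c in zip(groups, correct) if gg == g]
        out[g] = (sum(vals) / len(vals), len(vals))
    return out


def additive_regression_estimates(groups, correct):
    """Illustrative stand-in: least-squares fit with one main effect per attribute.

    Each subgroup estimate borrows strength from every example that shares
    at least one attribute value with it (assumption: no interaction terms).
    """
    n_attrs = len(groups[0])
    levels = [sorted({g[j] for g in groups}) for j in range(n_attrs)]

    def encode(g):
        # Intercept plus one-hot indicators; the first level of each
        # attribute is the reference category and gets no column.
        row = [1.0]
        for j in range(n_attrs):
            row.extend(1.0 if g[j] == lv else 0.0 for lv in levels[j][1:])
        return row

    X = np.array([encode(g) for g in groups])
    y = np.array(correct, dtype=float)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    # Clip because a linear model can predict outside the valid accuracy range.
    return {
        g: float(np.clip(np.array(encode(g)) @ beta, 0.0, 1.0))
        for g in product(*levels)
    }


# Toy data (made up): two attributes, one intersectional subgroup with n = 1.
cells = {
    ("F", "old"): [1, 1, 1, 0],
    ("F", "young"): [1, 1, 0, 0],
    ("M", "old"): [1, 1, 1, 1],
    ("M", "young"): [0],
}
groups = [g for g, ys in cells.items() for _ in ys]
correct = [y for ys in cells.values() for y in ys]

strat = stratified_estimates(groups, correct)
smooth = additive_regression_estimates(groups, correct)
for g in sorted(strat):
    print(g, "stratified:", strat[g], "additive:", round(smooth[g], 3))
```

In this toy example the stratified estimate for the single-example subgroup is determined entirely by that one observation, whereas the additive fit pulls it toward what the marginal attribute effects suggest. Whether such pooling is appropriate depends on how well the assumed structure matches the data, which is the kind of question goodness-of-fit testing is meant to address.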