Creating fair AI systems is a complex problem that involves assessing context-dependent bias concerns. Existing research and programming libraries express specific concerns as measures of bias that they aim to constrain or mitigate. In practice, one should explore a wide variety of (sometimes incompatible) measures before deciding which ones warrant corrective action, but the narrow scope of existing measures means that most new situations can only be examined after devising new ones. In this work, we present a mathematical framework that distils measures of bias from the literature into building blocks, thereby facilitating new combinations that cover a wide range of fairness concerns, such as classification or recommendation differences across multiple multi-value sensitive attributes (e.g., many genders and races, and their intersections). We show how this framework generalizes existing concepts and present frequently used blocks. We provide an open-source implementation of our framework as a Python library, called FairBench, that facilitates systematic and extensible exploration of potential bias concerns.