Existing work on fairness auditing assumes that each audit is performed independently. In this paper, we consider multiple agents working together, each auditing the same platform for a different task. Agents have two levers: their collaboration strategy, with or without prior coordination, and their strategy for sampling appropriate data points. We theoretically analyze the interplay of these levers. Our main findings are that (i) collaboration is generally beneficial for accurate audits, (ii) basic sampling methods often prove to be effective, and (iii) counter-intuitively, extensive coordination on queries often degrades audit accuracy as the number of agents increases. Experiments on three large datasets confirm our theoretical results. Our findings motivate collaboration during fairness audits of platforms that use ML models for decision-making.
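To make the setting concrete, here is a minimal simulation sketch of two agents who sample queries uniformly at random and then share their responses without coordinating beforehand. It is an illustration under stated assumptions, not the paper's experimental setup: the `platform` model, its acceptance probabilities, the use of demographic parity as the audited metric, and all function names are hypothetical.

```python
import random

# Hypothetical ground truth implied by the toy platform below:
# demographic parity gap for attribute 0 and attribute 1.
TRUE_DP = (0.2, 0.1)


def platform(a1: int, a2: int, rng: random.Random) -> int:
    """Toy black-box decision model (an assumption, not a real platform)."""
    p_accept = 0.3 + 0.2 * a1 + 0.1 * a2
    return 1 if rng.random() < p_accept else 0


def uniform_queries(budget: int, rng: random.Random) -> list:
    """Query the platform on `budget` profiles drawn uniformly at random."""
    samples = []
    for _ in range(budget):
        a1, a2 = rng.randint(0, 1), rng.randint(0, 1)
        samples.append((a1, a2, platform(a1, a2, rng)))
    return samples


def demographic_parity(samples: list, attr: int) -> float:
    """Estimate P(y=1 | attr=1) - P(y=1 | attr=0) from (a1, a2, y) triples."""
    pos = [s[2] for s in samples if s[attr] == 1]
    neg = [s[2] for s in samples if s[attr] == 0]
    if not pos or not neg:
        return 0.0  # fallback when one group received no queries
    return sum(pos) / len(pos) - sum(neg) / len(neg)


def mean_errors(trials: int, budget: int, seed: int) -> tuple:
    """Average absolute error of solo audits vs. audits on pooled samples."""
    rng = random.Random(seed)
    solo_total, pooled_total = 0.0, 0.0
    for _ in range(trials):
        s1 = uniform_queries(budget, rng)  # agent 1 audits attribute 0
        s2 = uniform_queries(budget, rng)  # agent 2 audits attribute 1
        pooled = s1 + s2                   # responses shared after querying
        solo_total += abs(demographic_parity(s1, 0) - TRUE_DP[0])
        solo_total += abs(demographic_parity(s2, 1) - TRUE_DP[1])
        pooled_total += abs(demographic_parity(pooled, 0) - TRUE_DP[0])
        pooled_total += abs(demographic_parity(pooled, 1) - TRUE_DP[1])
    n = 2 * trials  # two agents per trial
    return solo_total / n, pooled_total / n


if __name__ == "__main__":
    solo, pooled = mean_errors(trials=300, budget=100, seed=0)
    print(f"mean abs. error, solo audits:   {solo:.4f}")
    print(f"mean abs. error, pooled audits: {pooled:.4f}")
```

In this toy setting each agent doubles its effective sample size by reusing the other agent's uniformly sampled queries, which is the intuition behind finding (i). The sketch does not model coordinated query strategies, so it says nothing about finding (iii).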