Toward Operationalizing Pipeline-aware ML Fairness: A Research Agenda for Developing Practical Guidelines and Tools

While algorithmic fairness is a thriving area of research, in practice, mitigating bias is often reduced to enforcing an arbitrarily chosen fairness metric, whether by imposing fairness constraints during the optimization step, by post-processing model outputs, or by manipulating the training data. Recent work has called on the ML community to take a more holistic approach to fairness issues by systematically investigating the many design choices made throughout the ML pipeline and identifying interventions that target an issue's root cause rather than its symptoms. While we share the conviction that this pipeline-based approach is the most appropriate for combating algorithmic unfairness on the ground, we believe there are currently very few methods of operationalizing this approach in practice. Drawing on our experience as educators and practitioners, we first demonstrate that without clear guidelines and toolkits, even individuals with specialized ML knowledge find it challenging to hypothesize how various design choices influence model behavior. We then consult the fair-ML literature to understand the progress to date toward operationalizing the pipeline-aware approach: we systematically collect and organize the prior work that attempts to detect, measure, and mitigate various sources of unfairness throughout the ML pipeline. We use this extensive categorization of previous contributions to sketch a research agenda for the community. We hope this work serves as a stepping stone toward a more comprehensive set of resources for ML researchers, practitioners, and students interested in exploring, designing, and testing pipeline-oriented approaches to algorithmic fairness.
