Algorithmic fairness has become a central problem for the broad adoption of artificial intelligence. Although the past decade has seen an explosion of work on algorithmic bias, achieving fairness in real-world AI production systems remains challenging. Most existing methods fall short in practice because they rely on conflicting measurement techniques and/or heavy assumptions, or because they require code access to the production models, whereas real systems demand an easy-to-implement measurement framework and a systematic way to correct detected sources of bias. In this paper, we leverage recent advances in causal inference and interpretable machine learning to present MIIF, an algorithm-agnostic framework to Measure, Interpret, and Improve the Fairness of an algorithmic decision. We measure algorithmic bias using randomized experiments, which enable the simultaneous measurement of disparate treatment, disparate impact, and economic value. Furthermore, using modern interpretability techniques, we develop an explainable machine learning model that accurately interprets and distills the beliefs of a black-box algorithm. Together, these techniques form a simple and powerful toolset for studying algorithmic fairness, especially for understanding the cost of fairness in practical applications such as e-commerce and targeted advertising, where industry A/B testing is already abundant.
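To make the measurement idea concrete, the following is a minimal simulated sketch of how a single randomized experiment could yield all three quantities at once. It is not the MIIF procedure itself, which the abstract does not specify. Everything here is an assumption made for illustration: the toy `blackbox_decision` policy, the three experiment arms, the revenue formula, and the operational definitions (disparate treatment as the effect of a randomly assigned group label on approval, disparate impact as the approval-rate gap between true groups under the algorithm, and economic value as the revenue lift of the algorithm arm over control).

```python
import random
from statistics import mean


def blackbox_decision(shown_group, score):
    """Hypothetical black box: applies a stricter threshold to group "B"."""
    threshold = 0.5 if shown_group == "A" else 0.6
    return score >= threshold


def simulate(n=30000, seed=0):
    """Simulate a three-arm randomized experiment (all quantities are toy assumptions)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        group = rng.choice("AB")
        # Assumed: group B scores slightly lower on average.
        score = rng.random() * (1.0 if group == "A" else 0.9)
        arm = rng.choice(["control", "algorithm", "random_label"])
        if arm == "control":
            shown, approved = group, False  # nobody is approved in control
        elif arm == "algorithm":
            shown = group  # the black box sees the true group
            approved = blackbox_decision(shown, score)
        else:
            shown = rng.choice("AB")  # group label randomized, independent of score
            approved = blackbox_decision(shown, score)
        revenue = 10.0 * score - 4.0 if approved else 0.0  # toy payoff
        rows.append({"group": group, "shown": shown, "arm": arm,
                     "approved": int(approved), "revenue": revenue})
    return rows


def approval_rate(rows, arm, key, value):
    """Mean approval among rows in `arm` whose `key` field equals `value`."""
    return mean(r["approved"] for r in rows if r["arm"] == arm and r[key] == value)


def measure(rows):
    """Estimate the three quantities from the experiment log."""
    # Effect of the (randomized) label itself on the decision.
    treatment = (approval_rate(rows, "random_label", "shown", "A")
                 - approval_rate(rows, "random_label", "shown", "B"))
    # Gap in outcomes between true groups under the deployed algorithm.
    impact = (approval_rate(rows, "algorithm", "group", "A")
              - approval_rate(rows, "algorithm", "group", "B"))
    # Revenue lift of the algorithm arm over the control arm.
    value = (mean(r["revenue"] for r in rows if r["arm"] == "algorithm")
             - mean(r["revenue"] for r in rows if r["arm"] == "control"))
    return {"disparate_treatment": treatment,
            "disparate_impact": impact,
            "economic_value": value}


if __name__ == "__main__":
    for name, estimate in measure(simulate()).items():
        print(f"{name}: {estimate:.3f}")
```

Because the shown label in the `random_label` arm is independent of the score, the approval gap in that arm isolates the effect of the label alone, whereas the gap in the `algorithm` arm also reflects differences in the score distributions of the two groups. Only the inputs and outputs of the black box are used, so no code access to the model is needed.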