In applications where multiple stakeholders provide recommendations, a fair consensus ranking must not only represent the rankers' preferences well but also mitigate disadvantages among socio-demographic groups in the final result. However, there is little empirical guidance on the value or challenges of visualizing fairness metrics and integrating fairness algorithms into human-in-the-loop systems that aid decision-makers. In this work, we design a study to analyze the effectiveness of integrating such fairness-metric-based visualizations and algorithms. We explore this through a task-based crowdsourced experiment comparing ConsensusFuse, an interactive visualization system for constructing consensus rankings, with FairFuse, a similar system that adds visual encodings of fairness metrics and fair-rank generation algorithms. We analyze the measure of fairness, the agreement of rankers' decisions, and user interactions in constructing the fair consensus ranking across these two systems. In our study with 200 participants, results suggest that providing these fairness-oriented support features nudges users to align their decisions with the fairness metrics while minimizing the tedium of manually amending the consensus ranking. We discuss the implications of these results for the design of next-generation fairness-oriented systems, along with emerging directions for future research.