Many machine learning applications predict individual probabilities, such as the likelihood that a person develops a particular illness. Since these probabilities are unknown, a key question is how to handle cases in which different models trained on the same dataset produce different predictions for the same individuals. This issue is exemplified by the model multiplicity (MM) phenomenon, in which a set of comparable models yields inconsistent predictions. Roth, Tolbert, and Weinstein recently introduced a reconciliation procedure, the Reconcile algorithm, to address this problem. Given two disagreeing models, the algorithm uses their disagreement to falsify and improve at least one of them. In this paper, we empirically analyze the Reconcile algorithm on five widely used fairness datasets: COMPAS, Communities and Crime, Adult, Statlog (German Credit Data), and the ACS Dataset. We examine how Reconcile fits within the model multiplicity literature and compare it to existing MM solutions, demonstrating its effectiveness. We also discuss potential theoretical and practical improvements to the Reconcile algorithm. Finally, we extend the Reconcile algorithm to the setting of causal inference, where competing estimators can likewise disagree on specific conditional average treatment effect (CATE) values. This is the first extension of the Reconcile algorithm to causal inference; we analyze its theoretical properties and test it empirically. Our results confirm the practical effectiveness of Reconcile and its applicability across various domains.
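The reconciliation step described above, in which the disagreement between two models is used to falsify and improve at least one of them, can be illustrated with a minimal sketch. This is not the authors' implementation. The function name `reconcile` and the parameters `eps` (disagreement threshold), `alpha` (tolerated disagreement mass), and `max_rounds` are assumptions made for illustration, and the sketch works on empirical sample averages and omits details of the original algorithm such as discretizing the updates.

```python
import numpy as np


def reconcile(f1, f2, y, eps=0.1, alpha=0.05, max_rounds=1000):
    """Simplified, hypothetical sketch of a Reconcile-style loop.

    f1, f2: predicted probabilities of two models on the same sample.
    y:      observed binary outcomes for that sample.
    Returns patched copies of f1 and f2; the inputs are not modified.
    """
    y = np.asarray(y, dtype=float)
    preds = [np.asarray(f1, dtype=float).copy(),
             np.asarray(f2, dtype=float).copy()]

    for _ in range(max_rounds):
        diff = preds[0] - preds[1]
        # Stop once the models disagree by more than eps on less than
        # an alpha fraction of the sample.
        if np.mean(np.abs(diff) > eps) < alpha:
            break

        # Split the disagreement region by which model predicts higher.
        regions = [diff > eps, diff < -eps]

        # Find the (model, region) pair whose average prediction is most
        # strongly contradicted by the observed outcomes.
        best = None
        for region in regions:
            mass = region.mean()
            if mass == 0:
                continue
            for i in (0, 1):
                delta = y[region].mean() - preds[i][region].mean()
                score = mass * delta ** 2
                if best is None or score > best[0]:
                    best = (score, i, region, delta)

        if best is None:  # no disagreement left to exploit
            break

        # Patch the falsified model on that region, keeping it in [0, 1].
        _, i, region, delta = best
        preds[i][region] = np.clip(preds[i][region] + delta, 0.0, 1.0)

    return preds[0], preds[1]
```

Each patch shifts one model toward the observed outcome rate on a region where the two models disagree, so that model's squared error on the sample does not increase, while the other model is left unchanged in that round.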