Model-induced distribution shifts (MIDS) occur as previous model outputs pollute new model training sets over generations of models. This phenomenon is known as model collapse in the case of generative models, and as performative prediction or unfairness feedback loops in the case of supervised models. When a model induces a distribution shift, it also encodes its mistakes, biases, and unfairnesses into the ground truth of its data ecosystem. We introduce a framework that allows us to track multiple MIDS over many generations, finding that they can lead to losses in performance, fairness, and minoritized group representation, even in initially unbiased datasets. Despite these negative consequences, we identify how models might be used for positive, intentional interventions in their data ecosystems, providing redress for historical discrimination through a framework called algorithmic reparation (AR). We simulate AR interventions by curating representative training batches for stochastic gradient descent to demonstrate how AR can improve upon the unfairnesses of models and data ecosystems subject to other MIDS. Our work takes an important step towards identifying, mitigating, and taking accountability for the unfair feedback loops enabled by the idea that ML systems are inherently neutral and objective.
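To make the idea of curating representative training batches concrete, here is a minimal sketch of one possible curation rule: stratified sampling toward target group proportions. The abstract does not specify the curation procedure, so the function `curate_batch`, its parameters, and the choice of stratified sampling are all assumptions made for illustration, not the authors' method.

```python
import numpy as np


def curate_batch(groups, batch_size, target_props, rng):
    """Return example indices for a batch matching target group proportions.

    Hypothetical helper, not taken from the paper.
    groups: 1-D integer array; groups[i] is the group id of example i.
    target_props: dict mapping group id -> desired fraction of the batch.
    rng: a numpy.random.Generator.
    """
    groups = np.asarray(groups)
    ids = sorted(target_props)
    props = np.array([target_props[g] for g in ids], dtype=float)
    if not np.isclose(props.sum(), 1.0):
        raise ValueError("target proportions must sum to 1")

    # Largest-remainder rounding so per-group counts sum to batch_size.
    raw = props * batch_size
    counts = np.floor(raw).astype(int)
    remainder = batch_size - counts.sum()
    order = np.argsort(-(raw - counts), kind="stable")
    counts[order[:remainder]] += 1

    picked = []
    for g, k in zip(ids, counts):
        if k == 0:
            continue
        pool = np.flatnonzero(groups == g)
        if pool.size == 0:
            raise ValueError(f"no examples available for group {g}")
        # Sample with replacement only when the group cannot fill its quota.
        picked.append(rng.choice(pool, size=k, replace=pool.size < k))

    batch = np.concatenate(picked)
    rng.shuffle(batch)  # avoid group-ordered batches
    return batch


# Example: a pool that is 90% group 0 and 10% group 1,
# curated toward equal representation in each batch.
groups = np.array([0] * 90 + [1] * 10)
rng = np.random.default_rng(0)
batch = curate_batch(groups, batch_size=8, target_props={0: 0.5, 1: 0.5}, rng=rng)
```

With these inputs the batch contains four examples from each group, even though group 1 makes up only a tenth of the pool. In a training loop, the returned indices would select the examples for one stochastic gradient descent step; how target proportions are chosen is a separate decision that this sketch leaves open.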