In prediction tasks, the predictive model itself can often influence the distribution of the target variable, a phenomenon termed performative prediction. Generally, this influence stems from strategic actions taken by stakeholders with a vested interest in predictive models. A key challenge that hinders the widespread adoption of performative prediction in machine learning is that practitioners are generally unaware of the social impacts of their predictions. To address this gap, we propose a methodology for learning the distribution map that encapsulates the long-term impacts of predictive models on the population. Specifically, we model agents' responses as a cost-adjusted utility maximization problem and propose estimates for that cost. Our approach leverages optimal transport to align the distributions before model exposure (ex ante) and after model exposure (ex post). We provide a rate of convergence for the proposed estimate and assess its quality through empirical demonstrations on a credit-scoring dataset.
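As a rough illustration of what aligning ex ante and ex post distributions with optimal transport can look like, the sketch below handles the simplest case: two equal-sized samples of a single feature. In one dimension, with a convex transport cost such as squared distance, the optimal coupling pairs the samples by rank. This is not the estimator proposed in the paper, whose details the abstract does not give. The function name `monotone_transport_1d` and the synthetic samples are assumptions made purely for illustration.

```python
import random


def monotone_transport_1d(ex_ante, ex_post):
    """Pair two equal-sized 1-D samples by rank.

    For a convex cost in one dimension, matching the k-th smallest ex ante
    value with the k-th smallest ex post value is an optimal coupling.
    Returns a list `matched` where matched[i] is the ex post value paired
    with ex_ante[i].
    """
    if len(ex_ante) != len(ex_post):
        raise ValueError("samples must have the same size")
    # Indices of ex_ante in increasing order of value.
    order = sorted(range(len(ex_ante)), key=lambda i: ex_ante[i])
    sorted_post = sorted(ex_post)
    matched = [0.0] * len(ex_ante)
    for rank, i in enumerate(order):
        matched[i] = sorted_post[rank]
    return matched


# Synthetic data (hypothetical): the ex post feature is shifted upward,
# as if agents had adjusted it in response to a deployed model.
random.seed(0)
ex_ante = [random.gauss(0.0, 1.0) for _ in range(500)]
ex_post = [random.gauss(0.5, 1.0) for _ in range(500)]

matched = monotone_transport_1d(ex_ante, ex_post)
# Per-agent displacement implied by the coupling.
displacement = [m - a for m, a in zip(matched, ex_ante)]
mean_shift = sum(displacement) / len(displacement)
```

Under these assumptions, `displacement` gives an implied per-agent change in the feature, which is the kind of quantity a response-cost estimate could be fitted to. Multivariate features would need a general optimal transport solver rather than sorting.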