We study the problem of agent selection in causal strategic learning under multiple decision makers and address two key challenges that come with it. Firstly, while much of prior work focuses on a fixed pool of agents that remains static regardless of their evaluations, we consider the impact of a selection procedure by which agents are not only evaluated, but also selected. When each decision maker unilaterally selects agents by maximising their own utility, we show that the optimal selection rule is a trade-off between selecting the best agents and providing incentives to maximise the agents' improvement. Furthermore, this optimal selection rule relies on incorrect predictions of agents' outcomes. Hence, we study the conditions under which a decision maker's optimal selection rule will neither lead to a deterioration of agents' outcomes nor cause an unjust reduction in agents' chances of selection. To that end, we provide an analytical form of the optimal selection rule and a mechanism to retrieve the causal parameters from observational data, under certain assumptions on agents' behaviour. Secondly, when there are multiple decision makers, the interference between selection rules introduces another source of bias in estimating the underlying causal parameters. To address this problem, we provide a cooperative protocol which all decision makers must collectively adopt to recover the true causal parameters. Lastly, we complement our theoretical results with simulation studies. Our results highlight not only the importance of causal modeling as a strategy to mitigate the effect of gaming, as suggested by previous work, but also the need for a benevolent regulator to enable it.