Algorithmic assignment of refugees and asylum seekers to locations within host countries has gained attention in recent years, with implementations in the US and Switzerland. These approaches use data on past arrivals to train machine learning models that, together with assignment algorithms, match families to locations, with the goal of maximizing a policy-relevant integration outcome such as employment status after a certain duration. Existing implementations and research train models to predict the policy outcome directly, and use these predictions in the assignment procedure. However, the merits of this approach, particularly in non-stationary settings, have not been previously explored. This study proposes and compares three different modeling strategies: the standard approach described above, an approach that uses newer data and proxy outcomes, and a hybrid approach. We show that the hybrid approach is robust to both distribution shift and weak proxy relationships -- the failure points of the other two methods, respectively. We compare these approaches empirically using data on asylum seekers in the Netherlands. Surprisingly, we find that both the proxy and hybrid approaches outperform the standard approach in practice. These insights support the development of a real-world recommendation tool currently used by NGOs and government agencies.