Optimal and Fair Encouragement Policy Evaluation and Learning

In consequential domains, it is often impossible to compel individuals to take treatment, so optimal policy rules are merely suggestions when humans do not adhere to treatment recommendations. In these same domains, there may be heterogeneity both in who takes up treatment in response to a recommendation and in treatment efficacy. While optimal treatment rules can maximize causal outcomes across the population, access parity constraints or other fairness considerations can be relevant in the case of encouragement. For example, in social services, a persistent puzzle is the gap in take-up of beneficial services among those who may benefit from them the most. When, in addition, the decision-maker has distributional preferences over both access and average outcomes, the optimal decision rule changes. We study causal identification, variance-reduced statistical estimation, and robust estimation of optimal treatment rules, including under potential violations of positivity. We consider fairness constraints, such as demographic parity in treatment take-up, and other constraints via constrained optimization. Our framework can be extended to handle algorithmic recommendations under an often-reasonable covariate-conditional exclusion restriction, using our robustness checks for lack of positivity in the recommendation. We develop a two-stage algorithm for optimizing over parametrized policy classes under general constraints, which yields variance-sensitive regret bounds. We illustrate the methods in two case studies based on data from randomized encouragement to enroll in insurance and from pretrial supervised release with electronic monitoring.
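To make the encouragement setting concrete, the following is a minimal sketch and not the estimator studied in the paper. It assumes the exclusion restriction holds (a recommendation affects the outcome only through treatment take-up) and that the take-up probabilities and conditional mean outcomes are already known; in practice they would have to be estimated. All function names, variable names, and numbers are hypothetical and chosen only for illustration.

```python
import numpy as np


def takeup_under_policy(pi, p_rec, p_norec):
    """Probability of taking treatment when a recommendation is issued
    with probability pi (assumed inputs: p_rec = P(T=1 | R=1, X),
    p_norec = P(T=1 | R=0, X))."""
    return pi * p_rec + (1.0 - pi) * p_norec


def policy_value(pi, p_rec, p_norec, mu1, mu0):
    """Average outcome under the recommendation policy, assuming the
    recommendation changes outcomes only via take-up (exclusion restriction).
    mu1 and mu0 are the assumed mean outcomes with and without treatment."""
    q = takeup_under_policy(pi, p_rec, p_norec)
    return float(np.mean(q * mu1 + (1.0 - q) * mu0))


def takeup_parity_gap(pi, p_rec, p_norec, group):
    """Absolute difference in average take-up between two groups
    (a simple demographic-parity-style measure on take-up)."""
    q = takeup_under_policy(pi, p_rec, p_norec)
    return float(abs(q[group == 1].mean() - q[group == 0].mean()))


# Hypothetical toy population of four individuals in two groups.
p_rec = np.array([0.8, 0.6, 0.4, 0.2])    # take-up if recommended
p_norec = np.array([0.2, 0.2, 0.1, 0.1])  # take-up if not recommended
mu1 = np.array([1.0, 1.0, 2.0, 2.0])      # mean outcome if treated
mu0 = np.zeros(4)                         # mean outcome if untreated
group = np.array([0, 0, 1, 1])

recommend_all = np.ones(4)
recommend_none = np.zeros(4)

value_all = policy_value(recommend_all, p_rec, p_norec, mu1, mu0)
gap_all = takeup_parity_gap(recommend_all, p_rec, p_norec, group)
value_none = policy_value(recommend_none, p_rec, p_norec, mu1, mu0)
gap_none = takeup_parity_gap(recommend_none, p_rec, p_norec, group)

print("recommend all: ", value_all, gap_all)
print("recommend none:", value_none, gap_none)
```

In this toy population, the group with the larger treatment benefit (group 1) also responds less to recommendations, so recommending to everyone raises the average outcome while widening the take-up gap between groups. This is the kind of trade-off that a parity constraint on take-up is meant to control.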
