Restless and collapsing bandits are often used to model budget-constrained resource allocation in settings where arms have action-dependent transition probabilities, such as the allocation of health interventions among patients. However, state-of-the-art Whittle-index-based approaches to this planning problem either do not consider fairness among arms, or incentivize fairness without guaranteeing it. We thus introduce ProbFair, a probabilistically fair policy that maximizes total expected reward and satisfies the budget constraint while ensuring that each arm's probability of being pulled at every timestep is bounded below by a strictly positive value. We evaluate our algorithm on a real-world application, in which interventions support continuous positive airway pressure (CPAP) therapy adherence among patients, as well as on a broader class of synthetic transition matrices. We find that ProbFair preserves utility while providing fairness guarantees.
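To make the two constraints concrete (a fixed budget of pulls per timestep, and a strictly positive lower bound on every arm's pull probability), the following is a minimal sketch of one standard way to draw arms so that both hold at once. It is not the ProbFair algorithm: the abstract does not say how ProbFair computes its pull probabilities or samples arms, so the sketch assumes the per-arm probabilities are already given and uses systematic sampling as a swapped-in technique. The names `sample_arms`, `probs`, `budget`, and `floor` are hypothetical.

```python
import random


def sample_arms(probs, budget, floor, rng=random):
    """Pick exactly `budget` arms so that arm i is pulled with probability probs[i].

    Assumptions (illustrative, not taken from the paper):
      - probs[i] is the desired pull probability of arm i at this timestep,
      - floor > 0 is the fairness lower bound, so floor <= probs[i] <= 1,
      - sum(probs) == budget, where budget is an integer number of pulls.
    Technique: systematic sampling with a single uniform draw.
    """
    if floor <= 0:
        raise ValueError("the fairness floor must be strictly positive")
    if any(p < floor or p > 1 for p in probs):
        raise ValueError("every probability must lie in [floor, 1]")
    if abs(sum(probs) - budget) > 1e-9:
        raise ValueError("probabilities must sum to the budget")

    # Lay the arms end to end on the interval [0, budget) and mark the
    # points u, u + 1, ..., u + budget - 1. An arm is pulled when a mark
    # falls inside its segment; since each segment has length <= 1, it can
    # contain at most one mark.
    next_mark = rng.random()
    cumulative = 0.0
    chosen = []
    for arm, p in enumerate(probs):
        cumulative += p
        # The length guard protects against floating-point overshoot.
        if next_mark < cumulative and len(chosen) < budget:
            chosen.append(arm)
            next_mark += 1.0
    return chosen


if __name__ == "__main__":
    # Five arms, two pulls per timestep, every arm pulled with probability >= 0.25.
    print(sample_arms([0.5, 0.5, 0.25, 0.25, 0.5], budget=2, floor=0.25))
```

Each call returns the indices of the arms to pull at one timestep. Because a single uniform draw places all the marks, the budget is met on every draw rather than only in expectation, while each arm's marginal pull probability equals its entry in `probs` and is therefore at least `floor`.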