The success of many healthcare programs depends on participants' adherence. We consider the problem of scheduling interventions in low-resource settings (e.g., placing timely support calls from health workers) to increase adherence and/or engagement. Past work has developed several classes of Restless Multi-Armed Bandit (RMAB) solutions for this problem. However, all past RMAB approaches assume that participants' behaviour follows the Markov property. We demonstrate significant deviations from the Markov assumption in real-world data from a maternal health awareness program run by our partner NGO, ARMMAN. Moreover, we extend RMABs to continuous state spaces, a previously understudied area. To tackle the generalised non-Markovian RMAB setting, we (i) model each participant's trajectory as a time series, (ii) use time-series forecasting models to learn complex patterns and dynamics and predict future states, and (iii) propose the Time-series Arm Ranking Index (TARI) policy, a novel algorithm that selects the RMAB arms that will benefit the most from an intervention, given our future state predictions. We evaluate our approach on synthetic data and in a secondary analysis of real data from ARMMAN, and demonstrate a significant increase in engagement compared to the state-of-the-art (SOTA) deployed Whittle index solution. This translates to 16.3 additional hours of content listened to, 90.8% more engagement drops prevented, and more than twice as many high dropout-risk beneficiaries reached.
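The idea of ranking arms by forecast benefit can be illustrated with a minimal sketch. This is not the TARI index itself, whose definition the abstract does not give. It only shows the general pattern of forecasting each participant's next engagement with and without an intervention and picking the arms with the largest predicted gain under a budget. The function names, the toy forecaster, and the assumed intervention effect are all hypothetical.

```python
from typing import Callable, List, Sequence

# Hypothetical forecaster signature (an assumption for illustration):
# (engagement history, intervene?) -> predicted next engagement in [0, 1].
Forecaster = Callable[[Sequence[float], bool], float]


def naive_forecast(history: Sequence[float], intervene: bool) -> float:
    """Toy stand-in for a learned time-series model; expects a non-empty history."""
    recent = history[-3:]
    base = sum(recent) / len(recent)  # mean of the last (up to) three observations
    # Assumed effect: an intervention recovers half of the gap to full engagement.
    boost = 0.5 * (1.0 - base) if intervene else 0.0
    return min(1.0, base + boost)


def rank_arms_by_predicted_gain(
    histories: Sequence[Sequence[float]],
    forecast: Forecaster,
    budget: int,
) -> List[int]:
    """Return the indices of the `budget` arms with the largest predicted gain."""
    gains = [forecast(h, True) - forecast(h, False) for h in histories]
    order = sorted(range(len(histories)), key=lambda i: gains[i], reverse=True)
    return order[:budget]


# Three participants with continuous engagement states in [0, 1].
histories = [
    [0.9, 0.9, 0.9],  # consistently engaged: little to gain
    [0.2, 0.1, 0.0],  # dropping off: largest predicted gain
    [0.5, 0.5, 0.5],  # moderately engaged
]
selected = rank_arms_by_predicted_gain(histories, naive_forecast, budget=2)
print(selected)  # [1, 2]
```

Because the forecaster sees the whole history rather than only the current state, this pattern does not rely on the Markov property. A real system would replace `naive_forecast` with a trained forecasting model.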