Evaluating COVID-19 vaccine allocation policies using Bayesian $m$-top exploration

Individual-based epidemiological models support the in silico study of fine-grained preventive measures, such as tailored vaccine allocation policies. Because individual-based models are computationally intensive, it is pivotal to identify optimal strategies within a reasonable computational budget. Moreover, because implementing preventive strategies has a high societal impact, the uncertainty surrounding decisions should be communicated to policy makers; a Bayesian approach quantifies this uncertainty naturally. We present a novel technique for evaluating vaccine allocation strategies that combines a multi-armed bandit framework with a Bayesian anytime $m$-top exploration algorithm. $m$-top exploration lets the algorithm learn the $m$ policies for which it expects the highest utility, so that experts can inspect this small set of alternative strategies along with their quantified uncertainty. The anytime component gives policy advisors flexibility regarding computation time and desired confidence, which is important because this trade-off is difficult to make beforehand. We consider the Belgian COVID-19 epidemic using the individual-based model STRIDE, where we learn a set of vaccination policies that minimise the number of infections and hospitalisations. Through experiments we show that our method can efficiently identify the $m$-top policies, which we validate in a scenario where the ground truth is available. Finally, we explore how vaccination policies can best be organised under different contact reduction schemes. These experiments show that the top policies follow a clear trend regarding the prioritised age groups and the assigned vaccine type, which provides insights for future vaccination campaigns.
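
To make the idea of Bayesian anytime $m$-top exploration concrete, the following is a minimal sketch and not necessarily the algorithm used in this work. It assumes that each policy is a bandit arm with Gaussian utility of known noise level, a flat prior, and a sampling rule that draws from each posterior and then evaluates an arm at the boundary between rank $m$ and rank $m+1$. The names `m_top_exploration`, `simulate` and `TRUE_UTILITY` are hypothetical, and the utility values are made up; in the setting above, a call to `simulate` would stand for one STRIDE run returning, for example, the negated number of infections.

```python
import math
import random


def m_top_exploration(pull, n_arms, m, budget, noise_sd=1.0, seed=0):
    """Sketch of sampling-based m-top exploration (illustrative assumptions only).

    pull(arm, rng) returns one noisy utility observation for a policy (arm).
    Assumes Gaussian noise with known standard deviation and a flat prior,
    so the posterior over each arm's mean is N(sample mean, noise_sd^2 / n).
    """
    assert 0 < m < n_arms and budget >= n_arms
    rng = random.Random(seed)
    counts = [0] * n_arms
    sums = [0.0] * n_arms

    def observe(arm):
        sums[arm] += pull(arm, rng)
        counts[arm] += 1

    # Evaluate every policy once so that each posterior is defined.
    for arm in range(n_arms):
        observe(arm)

    # Anytime loop: it can be stopped after any iteration, and the current
    # posteriors still give a recommendation with quantified uncertainty.
    for _ in range(budget - n_arms):
        draws = [
            rng.gauss(sums[a] / counts[a], noise_sd / math.sqrt(counts[a]))
            for a in range(n_arms)
        ]
        order = sorted(range(n_arms), key=lambda a: draws[a], reverse=True)
        # Spend the next evaluation at the boundary: rank m or rank m + 1.
        observe(order[m - 1 + rng.randrange(2)])

    means = [sums[a] / counts[a] for a in range(n_arms)]
    sds = [noise_sd / math.sqrt(counts[a]) for a in range(n_arms)]
    top = sorted(range(n_arms), key=lambda a: means[a], reverse=True)[:m]
    return top, means, sds, counts


# Made-up utilities for six hypothetical policies (higher is better).
TRUE_UTILITY = [0.0, 0.2, 0.4, 2.0, 2.5, 3.0]


def simulate(arm, rng):
    # Stand-in for one stochastic run of an individual-based model.
    return rng.gauss(TRUE_UTILITY[arm], 0.5)


top, means, sds, counts = m_top_exploration(
    simulate, n_arms=6, m=3, budget=600, noise_sd=0.5
)
for arm in top:
    print(f"policy {arm}: mean utility {means[arm]:.2f} (posterior sd {sds[arm]:.2f})")
```

Under these assumptions, most evaluations go to the policies whose ranking around position $m$ is still uncertain. The posterior standard deviations returned with the top set are one simple way to report the remaining uncertainty to decision makers.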
