Scarcity of health care resources can make rationing unavoidable. Ventilators, for example, are often in limited supply, especially during public health emergencies such as the COVID-19 pandemic or in resource-constrained health care settings. There is currently no universally accepted standard for health care resource allocation protocols, so different governments prioritize patients using differing criteria and heuristic-based protocols. In this study, we investigate reinforcement learning as a way to optimize critical care resource allocation policies so that resources are rationed fairly and effectively. We propose a transformer-based deep Q-network that integrates the disease progression of individual patients with the interaction effects among patients during critical care resource allocation. We aim to improve both the fairness of allocation and overall patient outcomes. Our experiments demonstrate that, under different levels of ventilator shortage, our method significantly reduces excess deaths and achieves a more equitable distribution than the existing severity-based and comorbidity-based methods used by different governments. Our source code is included in the supplement and will be released on GitHub upon publication.