Existing statistical methods can estimate a policy, i.e., a mapping from covariates to decisions, which can then guide decision makers (e.g., whether to administer hypotension treatment based on the covariates blood pressure and heart rate). There is great interest in using such data-driven policies in healthcare. However, it is often important to explain to the healthcare provider, and to the patient, how a new policy differs from the current standard of care. Such an explanation is easier if one can pinpoint the aspects of the policy (i.e., the parameters for blood pressure and heart rate) that change when moving from the standard of care to the new, suggested policy. To this end, we adapt ideas from Trust Region Policy Optimization (TRPO). Unlike in TRPO, however, we require the difference between the suggested policy and the standard of care to be sparse, which aids interpretability. This yields ``relative sparsity,'' where, as a function of a tuning parameter, $\lambda$, we can approximately control the number of parameters in our suggested policy that differ from their counterparts in the standard of care (e.g., heart rate only). We propose a criterion for selecting $\lambda$, perform simulations, and illustrate our method with a real, observational healthcare dataset, deriving a policy that is easy to explain in the context of the current standard of care. Our work promotes the adoption of data-driven decision aids, which have great potential to improve health outcomes.