A Bayesian Prevalence Incidence Cure model for estimating survival using Electronic Health Records with incomplete baseline diagnoses

Retrospective cohorts can be extracted from Electronic Health Records (EHR) to study prevalence, time until disease or event occurrence, and cure proportion in real-world scenarios. However, because EHR are collected for patient care rather than research, they typically present complexities, such as patients with missing baseline disease status. Prevalence-Incidence (PI) models, which use a two-component mixture model to account for this missing data, have been proposed. However, PI models are biased in settings in which some individuals will never experience the endpoint (they are 'cured'). To address this, we propose a Prevalence Incidence Cure (PIC) model, a three-component mixture model that combines the PI model framework with a cure model. Our PIC model enables estimation of the prevalence, the time-to-incidence, and the cure proportion, and allows covariates to affect each of these. We adopt a Bayesian inference approach and focus on the interpretability of the prior. In a simulation study, we show that the PIC model has smaller bias than a PI model for the survival probability, and we compare inference under vague, informative and misspecified priors. We illustrate our model using a dataset of 1964 patients undergoing treatment for Diabetic Macular Oedema, demonstrating improved fit under the PIC model.
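To make the three-component structure concrete, the following is a minimal sketch of how prevalent, cured and incident fractions could combine into a population survival curve. It is not the authors' specification: the function names, the parameterisation by `p_prev` and `p_cure`, and the Weibull form for the time-to-incidence are all assumptions chosen for illustration, and covariate effects and the Bayesian prior are omitted.

```python
import math


def pic_survival(t, p_prev, p_cure, shape, scale):
    """Survival probability at time t >= 0 under an illustrative PIC mixture.

    Assumed components (hypothetical parameterisation):
      p_prev : fraction already diseased at baseline (event at time 0)
      p_cure : fraction that never experiences the endpoint
      1 - p_prev - p_cure : incident fraction, with an assumed Weibull
                            time-to-incidence (shape, scale)
    """
    p_inc = 1.0 - p_prev - p_cure
    if min(p_prev, p_cure, p_inc) < 0:
        raise ValueError("mixture weights must be non-negative and sum to 1")
    # Weibull survival function for the incident (susceptible) component.
    s_inc = math.exp(-((t / scale) ** shape))
    # Prevalent cases contribute nothing to survival; cured cases always survive.
    return p_cure + p_inc * s_inc


def pi_survival(t, p_prev, shape, scale):
    """Two-component PI analogue: the same mixture with no cured fraction."""
    return pic_survival(t, p_prev, 0.0, shape, scale)
```

Under these assumptions the curve starts at `1 - p_prev` at time zero and plateaus at `p_cure`, whereas the two-component version decays to zero. That difference in long-run behaviour is one way to see why ignoring a cured fraction could bias survival estimates.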

