We present BayesPIM, a Bayesian prevalence-incidence mixture model for estimating time- and covariate-dependent disease incidence from screening and surveillance data. The method is particularly suited to settings where some individuals may have the disease at baseline, baseline tests may be missing or incomplete, and the screening test has imperfect sensitivity. This setting arose in data from high-risk colorectal cancer (CRC) surveillance through colonoscopy, where adenomas, precursors of CRC, were already present in some individuals at baseline and remained undetected due to imperfect test sensitivity. By including covariates, the model can quantify heterogeneity in disease risk, thereby informing personalized screening strategies. Internally, BayesPIM uses a Metropolis-within-Gibbs sampler with data augmentation and weakly informative priors on the incidence and prevalence model parameters. In simulations based on the real-world CRC surveillance data, we show that BayesPIM estimates model parameters without bias while handling latent prevalence and imperfect test sensitivity. However, informative priors on the test sensitivity are needed to stabilize estimation and mitigate non-convergence issues. We also show how conditioning incidence and prevalence estimates on covariates explains heterogeneity in adenoma risk, and how model fit can be assessed using information criteria and a non-parametric estimator.
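To make the data setting concrete, the following is a minimal simulation sketch of how screening observations could arise under a prevalence-incidence mixture with imperfect test sensitivity. It is not the BayesPIM implementation or its sampler. The logistic prevalence model, the Weibull onset time, the function name `simulate_subject`, and all parameter values are assumptions chosen for illustration only.

```python
import math
import random


def simulate_subject(rng, x, visit_times, sensitivity,
                     prev_intercept=-1.0, prev_slope=0.5,
                     inc_shape=1.5, inc_scale=8.0, inc_slope=-0.3):
    """Simulate one screened subject (hypothetical parameterization).

    Returns (left, right): `right` is the visit at which disease is first
    detected (math.inf if never detected) and `left` is the last visit
    with a negative test before that (0.0 if there was none).
    """
    # Prevalence part (assumed logistic): disease already present at baseline.
    p_prev = 1.0 / (1.0 + math.exp(-(prev_intercept + prev_slope * x)))
    prevalent = rng.random() < p_prev

    # Incidence part (assumed Weibull): onset time for non-prevalent subjects,
    # with the scale depending on the covariate x.
    if prevalent:
        onset = 0.0
    else:
        scale = inc_scale * math.exp(inc_slope * x)
        onset = rng.weibullvariate(scale, inc_shape)  # (scale, shape)

    last_negative = 0.0
    for t in visit_times:
        diseased = onset <= t
        # A diseased subject tests positive only with probability `sensitivity`,
        # so prevalent disease can be missed at baseline and found later.
        if diseased and rng.random() < sensitivity:
            return last_negative, t
        last_negative = t
    return last_negative, math.inf  # right-censored: never detected


if __name__ == "__main__":
    rng = random.Random(1)
    visits = [0.0, 1.0, 3.0, 5.0]  # baseline test at time 0, then follow-up
    for _ in range(5):
        print(simulate_subject(rng, x=1.0, visit_times=visits, sensitivity=0.8))
```

In this sketch a detection at a follow-up visit is ambiguous: it may reflect newly incident disease or prevalent disease missed by an earlier test, which is the latent structure that a data-augmentation sampler would need to resolve.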