There is limited understanding of how dietary behaviors cluster together and influence cardiometabolic health at a population level in Puerto Rico. Data are scarce, particularly outside of urban areas, and are often limited to non-probability samples (NPS) whose inclusion mechanisms are unknown. Generalizing results to the broader Puerto Rican population requires adjusting for selection bias, but such adjustments are difficult to implement for NPS data. Although Bayesian latent class models summarize dietary behavior variables through underlying patterns, they have not yet been adapted to the NPS setting. We propose a novel Weighted Overfitted Latent Class Analysis for Non-probability samples (WOLCAN). WOLCAN uses a quasi-randomization framework to (1) model pseudo-weights for an NPS using Bayesian additive regression trees (BART) and a reference probability sample, and (2) integrate the pseudo-weights within a weighted pseudo-likelihood approach for Bayesian latent class analysis, while propagating pseudo-weight uncertainty into parameter estimation. A stacked sample approach allows individuals to be shared between the NPS and the reference sample. We evaluate model performance through simulations and apply WOLCAN to data from the Puerto Rico Observational Study of Psychosocial, Environmental, and Chronic Disease Trends (PROSPECT). We identify dietary behavior patterns for adults in Puerto Rico aged 30 to 75 and examine their associations with type 2 diabetes, hypertension, and hypercholesterolemia. Our findings suggest that an out-of-home eating pattern is associated with a higher likelihood of these cardiometabolic outcomes compared to a nutrition-sensitive pattern. WOLCAN effectively reveals generalizable dietary behavior patterns and demonstrates relevant applications in studying diet-disease relationships.
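To make the weighted pseudo-likelihood step concrete, the following is a minimal sketch of how pseudo-weights can enter a latent class likelihood: each individual's log-likelihood contribution is multiplied by that individual's weight. It is an illustration under stated assumptions, not the WOLCAN implementation. The function name `weighted_lca_loglik`, the array layout, and the choice to rescale the weights to sum to the sample size are all assumptions made here; the pseudo-weights are taken as given rather than estimated with BART, and no uncertainty propagation is shown.

```python
import numpy as np


def weighted_lca_loglik(x, weights, pi, theta):
    """Weighted pseudo-log-likelihood of a latent class model (illustrative).

    x:       (n, J) integer array of item responses coded 0..R-1
    weights: (n,) pseudo-weights, assumed to be estimated beforehand
    pi:      (K,) class membership probabilities
    theta:   (J, K, R) response probabilities for each item and class
    """
    x = np.asarray(x)
    weights = np.asarray(weights, dtype=float)
    pi = np.asarray(pi, dtype=float)
    theta = np.asarray(theta, dtype=float)
    n, J = x.shape
    K = pi.shape[0]

    # Assumption: rescale the weights so that they sum to the sample size n.
    w = weights * n / weights.sum()

    # log theta[j, k, x_ij] for every individual i, item j, class k: (n, J, K)
    log_item = np.log(theta)[
        np.arange(J)[None, :, None],
        np.arange(K)[None, None, :],
        x[:, :, None],
    ]

    # log P(x_i, class k) = log pi_k + sum_j log theta[j, k, x_ij]: (n, K)
    log_joint = log_item.sum(axis=1) + np.log(pi)

    # Marginalize over classes with a numerically stable log-sum-exp.
    m = log_joint.max(axis=1, keepdims=True)
    log_marginal = m[:, 0] + np.log(np.exp(log_joint - m).sum(axis=1))

    # Each individual's contribution is scaled by its normalized weight.
    return float(np.sum(w * log_marginal))
```

With equal weights the function reduces to the ordinary latent class log-likelihood, so the weights only matter when the sample over- or under-represents parts of the target population.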