Population-level heterogeneity is ubiquitous in biomedical data, where differences across demographic or clinical subgroups can substantially alter risk patterns. For example, in intensive care unit (ICU) studies, the mortality risk associated with specific admission diagnoses can vary across ethnic groups. Existing approaches for detecting risk heterogeneity are often sensitive to baseline model misspecification and regularization bias, both of which commonly arise in practice. In this paper, we propose a robust framework for inferring risk heterogeneity between two populations. The framework is built on Neyman orthogonality, which yields estimators that are locally insensitive to error in the estimated nuisance parameters. The proposed estimator is consistent and asymptotically normal, and simulation studies show that, in finite samples, our method substantially reduces bias and improves inferential stability compared with standard likelihood-based approaches. In an application to the eICU Collaborative Research Database, our method reveals clinically meaningful differences across ethnic groups in how admission diagnoses relate to in-hospital mortality, differences that standard likelihood-based methods fail to detect.
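For readers unfamiliar with the term, Neyman orthogonality is usually stated as a pair of conditions on a score function. The notation below (data $W$, target parameter $\theta$, nuisance parameter $\eta$, score $\psi$, true values $\theta_0, \eta_0$) is the generic notation of the orthogonal-score literature and is an assumption here; the abstract does not specify the paper's own score or symbols.

```latex
% Generic definition of a Neyman-orthogonal score (notation assumed, not taken from the paper).
\[
  \mathbb{E}\big[\psi(W;\theta_0,\eta_0)\big] = 0,
  \qquad
  \frac{\partial}{\partial r}\,
  \mathbb{E}\Big[\psi\big(W;\theta_0,\ \eta_0 + r(\eta-\eta_0)\big)\Big]\Big|_{r=0} = 0
  \quad \text{for all admissible } \eta .
\]
```

The first condition identifies $\theta_0$; the second says that a first-order perturbation of the nuisance parameter around its true value leaves the moment unchanged, so nuisance estimation error affects the estimator of $\theta_0$ only at second order. This is the sense in which the estimator is "locally insensitive" to such error.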