Polygenic hazard score (PHS) models designed for European ancestry (EUR) individuals provide ample information regarding survival risk discrimination. Incorporating such information can improve risk discrimination in a small internal non-EUR cohort. However, because the external EUR-based model and the internal individual-level data come from different populations, ignoring population heterogeneity can introduce substantial bias. In this paper, we develop a Kullback-Leibler-based Cox model (CoxKL) that integrates internal individual-level time-to-event data with external risk scores derived from published prediction models while accounting for population heterogeneity. Partial-likelihood-based KL information is used to measure the discrepancy between the external risk information and the internal data. We establish the asymptotic properties of the CoxKL estimator. Simulation studies show that the integrated model obtained with the proposed CoxKL method achieves improved estimation efficiency and prediction accuracy. We applied the proposed method to develop a trans-ancestry PHS model for prostate cancer and found that integrating a previously published EUR-based PHS with internal genotype data from African ancestry (AFR) males yielded considerable improvement in prostate cancer risk discrimination.
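To make the idea of a partial-likelihood-based KL discrepancy concrete, the following is a minimal sketch under stated assumptions, not the estimator defined in the paper. It assumes one plausible form of the objective: the Cox negative log partial likelihood plus a tuning parameter `eta` times the KL divergence, at each observed event, between the risk-set probabilities implied by the external risk score and those implied by the internal model. The names `coxkl_objective`, `fit`, `ext_score`, and `eta`, as well as the simulated data, are hypothetical and chosen only for illustration.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import logsumexp


def coxkl_objective(beta, X, time, event, ext_score, eta):
    """Illustrative objective (an assumption, not the paper's formula):
    Cox negative log partial likelihood plus eta times a KL term that
    compares external and internal risk-set probabilities."""
    lp = X @ beta  # internal linear predictor
    total = 0.0
    for i in np.flatnonzero(event):
        at_risk = time >= time[i]  # risk set at the i-th event time
        log_norm = logsumexp(lp[at_risk])
        nll_i = -(lp[i] - log_norm)  # partial-likelihood contribution
        log_p_int = lp[at_risk] - log_norm
        log_p_ext = ext_score[at_risk] - logsumexp(ext_score[at_risk])
        # KL(external || internal) over subjects in the risk set
        kl_i = np.sum(np.exp(log_p_ext) * (log_p_ext - log_p_int))
        total += nll_i + eta * kl_i
    return total / len(time)


def fit(X, time, event, ext_score, eta):
    """Minimize the illustrative objective with a generic optimizer."""
    beta0 = np.zeros(X.shape[1])
    res = minimize(coxkl_objective, beta0,
                   args=(X, time, event, ext_score, eta), method="BFGS")
    return res.x


# Simulated toy data (hypothetical): the external model uses coefficients
# that differ from the internal truth, mimicking population heterogeneity.
rng = np.random.default_rng(0)
n = 200
X = rng.normal(size=(n, 3))
beta_true = np.array([0.5, -0.5, 0.3])
beta_ext = np.array([0.8, -0.2, 0.0])
t_event = rng.exponential(1.0 / np.exp(X @ beta_true))
t_cens = rng.exponential(2.0, size=n)
time = np.minimum(t_event, t_cens)
event = t_event <= t_cens
ext_score = X @ beta_ext

beta_internal = fit(X, time, event, ext_score, eta=0.0)    # internal data only
beta_integrated = fit(X, time, event, ext_score, eta=1.0)  # borrows external info
print("internal only:", beta_internal)
print("integrated   :", beta_integrated)
```

In this sketch, `eta = 0` reduces to an ordinary Cox fit on the internal data, while larger values pull the estimate toward the ranking implied by the external score. In practice `eta` would need to be chosen from the data, for example by cross-validation, so that a poorly matched external model receives little weight.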