Characterizing the genetic basis of survival traits, such as age at disease onset, is critical for risk stratification, early intervention, and elucidating biological mechanisms that can inform therapeutic development. However, time-to-event outcomes in human cohorts are frequently right-censored, which complicates both the estimation and the partitioning of total heritability. Modern biobanks linked to electronic health records offer unprecedented power to dissect the genetic basis of age-at-diagnosis traits at scale. Yet few methods exist for estimating and partitioning the total heritability of censored survival traits, and those that do impose restrictive distributional assumptions on genetic and environmental effects and do not scale to biobanks with a million subjects. We introduce a censored multiple variance component model to robustly estimate the total heritability of survival traits under right-censoring. Through extensive simulations, we demonstrate that the method provides accurate total heritability estimates for right-censored traits at censoring rates of up to 80%, given sufficient sample size. The method is also computationally efficient: it estimates one hundred genetic variance components of a survival trait from biobank-scale genotype data comprising a million subjects and a million SNPs in under nine hours, including uncertainty quantification. We apply the method to estimate the total heritability of four age-at-diagnosis traits in the UK Biobank. Our results establish a scalable and robust framework for heritability analysis of right-censored survival traits in large-scale genetic studies.