Genome-wide association studies (GWASs) have been widely used to characterize the genetic architecture of complex diseases. Motivated by the limitations of GWASs in identifying small-effect loci, which are needed to understand the polygenicity of complex traits, and in fine-mapping putative causal variants from their proxies, we propose a knockoff-based method, GhostKnockoffs, that requires only GWAS summary statistics, and we demonstrate its validity in the presence of relatedness. We show that GhostKnockoffs inference is robust to the choice of input Z-scores, provided that they come from valid marginal association tests and that their correlations are consistent with the correlations among the corresponding genetic variants. This property extends GhostKnockoffs to other GWAS settings, such as meta-analysis of multiple overlapping studies and studies whose association test statistics deviate from score tests. We evaluate GhostKnockoffs using empirical simulations and a meta-analysis of nine European-ancestry genome-wide association studies and whole-exome/genome sequencing studies. Both analyses show that GhostKnockoffs identifies more putative causal variants with weak genotype-phenotype associations that conventional GWASs miss.