Linkage analysis has provided valuable insights for genome-wide association studies (GWAS), particularly by revealing that SNPs in linkage disequilibrium (LD) can jointly influence disease phenotypes. However, the potential of LD network data has often been overlooked or underutilized in the literature. In this paper, we propose a locally adaptive structure learning algorithm (LASLA) that provides a principled and generic framework for incorporating network data or multiple samples of auxiliary data from related source domains, possibly in different dimensions or structures and from diverse populations. LASLA employs a $p$-value weighting approach, utilizing structural insights to assign data-driven weights to individual test points. Theoretical analysis shows that LASLA asymptotically controls the false discovery rate (FDR) when the primary statistics are independent or weakly dependent, and achieves higher power when the network data are informative. The efficiency gain of LASLA is illustrated through various synthetic experiments and an application to the identification of SNPs associated with type 2 diabetes (T2D).
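To make the idea of $p$-value weighting concrete, the following is a minimal sketch of a generic weighted Benjamini-Hochberg procedure. It is not the LASLA procedure itself: the abstract does not specify how LASLA derives its weights from network or auxiliary data, so the `weights` argument here is a hypothetical stand-in supplied by the caller, and the function name `weighted_bh` is an assumption made for illustration.

```python
def weighted_bh(p_values, weights, alpha=0.05):
    """Generic weighted Benjamini-Hochberg step-up procedure.

    Illustrative only: the weights are taken as given (hypothetical input),
    whereas LASLA learns them from structural side information.
    Returns a list of booleans, True where the hypothesis is rejected.
    """
    m = len(p_values)
    if m != len(weights):
        raise ValueError("p_values and weights must have the same length")
    if m == 0:
        return []
    if any(w <= 0 for w in weights):
        raise ValueError("weights must be strictly positive")

    # Normalize the weights so that they average to one.
    mean_w = sum(weights) / m
    # A larger weight shrinks the p-value, making rejection easier.
    adjusted = [min(1.0, p / (w / mean_w)) for p, w in zip(p_values, weights)]

    # Step-up rule: find the largest rank k with adjusted p_(k) <= alpha * k / m.
    order = sorted(range(m), key=lambda i: adjusted[i])
    k_max = 0
    for rank, i in enumerate(order, start=1):
        if adjusted[i] <= alpha * rank / m:
            k_max = rank

    rejected = set(order[:k_max])
    return [i in rejected for i in range(m)]
```

With equal weights this reduces to the ordinary Benjamini-Hochberg procedure. Giving a larger weight to a test point that side information suggests is more likely to be a true signal can move it across the rejection boundary, which is the sense in which informative weights can increase power.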