Large-scale genome-wide association studies (GWAS) offer an exciting opportunity to discover putative causal genes or risk factors associated with diseases by using SNPs as instrumental variables (IVs). However, conventional approaches assume linear causal relations, partly for simplicity and partly to accommodate GWAS summary data. In this work, we propose a novel model for transcriptome-wide association studies (TWAS) that incorporates nonlinear relationships across IVs, an exposure/gene, and an outcome. The model is robust against violations of the valid IV assumptions, permits the use of GWAS summary data, and covers two-stage least squares as a special case. We decouple the estimation of a marginal causal effect from that of a nonlinear transformation: the former is estimated via sliced inverse regression and a sparse instrumental variable regression, and the latter by a ratio-adjusted inverse regression. On this basis, we propose an inferential procedure. An application of the proposed method to the ADNI gene expression data and the IGAP GWAS summary data identifies 18 causal genes associated with Alzheimer's disease, including APOE and TOMM40, in addition to 7 other genes missed by two-stage least squares, which considers only linear relationships. Our findings suggest that nonlinear modeling is required to unleash the power of IV regression for identifying potentially nonlinear gene-trait associations. Accompanying this paper is our Python library \texttt{nl-causal} (\url{https://nonlinear-causal.readthedocs.io/}), which implements the proposed method.
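For orientation, the linear special case mentioned above, two-stage least squares, can be sketched in a few lines of NumPy. This is only an illustrative sketch on simulated data, not the proposed nonlinear method and not the \texttt{nl-causal} API; the function name `two_stage_least_squares`, the simulated genotypes, and all effect sizes are assumptions made for the example.

```python
import numpy as np


def two_stage_least_squares(Z, x, y):
    """Plain linear 2SLS (hypothetical helper, not part of nl-causal).

    Stage 1 regresses the exposure x on the instruments Z.
    Stage 2 regresses the outcome y on the stage-1 fitted exposure.
    Returns the stage-2 slope, i.e. the estimated causal effect.
    """
    n = len(x)
    Z1 = np.column_stack([np.ones(n), Z])            # instruments plus intercept
    coef1, *_ = np.linalg.lstsq(Z1, x, rcond=None)   # stage 1
    x_hat = Z1 @ coef1                               # fitted exposure
    X1 = np.column_stack([np.ones(n), x_hat])
    coef2, *_ = np.linalg.lstsq(X1, y, rcond=None)   # stage 2
    return coef2[1]


# Simulated example (all values are assumptions for illustration only).
rng = np.random.default_rng(0)
n, p = 20000, 5
Z = rng.binomial(2, 0.3, size=(n, p)).astype(float)  # SNP genotypes coded 0/1/2
u = rng.normal(size=n)                               # unmeasured confounder
alpha = np.array([0.5, -0.4, 0.3, 0.6, -0.5])        # SNP effects on the exposure
x = Z @ alpha + u + rng.normal(size=n)               # exposure / gene expression
beta_true = 0.8
y = beta_true * x - u + rng.normal(size=n)           # outcome, confounded by u

beta_2sls = two_stage_least_squares(Z, x, y)
beta_ols = np.polyfit(x, y, 1)[0]                    # naive regression of y on x
print(f"true={beta_true:.2f}  2SLS={beta_2sls:.3f}  OLS={beta_ols:.3f}")
```

In this simulation the naive regression is biased by the confounder, while the IV-based estimate is not, because the simulated SNPs affect the outcome only through the exposure. The sketch assumes valid instruments and a linear exposure-outcome relation, which are the two restrictions the proposed model is designed to relax.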