The Granger framework is useful for discovering causal relations in time-varying signals. However, most Granger causality (GC) methods are developed for densely sampled time-series data. A substantially different setting, particularly common in medical imaging, is the longitudinal study design, in which multiple subjects are followed and sparsely observed over time. Longitudinal studies commonly track several biomarkers, which are likely governed by nonlinear dynamics that may have subject-specific idiosyncrasies and exhibit both direct and indirect causes. Furthermore, real-world longitudinal data often suffer from widespread missingness. Existing GC methods are not well suited to handle these issues. In this paper, we propose an approach named GLACIAL (Granger and LeArning-based CausalIty Analysis for Longitudinal studies) to fill this methodological gap by marrying GC with a multi-task neural forecasting model. GLACIAL treats subjects as independent samples and uses the model's average prediction accuracy on hold-out subjects to probe causal links. Input dropout is used to efficiently learn nonlinear dynamic relationships between a large number of variables, and model interpolation is used to handle missing values. Extensive simulations and experiments on a real longitudinal medical imaging dataset show that GLACIAL outperforms competitive baselines and confirm its utility. Our code is available at https://github.com/mnhng/GLACIAL.
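The idea of probing a causal link through prediction accuracy on hold-out subjects can be illustrated with a small sketch. This is not the GLACIAL implementation: it swaps the multi-task neural forecaster with input dropout for two ordinary least-squares models (one with and one without the candidate cause), uses fully observed toy data, and does not address missing values. All function names (`simulate`, `lagged_pairs`, `holdout_mse`, `granger_gain`) and the simulated dynamics are assumptions made for illustration only.

```python
import numpy as np

def simulate(n_subjects, n_visits, rng):
    """Toy longitudinal data (assumed dynamics): x drives y with a
    one-visit lag, while y has no influence on x."""
    x = np.zeros((n_subjects, n_visits))
    y = np.zeros((n_subjects, n_visits))
    x[:, 0] = rng.normal(size=n_subjects)
    y[:, 0] = rng.normal(size=n_subjects)
    for t in range(1, n_visits):
        x[:, t] = 0.9 * x[:, t - 1] + 0.3 * rng.normal(size=n_subjects)
        y[:, t] = (0.5 * y[:, t - 1] + 0.8 * x[:, t - 1]
                   + 0.3 * rng.normal(size=n_subjects))
    return x, y

def lagged_pairs(target, *inputs):
    """Pair the inputs at visit t with the target at visit t+1,
    pooling all subjects as independent samples."""
    feats = [v[:, :-1].reshape(-1) for v in inputs]
    feats.append(np.ones_like(feats[0]))  # intercept column
    return np.stack(feats, axis=1), target[:, 1:].reshape(-1)

def holdout_mse(target, inputs, train, test):
    """Fit a linear forecaster on training subjects and return its
    mean squared error on hold-out subjects."""
    X_tr, y_tr = lagged_pairs(target[train], *[v[train] for v in inputs])
    X_te, y_te = lagged_pairs(target[test], *[v[test] for v in inputs])
    coef, *_ = np.linalg.lstsq(X_tr, y_tr, rcond=None)
    return float(np.mean((X_te @ coef - y_te) ** 2))

def granger_gain(cause, effect, train, test):
    """Reduction in hold-out error when the candidate cause is added
    to the effect's own past (larger means stronger evidence)."""
    restricted = holdout_mse(effect, [effect], train, test)
    full = holdout_mse(effect, [effect, cause], train, test)
    return restricted - full

rng = np.random.default_rng(0)
x, y = simulate(n_subjects=200, n_visits=5, rng=rng)

# Split by subject, not by visit, so test subjects are never seen in training.
subjects = rng.permutation(200)
train, test = subjects[:150], subjects[150:]

gain_xy = granger_gain(x, y, train, test)  # candidate link x -> y
gain_yx = granger_gain(y, x, train, test)  # candidate link y -> x
print(f"hold-out error reduction, x -> y: {gain_xy:.4f}")
print(f"hold-out error reduction, y -> x: {gain_yx:.4f}")
```

In this toy setting the error reduction for the true link (x to y) should be clearly positive, while the reverse direction should stay close to zero. Splitting by subject rather than by visit mirrors the abstract's treatment of subjects as independent samples.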