Large-scale neuroimaging studies often pool data from multiple scanners at different sites. Variations in scanners, scanning procedures, and other site-specific conditions can introduce artificial site effects. These effects may bias brain connectivity measures such as functional connectivity (FC), which quantifies functional network organization from functional magnetic resonance imaging (fMRI) data. How to leverage high-dimensional network structure to mitigate site effects effectively has yet to be addressed. In this paper, we propose SLACC (Sparse LAtent Covariate-driven Connectome) factorization, a multivariate method that explicitly parameterizes covariate effects in latent subject scores, each corresponding to a sparse rank-1 latent pattern derived from brain connectivity. The proposed method identifies localized site-driven variability within and across brain networks, enabling targeted correction. We develop a penalized Expectation-Maximization (EM) algorithm for parameter estimation, incorporating the Bayesian Information Criterion (BIC) to guide optimization. Extensive simulations validate SLACC's robustness in recovering the true parameters and underlying connectivity patterns. Applied to the Autism Brain Imaging Data Exchange (ABIDE) dataset, SLACC demonstrates its ability to reduce site effects. An R package implementing our method is publicly available.
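To make the idea of covariate-driven latent scores on sparse rank-1 patterns concrete, the following toy simulation may help. It is only a sketch under stated assumptions: the abstract gives no equations, so the model form (each connectivity matrix as a sum of score-weighted rank-1 patterns, with scores linear in the covariates), all dimensions, and all names (`X`, `B`, `build_connectivity`, `site_gap`) are hypothetical. The patterns are treated as known, and the site effect is estimated by plain projection and ordinary least squares. This is not the penalized EM algorithm, and it does not use the authors' R package.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, K = 200, 20, 2  # subjects, brain regions, latent patterns (all assumed)

# Hypothetical sparse, unit-norm patterns with disjoint supports.
X = np.zeros((p, K))
X[0:5, 0] = 1.0
X[10:14, 1] = 1.0
X /= np.linalg.norm(X, axis=0)

# Covariates: intercept, binary site indicator, standardized age.
site = rng.integers(0, 2, size=n)
age = rng.standard_normal(n)
Z = np.column_stack([np.ones(n), site, age])

# Assumed covariate effects (rows: intercept, site, age).
# Site shifts only pattern 0, i.e. a localized site effect.
B = np.array([[1.0, 1.0],
              [0.8, 0.0],
              [0.3, -0.2]])

# Latent subject scores are linear in the covariates plus noise.
scores = Z @ B + 0.1 * rng.standard_normal((n, K))


def build_connectivity(scores, X, noise_sd, rng):
    """Return Y[i] = sum_k scores[i, k] * x_k x_k^T + symmetric noise."""
    n_subj, n_reg = scores.shape[0], X.shape[0]
    signal = np.einsum("ik,pk,qk->ipq", scores, X, X)
    E = noise_sd * rng.standard_normal((n_subj, n_reg, n_reg))
    return signal + (E + E.transpose(0, 2, 1)) / 2


def site_gap(Y, site):
    """Largest absolute entry of the between-site difference in mean connectivity."""
    return np.abs(Y[site == 1].mean(axis=0) - Y[site == 0].mean(axis=0)).max()


Y = build_connectivity(scores, X, noise_sd=0.05, rng=rng)

# With orthonormal patterns, x_k^T Y_i x_k recovers the score of subject i on pattern k.
scores_hat = np.einsum("pk,ipq,qk->ik", X, Y, X)

# Ordinary least squares of the scores on the covariates (stand-in for the EM fit).
B_hat = np.linalg.lstsq(Z, scores_hat, rcond=None)[0]

# Targeted correction: subtract only the site-driven part, pattern by pattern.
site_scores = np.outer(site, B_hat[1])
Y_corrected = Y - np.einsum("ik,pk,qk->ipq", site_scores, X, X)

gap_before = site_gap(Y, site)
gap_after = site_gap(Y_corrected, site)
print(f"site gap before: {gap_before:.3f}, after: {gap_after:.3f}")
```

In this toy setting the correction touches only the entries supported by the pattern that carries a site effect, and the age effect is left in place. That is the sense in which a correction can be called targeted here.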