To address the challenge of analysing the large-scale Adolescent Brain Cognitive Development (ABCD) fMRI dataset, which involves over 5,000 subjects and extensive neuroimaging data, we propose a scalable Bayesian scalar-on-image regression model designed for computational feasibility and efficiency. Our model employs a relaxed-thresholded Gaussian process (RTGP), which accommodates piecewise-smooth, sparse, and continuous functions and is capable of both hard- and soft-thresholding. This approach adds flexibility to feature selection in scalar-on-image regression and yields scalable posterior computation by adopting a variational approximation and utilising the Karhunen-Loève expansion for Gaussian processes. This advancement substantially reduces the computational cost of vertex-wise analysis of cortical surface data in large-scale Bayesian spatial models. We validate the model's parameter estimation, prediction accuracy, and feature selection performance through extensive simulation studies and an application to the ABCD study, in which we regress intelligence scores on task-based functional MRI data while adjusting for confounding factors including age, sex, and parental education level. This validation highlights the model's capability to handle large-scale neuroimaging data while maintaining computational feasibility and accuracy.
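To make the two ingredients named in the abstract concrete, the following is a minimal illustrative sketch, not the RTGP model itself. It draws a latent function from a truncated Karhunen-Loève expansion and then applies the standard hard- and soft-thresholding operators. Several things here are assumptions made only for illustration: the latent process is standard Brownian motion on [0, 1] (chosen because its Karhunen-Loève basis is known in closed form; the abstract does not specify a kernel), the domain is a 1-D grid rather than a cortical surface, and the names `kl_brownian_sample`, `hard_threshold`, `soft_threshold`, and the threshold `delta` are hypothetical. The relaxed-thresholding construction and the variational inference scheme are not reproduced.

```python
import math
import random


def kl_brownian_sample(ts, n_terms, rng):
    """Truncated Karhunen-Loeve draw of standard Brownian motion on [0, 1].

    Assumption for illustration only: Brownian motion stands in for the
    latent Gaussian process, since its eigenpairs have a closed form.
    """
    # One standard normal coefficient per retained basis function.
    z = [rng.gauss(0.0, 1.0) for _ in range(n_terms)]
    values = []
    for t in ts:
        total = 0.0
        for k, z_k in enumerate(z, start=1):
            freq = (k - 0.5) * math.pi
            # sqrt(eigenvalue) = 1 / freq; eigenfunction = sqrt(2) * sin(freq * t)
            total += z_k * math.sqrt(2.0) * math.sin(freq * t) / freq
        values.append(total)
    return values


def hard_threshold(x, delta):
    """Keep x unchanged if |x| exceeds delta, otherwise set it to zero."""
    return x if abs(x) > delta else 0.0


def soft_threshold(x, delta):
    """Shrink x towards zero by delta, setting it to zero if |x| <= delta."""
    return math.copysign(max(abs(x) - delta, 0.0), x)


rng = random.Random(0)
grid = [i / 50 for i in range(51)]       # hypothetical 1-D "image" locations
latent = kl_brownian_sample(grid, n_terms=20, rng=rng)

delta = 0.3                              # hypothetical threshold
beta_hard = [hard_threshold(v, delta) for v in latent]
beta_soft = [soft_threshold(v, delta) for v in latent]
```

In this sketch both thresholded coefficient functions are exactly zero wherever the latent draw stays within `delta` of zero, which is what makes thresholding a feature selection device. The hard-thresholded version jumps at the threshold, whereas the soft-thresholded version is continuous but shrinks every retained value by `delta`.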