Statistical techniques are needed to analyse data structures with complex dependencies so that clinically useful information can be extracted. Individual-specific networks, which capture dependencies in complex biological systems, are often summarized by graph-theoretical features. These features, which lend themselves to outcome modelling, can be highly variable because of noise and arbitrary decisions in network inference. Correlation-based adjacency matrices often need to be sparsified before meaningful graph-theoretical features can be extracted, which requires the data analyst to choose an optimal threshold. To address this issue, we propose to incorporate a flexible weighting function over the full range of possible thresholds to capture the variability of graph-theoretical features over the threshold domain. The potential of this approach, which extends concepts from functional data analysis to a graph-theoretical setting, is explored in a plasmode simulation study using real functional magnetic resonance imaging (fMRI) data from the Autism Brain Imaging Data Exchange (ABIDE) Preprocessed initiative. The simulations show that our modelling approach yields accurate estimates of the functional form of the weight function, improves inference efficiency, and achieves a comparable or reduced root mean square prediction error compared to competitor modelling approaches. This holds both when complex functional forms underlie the outcome-generating process and when a single universal threshold value is employed. We demonstrate the practical utility of our approach by using resting-state fMRI data to predict biological age in children. Our study establishes the flexible modelling approach as a statistically principled, serious competitor to ad-hoc methods, with superior performance.
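The idea of weighting a graph-theoretical feature over the whole threshold domain can be illustrated with a minimal sketch. It is not the implementation used in the study. It rests on several assumptions made purely for illustration: edge density stands in for the graph-theoretical feature, the weight function is expanded in a low-order polynomial basis, integrals are approximated with the trapezoid rule, and the function names (`trapezoid`, `density_curve`, `fit_weight_function`) and the simulated data are hypothetical.

```python
import numpy as np


def trapezoid(values, grid):
    """Trapezoid-rule integral of `values` over `grid` along the last axis."""
    dt = np.diff(grid)
    return np.sum(dt * (values[..., 1:] + values[..., :-1]) / 2.0, axis=-1)


def density_curve(corr, thresholds):
    """Edge density of the network that keeps edges with |r| >= t, for each t."""
    upper = np.abs(corr[np.triu_indices(corr.shape[0], k=1)])
    return np.array([(upper >= t).mean() for t in thresholds])


def fit_weight_function(curves, y, thresholds, degree=2):
    """Least-squares fit of y_i = a + integral of X_i(t) w(t) dt.

    Assumption for this sketch: w(t) is a polynomial of the given degree.
    Returns the intercept and the estimated weight function on the grid.
    """
    # basis[k, j] = thresholds[j] ** k
    basis = np.vander(thresholds, degree + 1, increasing=True).T
    # design[i, k] = integral of X_i(t) * t**k dt
    design = trapezoid(curves[:, None, :] * basis[None, :, :], thresholds)
    design = np.column_stack([np.ones(len(y)), design])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[0], coef[1:] @ basis


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    thresholds = np.linspace(0.0, 1.0, 21)
    # Simulated stand-in for per-subject region time series (not ABIDE data).
    curves = np.array([
        density_curve(np.corrcoef(rng.standard_normal((10, 50))), thresholds)
        for _ in range(40)
    ])
    true_weight = np.sin(np.pi * thresholds)
    y = trapezoid(curves * true_weight, thresholds) + 0.01 * rng.standard_normal(40)
    intercept, weight_hat = fit_weight_function(curves, y, thresholds, degree=2)
    print(np.round(weight_hat, 2))
```

In this sketch each subject contributes a whole feature curve rather than a feature evaluated at one chosen threshold, and the estimated weight function indicates which parts of the threshold domain carry information about the outcome. Where the curves barely vary across subjects, the weight is poorly identified, so the estimate there should be read with caution.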