Computational limits to the legibility of the imaged human brain

Our knowledge of the organisation of the human brain at the population level is yet to translate into power to predict functional differences at the individual level, limiting clinical applications and casting doubt on the generalisability of inferred mechanisms. It remains unknown whether the difficulty arises from the absence of individuating biological patterns within the brain, or from limited power to access them with the models and compute at our disposal. Here we comprehensively investigate the resolvability of such patterns with data and compute at unprecedented scale. Across 23 810 unique participants from UK Biobank, we systematically evaluate the predictability of 25 individual biological characteristics from all available combinations of structural and functional neuroimaging data. Over 4526 GPU hours of computation, we train, optimise, and evaluate out-of-sample 700 individual predictive models, including fully-connected feed-forward neural networks of demographic, psychological, serological, chronic disease, and functional connectivity characteristics, and both uni- and multi-modal 3D convolutional neural network models of macro- and micro-structural brain imaging. We find a marked discrepancy between the high predictability of sex (balanced accuracy 99.7%), age (mean absolute error 2.048 years, R² 0.859), and weight (mean absolute error 2.609 kg, R² 0.625), for which we set new state-of-the-art performance, and the surprisingly low predictability of other characteristics. Neither structural nor functional imaging predicted psychology better than the coincidence of chronic disease (p<0.05). Serology predicted chronic disease (p<0.05) and was best predicted by it (p<0.001), followed by structural neuroimaging (p<0.05). Our findings suggest that either more informative imaging or more powerful models are needed to decipher individual-level characteristics from the human brain.
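
For readers unfamiliar with the three performance measures quoted above, the following is a minimal sketch of how balanced accuracy, mean absolute error, and R² are conventionally computed from out-of-sample predictions. It is an illustration under stated assumptions, not the evaluation code used in the study: the function names and the toy labels and ages are hypothetical, and R² is taken here to be the coefficient of determination, which may differ from the exact definition used by the authors.

```python
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so each class counts equally despite imbalance."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        totals[t] += 1
        if t == p:
            hits[t] += 1
    return sum(hits[c] / totals[c] for c in totals) / len(totals)


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between target and prediction."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical toy data, for illustration only (not from UK Biobank).
sex_true = [0, 0, 0, 1]
sex_pred = [0, 0, 1, 1]
age_true = [50.0, 60.0, 70.0, 80.0]
age_pred = [52.0, 59.0, 71.0, 78.0]

print(balanced_accuracy(sex_true, sex_pred))
print(mean_absolute_error(age_true, age_pred))
print(r_squared(age_true, age_pred))
```

Balanced accuracy is the natural choice for a binary target such as sex because it averages recall over classes rather than over participants, so a model cannot score well simply by favouring the majority class.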
