Surface Masked AutoEncoder: Self-Supervision for Cortical Imaging Data

Self-supervision has been widely explored as a means of addressing the lack of inductive biases in vision transformer architectures, which limits generalisation when networks are trained on small datasets. This is crucial in the context of cortical imaging, where phenotypes are complex and heterogeneous, but the available datasets are limited in size. This paper builds upon recent advancements in translating vision transformers to surface meshes and investigates the potential of Masked AutoEncoder (MAE) self-supervision for cortical surface learning. By reconstructing surface data from a masked version of the input, the proposed method effectively models cortical structure to learn strong representations that translate to improved performance in downstream tasks. We evaluate our approach on cortical phenotype regression using the developing Human Connectome Project (dHCP) and demonstrate that pre-training leads to a 26\% improvement in performance, with an 80\% faster convergence, compared to models trained from scratch. Furthermore, we establish that pre-training vision transformer models on large datasets, such as the UK Biobank (UKB), enables the acquisition of robust representations for finetuning in low-data scenarios. Our code and pre-trained models are publicly available at \url{https://github.com/metrics-lab/surface-vision-transformers}.

翻译：自监督学习被广泛探索用于解决视觉Transformer架构中归纳偏置缺失的问题，这种缺失限制了网络在小数据集上训练时的泛化能力。这在皮层影像领域尤为关键——表型复杂多样且数据集规模有限。本文基于将视觉Transformer迁移至表面网格的最新进展，探究了掩码自编码器（MAE）自监督方法在皮层表面学习中的潜力。通过从输入数据的掩码版本中重建表面数据，所提方法有效建模皮层结构以学习强表征，进而提升下游任务性能。我们在发展性人类连接组计划（dHCP）数据集上对皮层表型回归任务进行评估，结果表明：相较于从零训练的模型，预训练可使性能提升26%且收敛速度加快80%。此外，我们在UK Biobank（UKB）等大规模数据集上验证，视觉Transformer模型的预训练能够获取鲁棒表征，从而支持低数据场景下的微调。我们的代码与预训练模型已开源至 \url{https://github.com/metrics-lab/surface-vision-transformers}。

相关内容

Microsoft Surface

关注 5

Surface 是微软公司（ Microsoft）旗下一系列使用 Windows 10（早期为 Windows 8.X）操作系统的电脑产品，目前有 Surface、Surface Pro 和 Surface Book 三个系列。 2012 年 6 月 18 日，初代 Surface Pro/RT 由时任微软 CEO 史蒂夫·鲍尔默发布于在洛杉矶举行的记者会，2012 年 10 月 26 日上市销售。

【CVPR 2022】一个完全无监督的框架，从噪声和部分测量中学习图像，Robust Equivariant Imaging: a fully unsupervised framework for learning to image

专知会员服务

25+阅读 · 2022年3月3日

【NeurIPS2021】用于文本图表示学习的 GNN 嵌套 Transformer 模型：GraphFormers

专知会员服务

46+阅读 · 2021年11月24日

【亚马逊-WWW2020】不解析,生成!用于面向任务的语义分析的序列到序列体系结构，Don't Parse, Generate! A Sequence to Sequence Architecture for Task-Oriented Semantic Parsing

专知会员服务

15+阅读 · 2020年2月1日

FlowQA: Grasping Flow in History for Conversational Machine Comprehension

专知会员服务

35+阅读 · 2019年10月18日