2D visual foundation models such as DINOv3, a self-supervised model trained on large-scale natural images, have demonstrated strong zero-shot generalization, capturing both rich global context and fine-grained structural cues. However, an analogous 3D foundation model for downstream volumetric neuroimaging tasks is still lacking, largely because 3D images are difficult to acquire and high-quality annotations are scarce. To address this gap, we adapt the 2D visual representations learned by DINOv3 to a 3D biomedical segmentation model, enabling more data-efficient and morphologically faithful neuronal reconstruction. Specifically, we design an inflation-based adaptation strategy that inflates 2D filters into 3D operators, preserving the semantic priors of DINOv3 while adapting to 3D neuronal volume patches. In addition, we introduce a topology-aware skeleton loss that explicitly enforces the structural fidelity of graph-based neuronal arbor reconstruction. Extensive experiments on four neuronal imaging datasets, two from BigNeuron and two public datasets (NeuroFly and CWMBS), demonstrate consistent improvements in reconstruction accuracy over state-of-the-art methods, with average gains of 2.9% in Entire Structure Average, 2.8% in Different Structure Average, and 3.8% in Percentage of Different Structure. Code: https://github.com/yy0007/NeurINO.
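To make the idea of inflating 2D filters into 3D operators concrete, the following is a minimal sketch of one common inflation scheme: the 2D kernel is replicated along a new depth axis and rescaled by the depth. This scheme is an assumption chosen for illustration; the abstract does not specify which inflation rule is used, and the function name `inflate_2d_kernel`, the tensor shapes, and the random weights are all hypothetical.

```python
import numpy as np


def inflate_2d_kernel(weight_2d: np.ndarray, depth: int) -> np.ndarray:
    """Inflate a 2D conv kernel (out, in, kh, kw) to 3D (out, in, depth, kh, kw).

    Assumed scheme: copy the 2D kernel along a new depth axis and divide by
    `depth`, so that a volume that is constant along depth produces the same
    response as the original 2D filter applied to a single slice.
    """
    if weight_2d.ndim != 4:
        raise ValueError("expected a kernel of shape (out, in, kh, kw)")
    if depth < 1:
        raise ValueError("depth must be a positive integer")
    # Insert the depth axis, repeat the kernel along it, then rescale.
    return np.repeat(weight_2d[:, :, None, :, :], depth, axis=2) / depth


# Hypothetical example: a patch-embedding-like filter bank with random weights.
rng = np.random.default_rng(0)
w2d = rng.standard_normal((8, 3, 16, 16))
w3d = inflate_2d_kernel(w2d, depth=16)
```

Under this scheme, summing the inflated kernel over its depth axis recovers the original 2D kernel, which is the sense in which the pretrained 2D response is preserved at initialization before any fine-tuning on 3D volume patches.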