Training deep neural networks for 3D segmentation tasks can be challenging, often requiring efficient and effective strategies to improve model performance. In this study, we introduce DeCode, a novel approach that uses label-derived features for model conditioning to dynamically support the decoder during reconstruction, with the aim of making the training process more efficient. DeCode aims to improve 3D segmentation performance by incorporating a conditioning embedding with a learned numerical representation of 3D label shape features. Specifically, conditioning is applied during the training phase to guide the network toward robust segmentation. Because labels are not available during inference, the model infers the necessary conditioning embedding directly from the input data, using a feed-forward network learned during the training phase. We test the approach on synthetic data and on cone-beam computed tomography (CBCT) images of teeth. For CBCT, three datasets are used: one publicly available and two in-house. Our results show that DeCode significantly outperforms traditional, unconditioned models in generalization to unseen data, achieving higher accuracy at a reduced computational cost. This work is the first of its kind to explore conditioning strategies in 3D data segmentation, offering a novel and more efficient method for leveraging annotated data. Our code and pre-trained models are publicly available at https://github.com/SanoScience/DeCode.
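The conditioning idea described above can be illustrated with a minimal pure-Python sketch. It is not the DeCode implementation: the abstract does not specify which shape features or which conditioning mechanism are used, so the toy features (foreground volume fraction and bounding-box extents), the FiLM-style scale-and-shift modulation, and all function names (`shape_features`, `linear`, `film`, `select_embedding`) are assumptions chosen for illustration only.

```python
def shape_features(label):
    """Toy label-derived features from a binary 3D mask indexed [z][y][x].

    Assumption: volume fraction plus per-axis bounding-box extent; the
    features actually used by DeCode are not given in the abstract.
    """
    dims = (len(label), len(label[0]), len(label[0][0]))
    coords = [(z, y, x)
              for z, plane in enumerate(label)
              for y, row in enumerate(plane)
              for x, v in enumerate(row) if v]
    if not coords:
        return [0.0, 0.0, 0.0, 0.0]
    feats = [len(coords) / (dims[0] * dims[1] * dims[2])]
    for axis, size in enumerate(dims):
        vals = [c[axis] for c in coords]
        feats.append((max(vals) - min(vals) + 1) / size)
    return feats


def linear(weights, bias, vec):
    """Plain affine map: one output per weight row."""
    return [sum(w * v for w, v in zip(row, vec)) + b
            for row, b in zip(weights, bias)]


def film(channels, embedding, w_gamma, b_gamma, w_beta, b_beta):
    """Scale and shift each decoder channel (a flat list of voxel values)
    with per-channel parameters predicted from the conditioning embedding.
    FiLM-style modulation is an assumed stand-in for the paper's mechanism.
    """
    gamma = linear(w_gamma, b_gamma, embedding)
    beta = linear(w_beta, b_beta, embedding)
    return [[g * v + b for v in ch]
            for ch, g, b in zip(channels, gamma, beta)]


def select_embedding(predicted, label=None):
    """Training: derive the embedding from the label.
    Inference (no label): fall back to the embedding predicted from the
    input by a learned network, represented here by `predicted`.
    """
    return shape_features(label) if label is not None else predicted


# Tiny 2x2x2 example: two foreground voxels in the first z-slice.
label = [[[0, 1], [0, 1]], [[0, 0], [0, 0]]]
train_emb = select_embedding(predicted=None, label=label)
infer_emb = select_embedding(predicted=[0.2, 0.5, 0.9, 0.5])

decoder_channels = [[1.0, 2.0], [3.0, 4.0]]  # 2 channels, 2 voxels each
w_gamma = [[1, 0, 0, 0], [0, 0, 1, 0]]
w_beta = [[0, 0, 0, 0], [0, 0, 0, 0]]
modulated = film(decoder_channels, train_emb,
                 w_gamma, [0.0, 0.0], w_beta, [1.0, 0.0])
```

In this sketch the decoder sees the same modulation interface in both phases; only the source of the embedding changes, which mirrors the training/inference split described in the abstract.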