A 3D Multi-Style Cross-Modality Segmentation Framework for Segmenting Vestibular Schwannoma and Cochlea

The crossMoDA2023 challenge aims to segment the vestibular schwannoma (VS, sub-divided into intra- and extra-meatal components) and cochlea regions in unlabeled hrT2 scans by leveraging labeled ceT1 scans. In this work, we propose a 3D multi-style cross-modality segmentation framework for the crossMoDA2023 challenge, consisting of a multi-style translation phase and a self-training segmentation phase. To handle the heterogeneous intensity distributions and varying image sizes of multi-institutional scans, we first apply min-max normalization, voxel size resampling, and center cropping to obtain fixed-size sub-volumes from the ceT1 and hrT2 scans for training. We then perform multi-style image translation to overcome the intensity distribution discrepancy between the unpaired multi-modal scans. Specifically, we design three different translation networks with 2D or 2.5D inputs to generate multi-style, realistic target-like volumes from the labeled ceT1 volumes. Finally, we perform self-training volumetric segmentation in the target domain, which employs the nnU-Net framework and an iterative self-training method based on pseudo-labels to train accurate segmentation models for the unlabeled target domain. On the crossMoDA2023 validation dataset, our method achieves mean DSC values of 72.78% and 80.64% and ASSD values of 5.85 mm and 0.25 mm for the VS tumor and cochlea regions, respectively. For the intra- and extra-meatal regions, it achieves DSC values of 59.77% and 77.14%, respectively.
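
As a rough illustration of two of the preprocessing steps named above (min-max normalization and center cropping) and of the DSC metric used for evaluation, the following NumPy sketch may be helpful. It is not the authors' implementation: the function names, the zero-padding behavior for volumes smaller than the crop, and the crop size in the usage lines are assumptions made for illustration, and voxel size resampling is omitted.

```python
import numpy as np


def min_max_normalize(volume, eps=1e-8):
    """Rescale voxel intensities to the range [0, 1]."""
    volume = volume.astype(np.float32)
    lo, hi = volume.min(), volume.max()
    # eps avoids division by zero for a constant volume.
    return (volume - lo) / (hi - lo + eps)


def center_crop(volume, size):
    """Extract a fixed-size sub-volume around the volume centre.

    Axes shorter than the requested size are zero-padded symmetrically
    (an assumption; the abstract does not say how such cases are handled).
    """
    out = np.zeros(size, dtype=volume.dtype)
    src, dst = [], []
    for dim, target in zip(volume.shape, size):
        if dim >= target:
            start = (dim - target) // 2
            src.append(slice(start, start + target))
            dst.append(slice(0, target))
        else:
            start = (target - dim) // 2
            src.append(slice(0, dim))
            dst.append(slice(start, start + dim))
    out[tuple(dst)] = volume[tuple(src)]
    return out


def dice(pred, ref):
    """Dice similarity coefficient (DSC) between two binary masks."""
    pred = pred.astype(bool)
    ref = ref.astype(bool)
    denom = pred.sum() + ref.sum()
    if denom == 0:
        return 1.0  # both masks empty: treated here as perfect agreement
    return 2.0 * np.logical_and(pred, ref).sum() / denom


# Usage with a synthetic volume; the crop size is a placeholder.
scan = np.random.rand(40, 300, 300)
sub_volume = center_crop(min_max_normalize(scan), (32, 256, 256))
```

The reported DSC values are percentages, so a value returned by `dice` would be multiplied by 100 for comparison.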
