Accurate, automated lesion segmentation in Positron Emission Tomography / Computed Tomography (PET/CT) imaging is essential for cancer diagnosis and therapy planning. This paper presents a Swin Transformer UNet 3D (SwinUNet3D) framework for lesion segmentation in fluorodeoxyglucose PET/CT (FDG-PET/CT) scans. By combining shifted-window self-attention with U-Net-style skip connections, the model captures both global context and fine anatomical detail. We evaluate SwinUNet3D on the AutoPET III FDG dataset against a baseline 3D U-Net. SwinUNet3D achieves a Dice score of 0.88 and an IoU of 0.78, surpassing the 3D U-Net (Dice 0.48, IoU 0.32) while also delivering faster inference. Qualitative analysis demonstrates improved detection of small and irregular lesions, reduced false positives, and more accurate PET/CT fusion. While the framework is currently limited to FDG scans and was trained with modest GPU resources, it establishes a strong foundation for future multi-tracer, multi-center evaluations and for benchmarking against other transformer-based architectures. Overall, SwinUNet3D is an efficient and robust approach to PET/CT lesion segmentation, advancing the integration of transformer-based models into oncology imaging workflows.
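As a concrete illustration of the architecture described above, the snippet below instantiates a SwinUNet3D-style network. The authors' implementation is not reproduced here; this is a minimal sketch assuming MONAI's SwinUNETR (a recent 1.x release whose constructor accepts an img_size argument) as a close architectural analogue, pairing a 3D shifted-window attention encoder with U-Net-style skip connections. The patch size, feature size, and channel counts are illustrative assumptions, not the paper's settings.

```python
# Minimal sketch, not the authors' code: a SwinUNet3D-style model built
# from MONAI's SwinUNETR, which combines a 3D shifted-window attention
# encoder with a U-Net-style skip-connected decoder.
import torch
from monai.networks.nets import SwinUNETR

model = SwinUNETR(
    img_size=(96, 96, 96),  # assumed training patch size (illustrative)
    in_channels=2,          # co-registered PET and CT stacked as channels
    out_channels=2,         # background vs. lesion
    feature_size=48,        # base embedding width (assumption)
    use_checkpoint=True,    # gradient checkpointing eases modest-GPU training
)

# One PET/CT patch in (batch, channels, depth, height, width) layout.
x = torch.randn(1, 2, 96, 96, 96)
with torch.no_grad():
    logits = model(x)       # per-voxel class logits, (1, 2, 96, 96, 96)
print(logits.shape)
```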
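The reported overlap scores follow the standard definitions Dice = 2|A∩B| / (|A| + |B|) and IoU = |A∩B| / |A∪B| for a predicted mask A and ground-truth mask B. For a single mask pair the two are related by IoU = Dice / (2 - Dice), so a Dice of 0.88 implies an IoU of about 0.79, consistent (up to averaging over cases) with the reported 0.78. A minimal sketch of both metrics follows; the smoothing epsilon is an illustrative choice.

```python
import torch

def dice_and_iou(pred: torch.Tensor, target: torch.Tensor, eps: float = 1e-6):
    """Dice and IoU for binary masks of any shape (eps avoids 0/0)."""
    pred, target = pred.bool(), target.bool()
    inter = (pred & target).sum().float()
    union = (pred | target).sum().float()
    dice = (2 * inter + eps) / (pred.sum() + target.sum() + eps)
    iou = (inter + eps) / (union + eps)
    return dice.item(), iou.item()

# Two overlapping 4x4x4 cubes: Dice = 54/128 ≈ 0.42, IoU = 27/101 ≈ 0.27,
# which satisfies IoU = Dice / (2 - Dice).
pred = torch.zeros(8, 8, 8, dtype=torch.bool)
gt = torch.zeros(8, 8, 8, dtype=torch.bool)
pred[2:6, 2:6, 2:6] = True
gt[3:7, 3:7, 3:7] = True
print(dice_and_iou(pred, gt))
```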