With the growing volume of CT examinations, there is increasing demand for automated tools such as organ segmentation, abnormality detection, and report generation to help radiologists manage their clinical workload. Multi-label classification of 3D chest CT scans remains a critical yet challenging problem due to the complex spatial relationships inherent in volumetric data and the wide variability of abnormalities. Existing methods based on 3D convolutional neural networks struggle to capture long-range dependencies, while Vision Transformers typically require extensive pre-training on large-scale, domain-specific datasets to perform competitively. In this work, we propose a 2.5D alternative: a graph-based framework that represents a 3D CT volume as a structured graph whose nodes are axial slice triplets processed through spectral graph convolution, enabling the model to reason over inter-slice dependencies while keeping computational complexity compatible with clinical deployment. Trained and evaluated on three datasets from independent institutions, our method achieves strong cross-dataset generalization and performs competitively against state-of-the-art visual encoders. We further conduct comprehensive ablation studies on aggregation strategies, edge-weighting schemes, and graph connectivity patterns. Finally, we demonstrate the broader applicability of our approach through transfer experiments on automated radiology report generation and abdominal CT data.
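To make the graph construction concrete, the following is a minimal sketch, not the paper's implementation: axial slice triplets become nodes of a chain graph, and a single GCN-style spectral graph convolution layer (symmetric normalized adjacency with self-loops) propagates information between neighboring triplets. The chain connectivity, feature dimensions, and layer choice here are illustrative assumptions; the paper ablates richer connectivity patterns and edge weightings.

```python
import torch
import torch.nn as nn


class SpectralGraphConv(nn.Module):
    """One GCN-style spectral graph convolution:
    H' = relu(D^{-1/2} (A + I) D^{-1/2} H W)."""

    def __init__(self, in_dim: int, out_dim: int):
        super().__init__()
        self.lin = nn.Linear(in_dim, out_dim)

    def forward(self, h: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        a_hat = adj + torch.eye(adj.size(0))        # add self-loops
        deg = a_hat.sum(dim=1)
        d_inv_sqrt = torch.diag(deg.pow(-0.5))
        a_norm = d_inv_sqrt @ a_hat @ d_inv_sqrt    # symmetric normalization
        return torch.relu(a_norm @ self.lin(h))


def volume_to_triplet_graph(num_slices: int, stride: int = 3) -> torch.Tensor:
    """Adjacency of a chain graph whose nodes are consecutive
    axial slice triplets (illustrative connectivity only)."""
    n_nodes = num_slices // stride
    adj = torch.zeros(n_nodes, n_nodes)
    for i in range(n_nodes - 1):                    # link adjacent triplets
        adj[i, i + 1] = adj[i + 1, i] = 1.0
    return adj


# Toy example: a 30-slice volume yields 10 triplet nodes; each node
# carries a hypothetical 64-dim feature from some 2D slice encoder.
adj = volume_to_triplet_graph(30)
feats = torch.randn(10, 64)
out = SpectralGraphConv(64, 32)(feats, adj)
print(out.shape)  # torch.Size([10, 32])
```

In practice the per-triplet features would come from a pretrained 2D encoder, and a readout (e.g. mean pooling over nodes) would feed the multi-label classification head.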