Accurate segmentation of colorectal polyps in colonoscopy images is crucial for the effective diagnosis and management of colorectal cancer (CRC). However, current deep learning-based methods rely primarily on fusing RGB information across multiple scales. This limits their accuracy in two ways: the RGB domain alone carries restricted information, and features become misaligned during multi-scale aggregation. To address these limitations, we propose the Polyp Segmentation Network with Shunted Transformer (PSTNet), a novel approach that integrates both the RGB and frequency-domain cues present in the images. PSTNet comprises three key modules: the Frequency Characterization Attention Module (FCAM), which extracts frequency cues and captures polyp characteristics; the Feature Supplementary Alignment Module (FSAM), which aligns semantic information and reduces misalignment noise; and the Cross Perception localization Module (CPM), which combines frequency cues with high-level semantics to achieve efficient polyp segmentation. Extensive experiments on challenging datasets show that PSTNet significantly improves polyp segmentation accuracy across various metrics and consistently outperforms state-of-the-art methods. Together, the integration of frequency-domain cues and the architectural design of PSTNet advance computer-assisted polyp segmentation and support more accurate diagnosis and management of CRC.
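To make the notion of a frequency-domain cue concrete, the following is a minimal sketch of how such a cue could be computed from a grayscale image patch. It is not the FCAM described above: the abstract does not specify which transform PSTNet uses, so the choice of an orthonormal 2D DCT-II, the 8x8 patch size, and the hypothetical helper names `dct2` and `high_frequency_ratio` are all assumptions made for illustration.

```python
import math


def dct2(block):
    """Orthonormal 2D DCT-II of a square patch given as a list of rows.

    Assumption: the DCT is used here only as one common frequency
    transform; the abstract does not say which transform PSTNet uses.
    """
    n = len(block)

    def alpha(k):
        # Scaling factors that make the transform orthonormal.
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    # basis[k][i] = alpha(k) * cos(pi * (2i + 1) * k / (2n))
    basis = [
        [alpha(k) * math.cos(math.pi * (2 * i + 1) * k / (2 * n)) for i in range(n)]
        for k in range(n)
    ]
    # The 2D transform is separable: apply the basis along rows, then columns.
    tmp = [
        [sum(basis[u][i] * block[i][j] for i in range(n)) for j in range(n)]
        for u in range(n)
    ]
    return [
        [sum(tmp[u][j] * basis[v][j] for j in range(n)) for v in range(n)]
        for u in range(n)
    ]


def high_frequency_ratio(block, cutoff=1):
    """Fraction of the patch energy held by coefficients with u + v >= cutoff.

    Hypothetical illustrative cue: values near 0 indicate a smooth patch,
    larger values indicate edges or texture.
    """
    coeffs = dct2(block)
    n = len(coeffs)
    total = sum(c * c for row in coeffs for c in row)
    if total == 0.0:
        return 0.0
    high = sum(
        coeffs[u][v] ** 2
        for u in range(n)
        for v in range(n)
        if u + v >= cutoff
    )
    return high / total


if __name__ == "__main__":
    flat = [[0.5] * 8 for _ in range(8)]               # uniform patch
    edge = [[0.0] * 4 + [1.0] * 4 for _ in range(8)]   # vertical step edge
    print("flat patch:", high_frequency_ratio(flat))
    print("edge patch:", high_frequency_ratio(edge))
```

In this sketch a uniform patch concentrates its energy in the DC coefficient, while a patch containing a sharp boundary spreads energy into higher-frequency coefficients. This is the kind of signal that RGB intensities alone do not expose directly, which is the motivation the abstract gives for combining the two domains.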