Bladder Vessel Segmentation using a Hybrid Attention-Convolution Framework

Urinary bladder cancer surveillance requires tracking tumor sites across repeated interventions, yet the deformable, hollow bladder lacks stable landmarks for orientation. While blood vessels visible during endoscopy offer a patient-specific "vascular fingerprint" for navigation, automated segmentation is hampered by imperfect endoscopic data: sparse labels, artifacts such as bubbles and variable lighting, continuous deformation, and mucosal folds that mimic vessels. State-of-the-art vessel segmentation methods often fail to address these domain-specific complexities. We introduce a Hybrid Attention-Convolution (HAC) architecture that combines a Transformer, which captures a global vessel-topology prior, with a CNN that learns a residual refinement map to recover thin-vessel details precisely. To prioritize structural connectivity, the Transformer is trained on optimized ground-truth data that exclude short and terminal branches. Furthermore, to address data scarcity, we employ physics-aware pretraining, a self-supervised strategy that applies clinically grounded augmentations to unlabeled data. Evaluated on the BlaVeS dataset, which consists of endoscopic video frames, our approach achieves high accuracy (0.94) and superior precision (0.61) and clDice (0.66) compared to state-of-the-art medical segmentation models. Crucially, our method suppresses false positives from mucosal folds that dynamically appear and vanish as the bladder fills and empties during surgery. Hence, HAC provides the structural stability required for clinical navigation.
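
The clDice score reported above rewards connectivity rather than raw pixel overlap. The abstract does not give the evaluation code, so the following is only a minimal sketch of the commonly used centerline-Dice definition: topology precision is the fraction of the predicted centerline lying inside the reference mask, topology sensitivity is the fraction of the reference centerline lying inside the predicted mask, and clDice is their harmonic mean. The function name `cl_dice`, the `eps` term, and the toy masks are illustrative assumptions. Skeletons are passed in explicitly; in practice they could be computed with a routine such as `skimage.morphology.skeletonize`.

```python
import numpy as np


def cl_dice(pred, gt, pred_skel, gt_skel, eps=1e-8):
    """Centerline Dice of two binary masks, given their skeletons (sketch)."""
    pred = np.asarray(pred, dtype=bool)
    gt = np.asarray(gt, dtype=bool)
    pred_skel = np.asarray(pred_skel, dtype=bool)
    gt_skel = np.asarray(gt_skel, dtype=bool)

    # Topology precision: share of the predicted centerline inside the reference mask.
    tprec = (pred_skel & gt).sum() / (pred_skel.sum() + eps)
    # Topology sensitivity: share of the reference centerline inside the predicted mask.
    tsens = (gt_skel & pred).sum() / (gt_skel.sum() + eps)
    # Harmonic mean of the two; eps (an assumption) guards against empty masks.
    return 2.0 * tprec * tsens / (tprec + tsens + eps)


# Toy example (hypothetical data): a 3-pixel-thick horizontal vessel.
gt = np.zeros((5, 8), dtype=bool)
gt[1:4, :] = True
gt_skel = np.zeros_like(gt)
gt_skel[2, :] = True  # centerline drawn by hand for this toy case

# The prediction misses columns 3-4, so the vessel is broken into two pieces.
pred = gt.copy()
pred[:, 3:5] = False
pred_skel = gt_skel.copy()
pred_skel[:, 3:5] = False

score_broken = cl_dice(pred, gt, pred_skel, gt_skel)
score_perfect = cl_dice(gt, gt, gt_skel, gt_skel)
```

In this toy case the predicted centerline lies entirely inside the reference vessel, but only 6 of the 8 reference centerline pixels are covered, so the break lowers the score to roughly 6/7 even though no false positives were introduced.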
