Breast lesion segmentation in ultrasound (US) videos is essential for diagnosing and treating axillary lymph node metastasis. However, the lack of a well-established, large-scale US video dataset with high-quality annotations has posed a persistent challenge for the research community. To address this, we curated a US video breast lesion segmentation dataset comprising 572 videos and 34,300 annotated frames, covering a wide range of realistic clinical scenarios. Furthermore, we propose a novel frequency and localization feature aggregation network (FLA-Net) that learns temporal features in the frequency domain and additionally predicts lesion locations to assist breast lesion segmentation. We also devise a localization-based contrastive loss that reduces the distance between lesion locations in neighboring frames of the same video and enlarges the distance between lesion locations in frames from different US videos. Experiments on our annotated dataset and two public video polyp segmentation datasets demonstrate that FLA-Net achieves state-of-the-art performance in both US video breast lesion segmentation and video polyp segmentation while significantly reducing time and space complexity. Our model and dataset are available at https://github.com/jhl-Det/FLA-Net.
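To make the idea behind the localization-based contrastive loss concrete, the following is a minimal sketch and not the formulation used in FLA-Net. It assumes each frame's predicted lesion location is an (x, y) center normalized to [0, 1], and it substitutes a standard triplet-style hinge with a hypothetical `margin` parameter for the actual loss; the function names are illustrative only.

```python
import math


def location_distance(a, b):
    """Euclidean distance between two (x, y) lesion centers."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def location_contrastive_loss(anchor, neighbor, other, margin=0.5):
    """Triplet-style hinge on predicted lesion locations (assumed form).

    anchor   -- location predicted for the current frame
    neighbor -- location predicted for a neighboring frame of the same video
    other    -- location predicted for a frame from a different video
    margin   -- hypothetical margin; not a value taken from the paper
    """
    pos = location_distance(anchor, neighbor)  # should become small
    neg = location_distance(anchor, other)     # should become large
    return max(0.0, pos - neg + margin)


# Example: the neighboring frame is close to the anchor, the other video is not.
loss = location_contrastive_loss((0.50, 0.50), (0.52, 0.49), (0.20, 0.80))
```

The loss falls to zero once the location from the other video is at least `margin` farther from the anchor than the neighboring frame's location, which mirrors the pull-together and push-apart behavior described above.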