Breast ultrasound videos contain richer information than static ultrasound images, so video models are better suited to this diagnosis task. However, ultrasound video datasets are much harder to collect. In this paper, we explore whether a static image dataset can be used to improve ultrasound video classification. To this end, we propose KGA-Net and a coherence loss. KGA-Net is trained on both video clips and static images. The coherence loss uses feature centers computed from the static images to guide frame attention in the video model. KGA-Net improves performance on the public BUSV dataset by a large margin. Visualizations of the frame attention support the explainability of our method. The code and model weights will be made publicly available.
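The abstract does not give the formula for the coherence loss, so the following is only a minimal sketch of one plausible form, not the paper's definition. It assumes that a class center is the mean of static-image features, that each video frame is scored by cosine similarity to that center, and that the loss is the mean absolute difference between the frame attention weights and a softmax over those similarities. The function names (`class_center`, `coherence_loss`) and the `temperature` parameter are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two feature vectors (lists of floats)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-8)


def softmax(scores, temperature=1.0):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp((s - peak) / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def class_center(image_features):
    """Assumed center: element-wise mean of static-image features of one class."""
    count = len(image_features)
    return [sum(column) / count for column in zip(*image_features)]


def coherence_loss(frame_features, attention, center, temperature=0.1):
    """Assumed loss: mean absolute gap between the frame attention weights
    and a target distribution derived from frame-to-center similarity."""
    sims = [cosine(f, center) for f in frame_features]
    target = softmax(sims, temperature)
    gaps = [abs(a - t) for a, t in zip(attention, target)]
    return sum(gaps) / len(gaps)


if __name__ == "__main__":
    # Toy 2-D features; real features would come from the network backbone.
    center = class_center([[1.0, 0.0], [3.0, 2.0]])
    frames = [[2.0, 1.0], [-1.0, 0.5], [0.0, 1.0]]
    uniform_attention = [1.0 / 3.0] * 3
    print(coherence_loss(frames, uniform_attention, center))
```

Under these assumptions, the loss is zero when the attention weights already match the similarity-derived target, and it grows as attention drifts toward frames that lie far from the class center.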