The Transformer and its variants have been widely used for medical image segmentation. However, the large number of parameters and the heavy computational load of these models make them unsuitable for mobile health applications. To address this issue, we propose a more efficient approach, the Efficient Group Enhanced UNet (EGE-UNet). We incorporate a Group multi-axis Hadamard Product Attention module (GHPA) and a Group Aggregation Bridge module (GAB) in a lightweight manner. The GHPA groups input features and applies the Hadamard Product Attention mechanism (HPA) along different axes to extract pathological information from diverse perspectives. The GAB fuses multi-scale information by grouping low-level features, high-level features, and a mask generated by the decoder at each stage. Comprehensive experiments on the ISIC2017 and ISIC2018 datasets demonstrate that EGE-UNet outperforms existing state-of-the-art methods. In short, compared to TransFuse, our model achieves superior segmentation performance while reducing parameter and computation costs by 494x and 160x, respectively. Moreover, to the best of our knowledge, this is the first model with a parameter count limited to just 50KB. Our code is available at https://github.com/JCruan519/EGE-UNet.
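The grouping idea behind GHPA can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `ghpa_sketch`, the four-way channel split, the weight shapes, and the choice to pass the last group through unchanged are all assumptions made for illustration, and the weights here stand in for parameters that would be learned in a real model. Each of the first three groups is multiplied element-wise (Hadamard product) by a weight tensor that varies along a different pair of axes.

```python
import numpy as np

def ghpa_sketch(x, w_hw, w_ch, w_cw):
    """Illustrative grouped multi-axis Hadamard product attention.

    x    : feature map of shape (C, H, W), with C divisible by 4
    w_hw : weights of shape (1, H, W), varying over the height-width plane
    w_ch : weights of shape (C/4, H, 1), varying over channel and height
    w_cw : weights of shape (C/4, 1, W), varying over channel and width
    """
    # Split the channels into four equal groups (assumed grouping).
    x_hw, x_ch, x_cw, x_rest = np.split(x, 4, axis=0)

    # Hadamard (element-wise) product with a weight tensor per group;
    # broadcasting expands each weight along its singleton axis.
    y_hw = x_hw * w_hw
    y_ch = x_ch * w_ch
    y_cw = x_cw * w_cw

    # Assumption: the last group is left unchanged in this sketch.
    return np.concatenate([y_hw, y_ch, y_cw, x_rest], axis=0)


# Hypothetical usage with random stand-in weights.
rng = np.random.default_rng(0)
C, H, W = 8, 4, 5
g = C // 4
features = rng.standard_normal((C, H, W))
out = ghpa_sketch(
    features,
    rng.standard_normal((1, H, W)),
    rng.standard_normal((g, H, 1)),
    rng.standard_normal((g, 1, W)),
)
```

Because each weight tensor is small and is applied by broadcasting, this kind of attention adds few parameters compared with a dense attention matrix, which is consistent with the lightweight goal stated above.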