Accurate anatomical landmark detection in medical images is crucial for clinical applications. Existing methods often struggle to balance global context against computational efficiency, particularly on high-resolution images. This paper introduces the Hybrid Attention Network (HAN), a novel hybrid architecture that integrates CNNs and Transformers. Its core is the BiFormer module, which uses Bi-Level Routing Attention (BRA) to attend efficiently to relevant image regions. Combined with Convolutional Attention Blocks (CAB) enhanced by CBAM, this enables precise local feature refinement guided by global context. A Feature Fusion Correction Module (FFCM) integrates multi-scale features, mitigating resolution loss. The model is trained with deep supervision, applying an MSE loss to heatmaps at multiple resolutions. Experiments on five diverse datasets demonstrate state-of-the-art performance, surpassing existing methods in accuracy, robustness, and efficiency. HAN provides a promising solution for accurate and efficient anatomical landmark detection in complex medical images. Our code and data will be released soon at: \url{https://github.com/MIRACLE-Center/}.
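As a rough illustration of deep supervision with an MSE loss on multi-resolution heatmaps, the sketch below sums per-scale MSE terms between predicted and target heatmaps. It is not the authors' implementation. The Gaussian target shape, the fixed `sigma`, the uniform per-scale weights, and the function names (`gaussian_heatmap`, `deep_supervision_mse`, `decode_landmark`) are all assumptions made for this example, since the abstract does not specify them.

```python
import numpy as np


def gaussian_heatmap(height, width, cx, cy, sigma):
    """Render a 2D Gaussian peaked at (cx, cy) on a height x width grid.

    Assumption: Gaussian targets are a common choice for heatmap
    regression; the abstract does not state the target shape.
    """
    ys, xs = np.mgrid[0:height, 0:width]
    return np.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / (2.0 * sigma ** 2))


def deep_supervision_mse(predictions, landmark_xy, full_size, sigma=2.0, weights=None):
    """Weighted sum of per-scale MSE between predicted and target heatmaps.

    predictions: list of 2D arrays, one per output resolution
    landmark_xy: (x, y) landmark position in full-resolution pixels
    full_size:   (H, W) of the full-resolution image
    weights:     per-scale loss weights (hypothetical; uniform by default)
    """
    full_h, full_w = full_size
    x, y = landmark_xy
    if weights is None:
        weights = [1.0] * len(predictions)

    total = 0.0
    for pred, weight in zip(predictions, weights):
        h, w = pred.shape
        # Rescale the landmark to this output's resolution before rendering.
        target = gaussian_heatmap(h, w, x * w / full_w, y * h / full_h, sigma)
        total += weight * np.mean((pred - target) ** 2)
    return float(total)


def decode_landmark(heatmap):
    """Return the (x, y) position of the heatmap maximum."""
    row, col = np.unravel_index(np.argmax(heatmap), heatmap.shape)
    return int(col), int(row)
```

In a training setup, `predictions` would be the heatmaps emitted at each supervised decoder stage, and the summed loss would be backpropagated through all of them. At inference, only the highest-resolution heatmap would typically be decoded.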