Heatmap-based anatomical landmark detection still faces two unresolved challenges: 1) the inability to accurately evaluate the heatmap distribution; and 2) the inability to effectively exploit global spatial structure information. To address the first challenge, we propose a novel position-aware and sample-aware central loss. Specifically, our central loss absorbs position information, enabling accurate evaluation of the heatmap distribution. Moreover, the central loss is sample-aware: it adaptively distinguishes easy from hard samples and makes the model focus on hard samples, while also addressing the extreme imbalance between landmarks and non-landmarks. To address the second challenge, we propose a Coordinated Transformer, called CoorTransformer, which establishes long-range dependencies under the guidance of landmark coordinate information, focusing attention on the sparse landmarks while taking advantage of global spatial structure. Furthermore, CoorTransformer speeds up convergence, effectively avoiding the difficulty that Transformers have in converging during sparse representation learning. Using CoorTransformer and the central loss, we propose a generalized detection model that can handle various scenarios, inherently exploiting the underlying relationships between landmarks and incorporating rich structural knowledge around the target landmarks. We analyze and evaluate CoorTransformer and the central loss on three challenging landmark detection tasks. The experimental results show that CoorTransformer outperforms state-of-the-art methods, and that the central loss significantly improves model performance (p-values < 0.05).
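To make the setting concrete, the following is a minimal sketch of generic heatmap-based landmark detection, assuming a Gaussian target heatmap and argmax decoding. All names (`gaussian_heatmap`, `decode_landmark`, `hard_sample_weighted_mse`) and parameters (`sigma`, `gamma`) are hypothetical and chosen for illustration. The weighted loss is a simple focal-style stand-in that up-weights poorly predicted pixels; it is not the central loss proposed here, whose exact form the abstract does not give.

```python
import numpy as np


def gaussian_heatmap(shape, center, sigma=2.0):
    """Target heatmap: a 2D Gaussian peaked at the landmark (row, col)."""
    rows, cols = np.indices(shape)
    d2 = (rows - center[0]) ** 2 + (cols - center[1]) ** 2
    return np.exp(-d2 / (2.0 * sigma ** 2))


def decode_landmark(heatmap):
    """Recover the landmark as the (row, col) of the heatmap maximum."""
    idx = np.unravel_index(np.argmax(heatmap), heatmap.shape)
    return tuple(int(i) for i in idx)


def hard_sample_weighted_mse(pred, target, gamma=2.0):
    """Illustrative focal-style loss (an assumption, NOT the paper's central loss).

    Each pixel's squared error is weighted by |error|**gamma, so pixels the
    model predicts badly (hard samples) dominate the average.
    """
    err = np.abs(pred - target)
    return float(np.mean((err ** gamma) * err ** 2))


# One landmark on a 64x64 image.
target = gaussian_heatmap((64, 64), center=(20, 45))
landmark = decode_landmark(target)

# Fraction of pixels that are strongly "landmark": a rough view of the
# landmark / non-landmark imbalance that motivates a sample-aware loss.
foreground_fraction = float(np.mean(target > 0.5))

# Loss of an all-zero prediction versus a perfect prediction.
loss_blank = hard_sample_weighted_mse(np.zeros_like(target), target)
loss_perfect = hard_sample_weighted_mse(target, target)
```

In this sketch only a tiny fraction of pixels lies near the landmark, so an unweighted pixel-wise loss is dominated by background; any weighting scheme that emphasizes hard pixels is one common way to counter that, under the assumptions stated above.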