DenseSeg: Joint Learning for Semantic Segmentation and Landmark Detection Using Dense Image-to-Shape Representation

Purpose: Semantic segmentation and landmark detection are fundamental tasks in medical image processing, facilitating further analysis of anatomical objects. Although deep learning-based pixel-wise classification has set a new state of the art for segmentation, it falls short in landmark detection, a strength of shape-based approaches. Methods: In this work, we propose a dense image-to-shape representation that enables the joint learning of landmarks and semantic segmentation by employing a fully convolutional architecture. Because the representation encodes anatomical correspondences, our method allows the extraction of arbitrary landmarks. We benchmark our method against the state of the art for semantic segmentation (nnUNet), a shape-based approach employing geometric deep learning, and a convolutional neural network-based method for landmark detection. Results: We evaluate our method on two medical datasets: one common benchmark featuring the lungs, heart, and clavicle from thorax X-rays, and another with 17 different bones in the paediatric wrist. While our method is on par with the landmark detection baseline in the thorax setting (error in mm of $2.6\pm0.9$ vs. $2.7\pm0.9$), it substantially surpasses it in the more complex wrist setting ($1.1\pm0.6$ vs. $1.9\pm0.5$). Conclusion: We demonstrate that a dense geometric shape representation is beneficial for challenging landmark detection tasks and outperforms the previous state of the art based on heatmap regression. Moreover, it does not require explicit training on the landmarks themselves, so new landmarks can be added without retraining.
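
The abstract does not state how a landmark is read out of the dense representation, so the following is only a minimal sketch under stated assumptions: the network is assumed to predict, for every pixel, a 2D coordinate in a canonical shape space together with a foreground mask, and a landmark is assumed to be defined by its coordinate in that same space. The names `extract_landmark`, `coord_map`, and `mask`, and the nearest-coordinate lookup itself, are hypothetical illustrations and not the authors' implementation.

```python
import numpy as np


def extract_landmark(coord_map, mask, target):
    """Return (row, col) of the foreground pixel whose predicted shape
    coordinate lies closest to `target`.

    coord_map: (H, W, 2) array, assumed per-pixel canonical shape coordinates.
    mask:      (H, W) boolean array, True where the pixel belongs to the object.
    target:    length-2 sequence, landmark position in canonical shape space.
    """
    coord_map = np.asarray(coord_map, dtype=float)
    mask = np.asarray(mask, dtype=bool)
    target = np.asarray(target, dtype=float)

    # Euclidean distance in shape space between every pixel and the landmark.
    dist = np.linalg.norm(coord_map - target, axis=-1)
    # Background pixels can never be selected.
    dist = np.where(mask, dist, np.inf)
    if not np.isfinite(dist).any():
        raise ValueError("mask contains no foreground pixels")

    row, col = np.unravel_index(np.argmin(dist), dist.shape)
    return int(row), int(col)


# Toy 5x5 "prediction": the shape coordinates are simply the normalised
# pixel grid, and the object occupies the central 3x3 block.
rows, cols = np.meshgrid(np.arange(5), np.arange(5), indexing="ij")
toy_coords = np.stack([cols / 4.0, rows / 4.0], axis=-1)  # (u, v) per pixel
toy_mask = np.zeros((5, 5), dtype=bool)
toy_mask[1:4, 1:4] = True
```

Under these assumptions, adding a landmark only means supplying another `target` coordinate, for example `extract_landmark(toy_coords, toy_mask, (0.75, 0.75))`; the predicted map is reused unchanged, which mirrors the claim that new landmarks need no retraining.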
