Restrictive Hierarchical Semantic Segmentation for Stratified Tooth Layer Detection

An accurate understanding of anatomical structure is essential for reliably staging certain dental diseases. One way to introduce this knowledge into semantic segmentation models is through hierarchy-aware methods. However, existing hierarchy-aware segmentation methods largely encode anatomical structure through the loss function alone, which provides only weak, indirect supervision. We introduce a general framework that embeds an explicit anatomical hierarchy into semantic segmentation by coupling a recurrent, level-wise prediction scheme with restrictive output heads and top-down feature conditioning. At each depth of the class tree, the backbone is re-run on the original image concatenated with the logits from the previous level. Child-class features are conditioned on their parent-class probabilities via Feature-wise Linear Modulation (FiLM), which modulates the child feature space for fine-grained detection. A probabilistic composition rule enforces consistency between parent and descendant classes. The hierarchical loss combines per-level class-weighted Dice and cross-entropy losses with a consistency term that encourages each parent prediction to equal the sum of its children. We validate our approach on our proposed dataset, TL-pano, which contains 194 panoramic radiographs with dense instance and semantic segmentation annotations of tooth layers and alveolar bone. Using UNet and HRNet as donor models in a 5-fold cross-validation scheme, the hierarchical variants consistently increase IoU, Dice, and recall, particularly for fine-grained anatomies, and produce more anatomically coherent masks. However, the hierarchical variants also favoured recall over precision, implying more false positives. The results demonstrate that explicit hierarchical structuring improves both performance and clinical plausibility, especially in low-data dental imaging regimes.
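The abstract does not spell out the probabilistic composition rule or the consistency term, so the following is only a minimal per-pixel sketch of one common way such a rule can be written, not the authors' implementation. It assumes each child probability is the parent probability multiplied by a softmax over that parent's children, and that the consistency term is the absolute gap between a parent probability and the sum of its children. The class names and the helpers `softmax`, `compose_children`, and `consistency_penalty` are hypothetical.

```python
import math

# Hypothetical class names, for illustration only; the abstract mentions
# tooth layers and alveolar bone but does not list the label set.
LEVEL1_CLASSES = ["background", "tooth", "bone"]
TOOTH_CHILDREN = ["enamel", "dentin", "pulp"]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def compose_children(parent_prob, child_logits):
    """Assumed composition rule: P(child) = P(parent) * P(child | parent).

    The conditional is a softmax over the parent's children, so the
    composed child probabilities sum to the parent probability.
    """
    conditional = softmax(child_logits)
    return [parent_prob * c for c in conditional]


def consistency_penalty(parent_prob, child_probs):
    """Assumed consistency term: |P(parent) - sum of P(children)|."""
    return abs(parent_prob - sum(child_probs))


# Level 1: coarse prediction for a single pixel.
level1 = dict(zip(LEVEL1_CLASSES, softmax([0.2, 1.5, -0.3])))

# Level 2: the tooth probability is split among its child layers.
children = compose_children(level1["tooth"], [2.0, 0.5, -1.0])
level2 = dict(zip(TOOTH_CHILDREN, children))

# Composed outputs are consistent by construction (up to rounding).
composed_gap = consistency_penalty(level1["tooth"], children)

# Independently predicted children need not agree with the parent,
# which is the case a consistency loss would penalise.
independent_gap = consistency_penalty(0.6, [0.3, 0.2, 0.3])
```

Under these assumptions, composed predictions incur essentially no penalty, while heads that predict children independently of their parent can drift from it, which is the situation a consistency term is meant to correct.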
