We present Connected-Component (CC)-Metrics, a novel semantic segmentation evaluation protocol that aligns existing semantic segmentation metrics with a multi-instance detection scenario in which each connected component matters. We motivate this setup with the common medical scenario of semantic metastasis segmentation in full-body PET/CT. We show that existing semantic segmentation metrics are biased towards larger connected components, which contradicts the clinical assessment of scans, in which tumor size and clinical relevance are uncorrelated. To rebalance existing segmentation metrics, we propose to evaluate them on a per-component basis, thus giving each tumor the same weight irrespective of its size. To match predictions to ground-truth segments, we employ a proximity-based matching criterion and evaluate common metrics locally at the component of interest. Using this approach, we break free of the biases that large metastases introduce into overlap-based metrics such as Dice or Surface Dice. CC-Metrics also improves distance-based metrics such as Hausdorff Distances, which are uninformative for small changes that do not influence the maximum or 95th percentile, and avoids the pitfalls of directly combining counting-based metrics with overlap-based metrics, as is done in Panoptic Quality.
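The per-component evaluation described above can be illustrated with a minimal sketch. The abstract does not specify how the proximity-based matching is computed, so the sketch assumes one plausible reading: every voxel is assigned to its nearest ground-truth component through a Euclidean distance transform, and Dice is then computed inside each resulting region and averaged. The function names `dice` and `per_component_dice` and the toy arrays are hypothetical and serve illustration only; this is not the authors' reference implementation.

```python
import numpy as np
from scipy import ndimage


def dice(pred, gt):
    """Plain Dice score for two boolean masks; 1.0 when both are empty."""
    denom = pred.sum() + gt.sum()
    if denom == 0:
        return 1.0
    return 2.0 * np.logical_and(pred, gt).sum() / denom


def per_component_dice(pred, gt):
    """Average Dice over ground-truth connected components.

    Assumption: "proximity-based matching" is realized by assigning each
    voxel to its nearest ground-truth component.
    """
    pred = np.asarray(pred, dtype=bool)
    gt = np.asarray(gt, dtype=bool)

    # Label ground-truth connected components (default: face connectivity).
    labels, n_components = ndimage.label(gt)
    if n_components == 0:
        # No ground-truth component to match against: fall back to plain Dice.
        return float(dice(pred, gt))

    # For every voxel, find the coordinates of the nearest ground-truth voxel.
    nearest = ndimage.distance_transform_edt(
        ~gt, return_distances=False, return_indices=True
    )
    # Look up the component label at those coordinates to get one region
    # per ground-truth component, covering the whole image.
    regions = labels[tuple(nearest)]

    # Evaluate Dice locally in each region; every component weighs the same.
    scores = [
        dice(pred[regions == k], gt[regions == k])
        for k in range(1, n_components + 1)
    ]
    return float(np.mean(scores))


# Toy example: one large lesion (25 pixels) and one single-pixel lesion.
gt = np.zeros((10, 10), dtype=bool)
gt[0:5, 0:5] = True
gt[8, 8] = True

# The prediction segments the large lesion perfectly but misses the small one.
pred = np.zeros((10, 10), dtype=bool)
pred[0:5, 0:5] = True

global_score = dice(pred, gt)
component_score = per_component_dice(pred, gt)
```

In this toy case the global Dice stays close to 1 even though one of the two lesions is missed entirely, while the per-component average drops to 0.5, which reflects the size bias the abstract describes.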