Concept shift is a prevalent problem in natural tasks like medical image segmentation, where samples usually come from different subpopulations with varying correlations between features and labels. One common type of concept shift in medical image segmentation is the "information imbalance" between label-sparse samples with few (if any) segmentation labels and label-dense samples with plentiful labeled pixels. Existing distributionally robust algorithms have focused on adaptively truncating or down-weighting the "less informative" (i.e., label-sparse in our context) samples. To exploit the data features of label-sparse samples more efficiently, we propose an adaptively weighted online optimization algorithm, AdaWAC, that incorporates data augmentation consistency regularization into sample reweighting. Our method introduces a set of trainable weights that balance the supervised loss and the unsupervised consistency regularization of each sample separately. At the saddle point of the underlying objective, the weights assign label-dense samples to the supervised loss and label-sparse samples to the unsupervised consistency regularization. We provide a convergence guarantee by recasting the optimization as online mirror descent on a saddle point problem. Our empirical results demonstrate that AdaWAC not only enhances segmentation performance and sample efficiency but also improves robustness to concept shift on various medical image segmentation tasks with different UNet-style backbones.
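The abstract does not spell out the objective or the weight update, so the following is only an illustrative sketch of how per-sample trainable weights could behave under mirror ascent. It assumes each sample carries a single weight `beta` on its supervised loss (with `1 - beta` on its consistency term) and that the weights are updated by an exponentiated-gradient step, which is the entropic form of mirror ascent on a two-point simplex. The function names, the step size `lr`, and the loss values are hypothetical and chosen only for illustration; the losses are held fixed here, whereas in training they would change with the model.

```python
import math


def update_weight(beta, sup_loss, cons_loss, lr=1.0):
    """One exponentiated-gradient step on a two-point simplex.

    beta is the weight on the supervised loss; the consistency term
    implicitly receives 1 - beta. The weights are maximized, so each
    coordinate grows multiplicatively with its own loss.
    """
    w_sup = beta * math.exp(lr * sup_loss)
    w_cons = (1.0 - beta) * math.exp(lr * cons_loss)
    return w_sup / (w_sup + w_cons)  # renormalize back onto the simplex


def weighted_loss(beta, sup_loss, cons_loss):
    """Per-sample objective: convex combination of the two loss terms."""
    return beta * sup_loss + (1.0 - beta) * cons_loss


# Made-up (supervised loss, consistency loss) pairs. The assumption is that
# a label-sparse sample has a small supervised loss because predicting
# "background everywhere" is nearly correct for it.
samples = {
    "label_dense": (0.9, 0.2),
    "label_sparse": (0.05, 0.4),
}

weights = {name: 0.5 for name in samples}  # start from uniform weights
for _ in range(50):
    for name, (sup, cons) in samples.items():
        weights[name] = update_weight(weights[name], sup, cons)

for name, beta in weights.items():
    print(name, "weight on supervised loss:", round(beta, 4))
```

Under these assumed losses, the weight of the label-dense sample drifts toward the supervised loss and the weight of the label-sparse sample drifts toward the consistency term, which mirrors the saddle-point behavior described above. A full training loop would alternate such a weight step with a gradient step on the model parameters using `weighted_loss`.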