Hyperbolic Active Learning for Semantic Segmentation under Domain Shift

For semantic segmentation (SS) under domain shift, active learning (AL) acquisition strategies based on image regions and pseudo-labels are state-of-the-art (SoA). The presence of diverse pseudo-labels within a region identifies pixels lying between different classes, which makes for a label-efficient data acquisition strategy. However, by design, pseudo-label variation selects only the contours of classes, which limits the final AL performance. We approach AL for SS in the Poincaré hyperbolic ball model for the first time and leverage the variation of the radii of pixel embeddings within regions as a novel data acquisition strategy. This stems from a novel geometric property of a hyperbolic space trained without enforced hierarchies, which we demonstrate experimentally. Namely, classes are mapped into compact hyperbolic areas with comparable intra-class radius variance, as the model places classes of increasing explainable difficulty in denser hyperbolic areas, i.e., closer to the edge of the Poincaré ball. The variation of pixel embedding radii identifies class contours well, but it also selects a few peculiar intra-class details, which boosts the final performance. Our proposed HALO (Hyperbolic Active Learning Optimization) surpasses supervised learning performance for the first time in AL for SS under domain shift while using only a small portion of the labels (i.e., 1%). The extensive experimental analysis is based on two established benchmarks, i.e., GTAV $\rightarrow$ Cityscapes and SYNTHIA $\rightarrow$ Cityscapes, where we set a new SoA. The code will be released.
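To make the acquisition idea above concrete, the following is a minimal illustrative sketch, not the HALO implementation. It assumes that pixel embeddings already lie inside a Poincaré ball of curvature $-c$, takes the hyperbolic radius to be the distance from the origin, $(2/\sqrt{c})\,\mathrm{artanh}(\sqrt{c}\,\lVert x \rVert)$, and scores each pixel by the variance of that radius over a $k \times k$ window. The function names, the choice of variance as the measure of "variation", the window size, and the default $c = 1$ are all assumptions made for illustration.

```python
import numpy as np


def hyperbolic_radius(emb, c=1.0, eps=1e-6):
    """Distance from the origin for points in a Poincare ball of curvature -c.

    emb: array of shape (H, W, D), one embedding per pixel (assumed layout).
    """
    norm = np.linalg.norm(emb, axis=-1)
    # Clip so the artanh argument stays strictly below 1 (inside the ball).
    scaled = np.clip(np.sqrt(c) * norm, 0.0, 1.0 - eps)
    return (2.0 / np.sqrt(c)) * np.arctanh(scaled)


def region_radius_variance(radius, k=3):
    """Variance of the radius inside each k x k window (k odd, edge-padded).

    Returns an array with the same (H, W) shape as `radius`.
    """
    pad = k // 2
    padded = np.pad(radius, pad, mode="edge")
    windows = np.lib.stride_tricks.sliding_window_view(padded, (k, k))
    return windows.var(axis=(-2, -1))


def select_top_pixels(score, budget):
    """Return the (row, col) indices of the `budget` highest-scoring pixels."""
    flat = np.argsort(score, axis=None)[::-1][:budget]
    return np.stack(np.unravel_index(flat, score.shape), axis=1)


if __name__ == "__main__":
    # Toy map: left half near the origin, right half near the ball edge.
    emb = np.zeros((4, 6, 2))
    emb[:, :3, 0] = 0.2
    emb[:, 3:, 0] = 0.8
    score = region_radius_variance(hyperbolic_radius(emb), k=3)
    print(select_top_pixels(score, budget=8))
```

In this toy map the score is non-zero only in the two columns that straddle the change in radius, so those are the pixels a fixed labeling budget would pick first. In a trained model, windows containing unusual intra-class radii would also receive a non-zero score, which is the behavior the abstract attributes to radius variation.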
