Object counting and localization problems are commonly addressed with point-supervised learning, which allows the use of less labor-intensive point annotations. Learning from point annotations is challenging, however, due to the high imbalance between the sets of annotated and unannotated pixels; this is often mitigated by Gaussian smoothing of the point annotations and by focal loss. These approaches nevertheless still focus on the pixels in the immediate vicinity of the point annotations and exploit the rest of the data only indirectly. In this work, we propose a novel approach termed CeDiRNet for point-supervised learning that uses a dense regression of directions pointing towards the nearest object center, i.e., center-directions. This provides stronger support for each center point, since many surrounding pixels point towards the object center. We propose a formulation of center-directions that splits the problem into a domain-specific dense regression of center-directions and a final localization task performed by a small, lightweight, and domain-agnostic localization network that can be trained on synthetic data completely independent of the target domain. We demonstrate the performance of the proposed method on six different datasets for object counting and localization, and show that it outperforms existing state-of-the-art methods. The code is available on GitHub at https://github.com/vicoslab/CeDiRNet.git.
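To make the center-direction idea concrete, the sketch below shows one plausible way to build the dense regression targets from point annotations: for every pixel, the unit vector (as cosine and sine maps) pointing towards its nearest annotated object center. This is a hypothetical illustration of the general concept, not the authors' exact formulation; the function name and (y, x) point convention are assumptions.

```python
import numpy as np

def center_direction_targets(points, height, width):
    """Build dense regression targets: for each pixel, the unit
    direction (cos, sin) towards its nearest annotated center.

    Hypothetical helper illustrating the center-direction concept;
    `points` is an (N, 2) list of (y, x) center annotations.
    """
    ys, xs = np.mgrid[0:height, 0:width]
    pts = np.asarray(points, dtype=np.float64)
    # Offsets from every pixel to every center: shape (N, H, W).
    dy = pts[:, 0, None, None] - ys[None]
    dx = pts[:, 1, None, None] - xs[None]
    dist = np.hypot(dy, dx)
    # Index of the closest center for each pixel: shape (H, W).
    nearest = dist.argmin(axis=0)
    vy = np.take_along_axis(dy, nearest[None], axis=0)[0]
    vx = np.take_along_axis(dx, nearest[None], axis=0)[0]
    norm = np.hypot(vy, vx)
    norm[norm == 0] = 1.0  # avoid division by zero at the centers
    # Cosine and sine maps of the angle towards the nearest center.
    return vx / norm, vy / norm
```

Every pixel thus contributes a training signal pointing at an object center, which is the source of the "greater support" claimed in the abstract: the center location is implied by many surrounding pixels rather than only by the pixels in its immediate vicinity.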