The class-agnostic counting (CAC) problem has attracted increasing attention recently due to its wide societal applications and considerable challenges. To count objects of different categories, existing approaches rely on user-provided exemplars, which are hard to obtain and limit their generality. In this paper, we aim to enable the framework to identify exemplars adaptively within the whole image. We develop a zero-shot Generalized Counting Network (GCNet), which uses a pseudo-Siamese structure to automatically and effectively learn pseudo-exemplar clues from inherent repetition patterns. In addition, we present a weakly-supervised scheme that reduces the burden of the laborious density-map annotations required by all contemporary CAC models, allowing GCNet to be trained end-to-end using count-level supervisory signals. Without being given any spatial location hints, GCNet adaptively captures them through a carefully designed self-similarity learning strategy. Extensive experiments and ablation studies on the prevailing FSC147 benchmark for zero-shot CAC demonstrate the superiority of GCNet. It performs on par with existing exemplar-dependent methods and shows strong cross-dataset generality on crowd-specific datasets, e.g., ShanghaiTech Part A, ShanghaiTech Part B, and UCF_QNRF.
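To make the two central ideas concrete, the following is a minimal sketch, not GCNet's actual implementation. It assumes that count-level supervision compares the sum of a predicted density map with the ground-truth count, and that self-similarity is measured as pairwise cosine similarity between per-location feature vectors. The function names `count_level_loss`, `cosine`, and `self_similarity` are hypothetical and chosen only for illustration.

```python
import math


def count_level_loss(pred_density, true_count):
    """Absolute error between the summed density map and the true count.

    Assumption: the predicted count is the sum over all density-map cells,
    so no per-object location annotation is needed for this loss.
    """
    pred_count = sum(sum(row) for row in pred_density)
    return abs(pred_count - true_count)


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def self_similarity(features):
    """Pairwise cosine similarity between per-location feature vectors.

    Locations that belong to a repeated pattern are expected to score
    highly against each other, which is the intuition behind learning
    pseudo-exemplar clues from repetition.
    """
    return [[cosine(u, v) for v in features] for u in features]


# Toy inputs (made up for illustration): a 2x2 density map and three
# feature vectors, the first two of which describe the same pattern.
toy_density = [[0.5, 0.5], [1.0, 0.0]]
toy_features = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]

loss = count_level_loss(toy_density, true_count=3)
similarity = self_similarity(toy_features)
```

In this toy setup the loss depends only on the total count, and the similarity matrix groups the two repeated feature vectors together, which is the kind of signal an exemplar-free counter could exploit.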