Discovering fine-grained categories from coarsely labeled data is a practical and challenging task that can bridge the gap between the demand for fine-grained analysis and the high cost of annotation. Previous works mainly focus on instance-level discrimination to learn low-level features but ignore semantic similarities between data, which may prevent these models from learning compact cluster representations. In this paper, we propose Denoised Neighborhood Aggregation (DNA), a self-supervised framework that encodes the semantic structures of data into the embedding space. Specifically, we retrieve the k-nearest neighbors of a query as its positive keys to capture semantic similarities between data, and then aggregate information from these neighbors to learn compact cluster representations, which makes fine-grained categories more separable. However, the retrieved neighbors can be noisy and contain many false-positive keys, which degrades the quality of the learned embeddings. To cope with this challenge, we propose three principles for filtering out these false neighbors to improve representation learning. Furthermore, we theoretically justify that the learning objective of our framework is equivalent to a clustering loss, which captures semantic similarities between data to form compact fine-grained clusters. Extensive experiments on three benchmark datasets show that our method retrieves more accurate neighbors (21.31% accuracy improvement) and outperforms state-of-the-art models by a large margin (average 9.96% improvement on three metrics). Our code and data are available at https://github.com/Lackel/DNA.
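The retrieve, filter, and aggregate pipeline described above can be illustrated with a minimal NumPy sketch. It is not the released implementation. The abstract does not spell out the three filtering principles, so the two filters used here (matching coarse labels and mutual neighborhood) are stand-in assumptions. The function names, the temperature value, and the multi-positive contrastive form of the loss are likewise hypothetical choices made for illustration.

```python
import numpy as np


def l2_normalize(x):
    """Scale each row to unit length so dot products are cosine similarities."""
    return x / np.linalg.norm(x, axis=1, keepdims=True)


def knn_indices(emb, k):
    """For each row, return the indices of its k most cosine-similar other rows."""
    z = l2_normalize(emb)
    sim = z @ z.T
    np.fill_diagonal(sim, -np.inf)  # a query is never its own neighbor
    return np.argsort(-sim, axis=1)[:, :k]


def filter_neighbors(nbrs, coarse_labels):
    """Boolean mask over nbrs marking which retrieved neighbors to keep.

    Assumed stand-in filters (not necessarily the paper's three principles):
    the neighbor shares the query's coarse label, and the query also appears
    in the neighbor's own k-NN list.
    """
    n, k = nbrs.shape
    keep = np.zeros((n, k), dtype=bool)
    for i in range(n):
        for col, j in enumerate(nbrs[i]):
            same_coarse = coarse_labels[i] == coarse_labels[j]
            mutual = i in nbrs[j]
            keep[i, col] = same_coarse and mutual
    return keep


def neighborhood_loss(emb, nbrs, keep, temperature=0.1):
    """Contrastive loss that treats the kept neighbors as positive keys."""
    z = l2_normalize(emb)
    logits = z @ z.T / temperature
    np.fill_diagonal(logits, -np.inf)  # exclude self-similarity from the softmax
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_prob = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    per_query = []
    for i in range(len(z)):
        positives = nbrs[i][keep[i]]
        if positives.size:  # skip queries whose neighbors were all filtered out
            per_query.append(-log_prob[i, positives].mean())
    return float(np.mean(per_query)) if per_query else 0.0


# Usage on random data: 20 points, 8 dimensions, 2 coarse classes.
rng = np.random.default_rng(0)
demo_emb = rng.normal(size=(20, 8))
demo_coarse = rng.integers(0, 2, size=20)
demo_nbrs = knn_indices(demo_emb, k=5)
demo_keep = filter_neighbors(demo_nbrs, demo_coarse)
demo_loss = neighborhood_loss(demo_emb, demo_nbrs, demo_keep)
```

In this sketch, minimizing the loss pulls each query toward its kept neighbors and pushes it away from all other points. This is one way to read the abstract's claim that aggregating over denoised neighbors yields compact clusters.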