In this paper, we study a practical yet challenging task, On-the-fly Category Discovery (OCD), which aims to discover, online, the categories of newly arriving stream data that may belong to either known or unknown classes, leveraging only the known-category knowledge contained in labeled data. Previous OCD methods employ hash-based techniques, representing old and new categories by hash codes for instance-wise inference. However, directly mapping features into a low-dimensional hash space not only inevitably damages the ability to distinguish classes but also causes a "high sensitivity" issue, especially for fine-grained classes, leading to inferior performance. To address these issues, we propose a novel Prototypical Hash Encoding (PHE) framework consisting of Category-aware Prototype Generation (CPG) and Discriminative Category Encoding (DCE), which mitigates the sensitivity of hash codes while preserving the rich discriminative information contained in the high-dimensional feature space, in a two-stage projection fashion. CPG enables the model to fully capture intra-category diversity by representing each category with multiple prototypes. DCE boosts the discriminative ability of hash codes with the guidance of the generated category prototypes and the constraint of a minimum separation distance. By jointly optimizing CPG and DCE, we demonstrate that these two components are mutually beneficial for effective OCD. Extensive experiments show the significant superiority of our PHE over previous methods, e.g., an improvement of +5.3% in ALL ACC averaged over all datasets. Moreover, owing to the interpretable nature of the prototypes, we visually analyze the underlying mechanism by which PHE groups certain samples into either known or unknown categories. Code is available at https://github.com/HaiyangZheng/PHE.
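To make the "high sensitivity" issue and the role of a minimum separation distance more concrete, the following is a minimal sketch of hash-code-based on-the-fly assignment. It is not the authors' implementation: the names `sign_hash`, `hamming`, and `OnTheFlyAssigner`, the toy 6-bit codes, and the rule of accepting a sample when it lies within a fixed Hamming radius of a known category code are all illustrative assumptions. With `radius=0` the sketch reduces to exact code matching, where a single flipped bit creates a new category; a positive radius tolerates small perturbations, and the accepted regions of two known codes cannot overlap as long as those codes differ in at least `2 * radius + 1` bits.

```python
from typing import Dict, Sequence, Tuple

Code = Tuple[int, ...]


def sign_hash(features: Sequence[float]) -> Code:
    """Binarize a projected feature vector, one bit per dimension."""
    return tuple(1 if x > 0 else 0 for x in features)


def hamming(a: Code, b: Code) -> int:
    """Number of positions at which two equal-length codes differ."""
    return sum(x != y for x, y in zip(a, b))


class OnTheFlyAssigner:
    """Hypothetical instance-wise assigner over hash codes.

    Assumes `known_codes` is non-empty and holds one code per known category.
    """

    def __init__(self, known_codes: Dict[str, Code], radius: int) -> None:
        self.codes = dict(known_codes)  # category name -> hash code
        self.radius = radius            # tolerated Hamming distance
        self.num_new = 0

    def assign(self, features: Sequence[float]) -> str:
        code = sign_hash(features)
        # Find the closest stored category code in Hamming distance.
        name, dist = min(
            ((n, hamming(code, c)) for n, c in self.codes.items()),
            key=lambda pair: pair[1],
        )
        if dist <= self.radius:
            return name
        # Too far from every stored code: register a new category on the fly.
        self.num_new += 1
        new_name = f"new_{self.num_new}"
        self.codes[new_name] = code
        return new_name


known = {"bird": (1, 1, 1, 1, 0, 0), "cat": (0, 0, 0, 0, 1, 1)}
assigner = OnTheFlyAssigner(known, radius=1)

near_bird = [0.5, 0.2, -0.1, 0.9, -0.3, -0.4]   # one bit away from "bird"
far_sample = [-1.0, 1.0, -1.0, 1.0, -1.0, 1.0]  # three bits from both codes

label_near = assigner.assign(near_bird)
label_far = assigner.assign(far_sample)
label_far_again = assigner.assign(far_sample)
```

In this toy run, `near_bird` differs from the stored "bird" code in one bit, so exact matching would treat it as a new category, while the radius-based rule still returns "bird". The sample `far_sample` is three bits away from both known codes, so it is registered as a new category, and a second sample with the same code is then assigned to that same new category.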