NIFF: Alleviating Forgetting in Generalized Few-Shot Object Detection via Neural Instance Feature Forging

Privacy and memory are two recurring themes in a broad conversation about the societal impact of AI. These concerns arise from the need for huge amounts of data to train deep neural networks. A promise of Generalized Few-shot Object Detection (G-FSOD), a learning paradigm in AI, is to alleviate the need for collecting abundant training samples of novel classes we wish to detect by leveraging prior knowledge from old classes (i.e., base classes). G-FSOD strives to learn these novel classes while alleviating catastrophic forgetting of the base classes. However, existing approaches assume that the base images are accessible, an assumption that does not hold when sharing and storing data is problematic. In this work, we propose the first data-free knowledge distillation (DFKD) approach for G-FSOD that leverages the statistics of the region of interest (RoI) features from the base model to forge instance-level features without accessing the base images. Our contribution is three-fold: (1) we design a standalone lightweight generator with (2) class-wise heads (3) to generate and replay diverse instance-level base features to the RoI head while finetuning on the novel data. This stands in contrast to standard DFKD approaches in image classification, which invert the entire network to generate base images. Moreover, we make careful design choices in the novel finetuning pipeline to regularize the model. We show that our approach can dramatically reduce the base memory requirements, all while setting a new standard for G-FSOD on the challenging MS-COCO and PASCAL-VOC benchmarks.
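
The abstract does not give the generator architecture or the training objective, so the following is only a minimal sketch of the general idea: a small generator with a shared trunk and one head per base class maps noise to instance-level features, and a loss compares the statistics of the forged features with stored per-class RoI statistics. All names (`init_generator`, `forge_features`, `stats_matching_loss`), the layer sizes, and the choice of matching per-dimension mean and variance are assumptions made for illustration, not the authors' implementation.

```python
import numpy as np


def init_generator(rng, num_classes, noise_dim, hidden_dim, feat_dim):
    """Shared trunk plus one linear head per base class (sizes are illustrative)."""
    return {
        "trunk_w": rng.normal(0.0, 0.1, (noise_dim, hidden_dim)),
        "trunk_b": np.zeros(hidden_dim),
        "head_w": rng.normal(0.0, 0.1, (num_classes, hidden_dim, feat_dim)),
        "head_b": np.zeros((num_classes, feat_dim)),
    }


def forge_features(params, noise, class_id):
    """Map noise of shape (batch, noise_dim) to forged features for one class."""
    hidden = np.maximum(noise @ params["trunk_w"] + params["trunk_b"], 0.0)  # ReLU
    return hidden @ params["head_w"][class_id] + params["head_b"][class_id]


def stats_matching_loss(features, target_mean, target_var):
    """Squared gap between batch statistics and stored class statistics.

    Assumption: the stored base-model statistics are a per-dimension mean
    and variance for each class; the abstract does not specify this.
    """
    mean_gap = features.mean(axis=0) - target_mean
    var_gap = features.var(axis=0) - target_var
    return float(np.sum(mean_gap ** 2) + np.sum(var_gap ** 2))


# Usage with made-up sizes and placeholder target statistics.
rng = np.random.default_rng(0)
params = init_generator(rng, num_classes=3, noise_dim=8, hidden_dim=16, feat_dim=32)
noise = rng.normal(size=(64, 8))
fake = forge_features(params, noise, class_id=1)
loss = stats_matching_loss(fake, np.zeros(32), np.ones(32))
```

In such a setup, minimizing the loss with respect to the generator parameters would need only the stored statistics rather than the base images, and the forged features could then be replayed to the RoI head during novel-class finetuning. The optimization loop and the detector itself are omitted here.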
