ESPT: A Self-Supervised Episodic Spatial Pretext Task for Improving Few-Shot Learning

Self-supervised learning (SSL) techniques have recently been integrated into the few-shot learning (FSL) framework and have shown promising results in improving few-shot image classification performance. However, existing SSL approaches used in FSL typically derive their supervision signals from the global embedding of each individual image. Consequently, during the episodic training of FSL, these methods cannot capture and fully utilize two kinds of information that are beneficial to FSL: the local visual information in image samples and the data structure information of the whole episode. To this end, we propose to augment the few-shot learning objective with a novel self-supervised Episodic Spatial Pretext Task (ESPT). Specifically, for each few-shot episode, we generate its corresponding transformed episode by applying a random geometric transformation to all the images in it. Based on these two episodes, our ESPT objective is defined as maximizing the local spatial relationship consistency between the original episode and the transformed one. With this definition, the ESPT-augmented FSL objective promotes learning more transferable feature representations that capture the local spatial features of different images and their inter-relational structural information in each input episode, thus enabling the model to generalize better to new categories with only a few samples. Extensive experiments indicate that our ESPT method achieves new state-of-the-art performance for few-shot image classification on three mainstay benchmark datasets. The source code will be available at: https://github.com/Whut-YiRong/ESPT.
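The pretext task described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The abstract does not specify the transformation family, the encoder, or the consistency measure, so everything here is an assumption made for illustration: the transformation is a rotation by a multiple of 90 degrees, the "encoder" is plain average pooling over patches, local spatial relationships are cosine similarities between local features across the episode, and consistency is measured by a mean squared difference. All function names are hypothetical.

```python
import numpy as np

def rotate_episode(images, k):
    """Apply the same rotation (k * 90 degrees) to every image in the episode.

    images: array of shape (N, H, W, C). Rotations are an assumed choice of
    geometric transformation; the abstract does not name one.
    """
    return np.rot90(images, k=k, axes=(1, 2))

def local_features(images, patch=4):
    """Toy stand-in for a CNN encoder: average-pool non-overlapping patches.

    Returns a grid of local features with shape (N, H // patch, W // patch, C).
    Assumes H and W are divisible by `patch`.
    """
    n, h, w, c = images.shape
    gh, gw = h // patch, w // patch
    x = images.reshape(n, gh, patch, gw, patch, c)
    return x.mean(axis=(2, 4))

def relation_matrix(feats):
    """Cosine similarity between every pair of local features in the episode."""
    n, gh, gw, c = feats.shape
    flat = feats.reshape(n * gh * gw, c)
    flat = flat / (np.linalg.norm(flat, axis=1, keepdims=True) + 1e-8)
    return flat @ flat.T

def espt_consistency_loss(images, k, patch=4):
    """Discrepancy between local relations of the original and rotated episode.

    Minimizing this value corresponds to maximizing consistency (assumed form).
    """
    feats = local_features(images, patch)
    feats_t = local_features(rotate_episode(images, k), patch)
    # Undo the rotation on the feature grid so local positions line up again.
    feats_t_aligned = np.rot90(feats_t, k=-k, axes=(1, 2))
    diff = relation_matrix(feats) - relation_matrix(feats_t_aligned)
    return float(np.mean(diff ** 2))
```

Because average pooling commutes with 90-degree rotations, this toy encoder yields a loss of essentially zero; a learned encoder generally would not, and under the assumptions above a term of this kind would be added to the few-shot classification loss to push its local features toward such consistency.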


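Few-shot learning

Few-shot learning (FSL) addresses how to improve a neural network's performance when only a small amount of data is available. One family of methods commonly used in FSL is meta-learning. Like ordinary neural network training, meta-learning has a training phase and a testing phase, which are called meta-training and meta-testing, respectively.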
