Improving Few-Shot Generalization by Exploring and Exploiting Auxiliary Data

Few-shot learning involves learning an effective model from only a few labeled datapoints. The small training set makes overfitting hard to avoid, but it also makes few-shot learning applicable to many important real-world settings. In this work, we focus on Few-shot Learning with Auxiliary Data (FLAD), a training paradigm that assumes access to auxiliary data during few-shot learning with the aim of improving generalization. Introducing auxiliary data during few-shot learning raises essential design choices, and hand-designed heuristics for these choices can lead to sub-optimal performance. We study automated sampling strategies for FLAD and relate them to the explore-exploit dilemma that is central to multi-armed bandit settings. Based on this connection, we propose two algorithms, EXP3-FLAD and UCB1-FLAD, and compare them with methods that either only explore or only exploit, finding that the combination of exploration and exploitation is crucial. Using our proposed algorithms to train T5 yields a 9% absolute improvement over the explicitly multi-task pre-trained T0 model across 11 datasets.
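To make the bandit connection concrete, the following is a minimal sketch of the classic UCB1 rule applied to choosing which auxiliary dataset to sample from next. It is not the authors' UCB1-FLAD implementation: the abstract does not specify how rewards are computed, so `reward_fn` is a hypothetical stand-in, and the function names `ucb1_select` and `run_ucb1` are assumptions made for illustration. Rewards are assumed to lie in [0, 1], as in the standard UCB1 setting.

```python
import math


def ucb1_select(counts, mean_rewards, total_pulls):
    """Return the index of the arm (auxiliary dataset) with the highest UCB1 score."""
    # Exploration bootstrap: sample every arm once before using the bounds.
    for arm, n in enumerate(counts):
        if n == 0:
            return arm
    # Exploitation term (mean reward) plus exploration bonus (confidence width).
    scores = [
        mean_rewards[arm] + math.sqrt(2.0 * math.log(total_pulls) / counts[arm])
        for arm in range(len(counts))
    ]
    return max(range(len(counts)), key=scores.__getitem__)


def run_ucb1(reward_fn, num_arms, num_steps):
    """Run UCB1 for num_steps rounds; reward_fn(arm) is a hypothetical reward signal."""
    counts = [0] * num_arms
    mean_rewards = [0.0] * num_arms
    for step in range(1, num_steps + 1):
        arm = ucb1_select(counts, mean_rewards, step - 1)
        reward = reward_fn(arm)  # assumption: a scalar in [0, 1]
        counts[arm] += 1
        # Incremental update of the running mean reward for this arm.
        mean_rewards[arm] += (reward - mean_rewards[arm]) / counts[arm]
    return counts, mean_rewards


if __name__ == "__main__":
    # Toy setup: auxiliary dataset 2 is always useful, the others never are.
    counts, means = run_ucb1(lambda arm: 1.0 if arm == 2 else 0.0,
                             num_arms=3, num_steps=300)
    print(counts, means)
```

In this toy run the sampler keeps trying every dataset occasionally (exploration) while concentrating most samples on the one with the highest observed reward (exploitation), which is the trade-off the abstract describes.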

