Most recent few-shot learning algorithms are based on transfer learning, where a model is pre-trained on a large amount of source data and then updated using a small amount of target data. In transfer-based few-shot learning, sophisticated pre-training methods have been widely studied to obtain universal and improved representations. However, there has been little study of how to update pre-trained models for few-shot learning. In this paper, we compare the two popular updating methods, fine-tuning (i.e., updating the entire network) and linear probing (i.e., updating only the linear classifier), considering the distribution shift between the source and target data. We find that fine-tuning outperforms linear probing as the number of samples increases, regardless of distribution shift. Next, we investigate when data augmentation is and is not effective when fine-tuning pre-trained models. Our fundamental analyses demonstrate that careful consideration of the details of updating pre-trained models is required for better few-shot performance.
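To make the distinction between the two updating methods concrete, the following is a minimal sketch in a PyTorch-style setup: linear probing freezes the pre-trained backbone and trains only the linear classifier, while fine-tuning updates all parameters. The backbone architecture, layer sizes, and the build_optimizer helper are illustrative assumptions, not taken from the paper.

```python
import torch
import torch.nn as nn

# Stand-in for any pre-trained encoder (illustrative only).
backbone = nn.Sequential(
    nn.Flatten(),
    nn.Linear(784, 256),
    nn.ReLU(),
)
# New linear head for a 5-way few-shot target task (illustrative).
classifier = nn.Linear(256, 5)
model = nn.Sequential(backbone, classifier)

def build_optimizer(model, backbone, classifier, mode, lr=1e-3):
    """Choose which parameters are updated on the target data."""
    if mode == "linear_probing":
        # Freeze the pre-trained backbone; only the classifier is trained.
        for p in backbone.parameters():
            p.requires_grad = False
        params = classifier.parameters()
    elif mode == "fine_tuning":
        # Update the entire network, backbone included.
        for p in model.parameters():
            p.requires_grad = True
        params = model.parameters()
    else:
        raise ValueError(f"unknown mode: {mode}")
    return torch.optim.SGD(params, lr=lr)

optimizer = build_optimizer(model, backbone, classifier, "linear_probing")
```

Both variants then run the same training loop on the small target set; only the set of trainable parameters differs.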