The Prototypical Network (ProtoNet) has become a popular choice for Few-shot Learning (FSL) thanks to its strong performance and straightforward implementation. Building on this success, we first propose a simple yet novel method to fine-tune a ProtoNet on the (labeled) support set of a C-way K-shot test episode, without touching the query set, which is reserved for evaluation. We then propose an algorithmic framework that combines ProtoNet with optimization-based FSL algorithms (MAML and Meta-Curvature) to work with this fine-tuning method. Since optimization-based algorithms endow the target learner with the ability to adapt rapidly from only a few samples, we use ProtoNet as the target model and enhance its fine-tuning performance via a specifically designed episodic fine-tuning strategy. Experimental results confirm that our proposed models, MAML-Proto and MC-Proto, combined with our fine-tuning method, outperform the regular ProtoNet by a large margin on few-shot audio classification tasks on the ESC-50 and Speech Commands v2 datasets. Although we apply the model only to the audio domain here, the method is general and can readily be extended to other domains.
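To make the ProtoNet component concrete, the following is a minimal NumPy sketch of the inference step that the fine-tuning strategy above builds on: class prototypes are the mean support embeddings, and queries are assigned to the nearest prototype. Function names, the toy 2-way 3-shot episode, and the fixed 2-D "embeddings" are illustrative assumptions, not the paper's actual backbone or data.

```python
import numpy as np

def prototypes(support_emb, support_y, n_way):
    # Class prototype = mean embedding of that class's support examples.
    return np.stack([support_emb[support_y == c].mean(axis=0)
                     for c in range(n_way)])

def classify(query_emb, protos):
    # Nearest-prototype classification under squared Euclidean distance.
    dists = ((query_emb[:, None, :] - protos[None, :, :]) ** 2).sum(axis=-1)
    return dists.argmin(axis=1)

# Toy 2-way 3-shot episode in a hand-crafted 2-D embedding space
# (a real ProtoNet would produce these embeddings with a trained encoder).
support_y = np.array([0, 0, 0, 1, 1, 1])
support_emb = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
                        [1.0, 1.0], [0.9, 1.0], [1.0, 0.9]])
protos = prototypes(support_emb, support_y, n_way=2)

query_emb = np.array([[0.05, 0.05], [0.95, 0.95]])
pred = classify(query_emb, protos)
print(pred)  # → [0 1]
```

The support-set fine-tuning described in the abstract would, under this sketch, update the encoder that produces `support_emb` using only the labeled support examples, then recompute prototypes before classifying the query set.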