In few-shot learning, the selection of samples has a significant impact on the performance of the model. While effective sample selection strategies are well-established in supervised settings, research on large language models largely overlooks them, favouring strategies specifically tailored to individual in-context learning settings. In this paper, we propose a new method for Automatic Combination of SamplE Selection Strategies (ACSESS) to leverage the strengths and complementarity of various well-established selection objectives. We investigate and compare the impact of 23 sample selection strategies on the performance of 5 in-context learning models and 3 few-shot learning approaches (meta-learning, few-shot fine-tuning) over 6 text and 8 image datasets. The experimental results show that combining strategies through the ACSESS method consistently outperforms all individual selection strategies and performs on par with or better than the baselines designed specifically for in-context learning. Lastly, we demonstrate that sample selection remains effective even on smaller datasets, yielding the greatest benefits when only a few shots are selected, while its advantage diminishes as the number of shots increases.