In-context learning (ICL), which conditions inference on a few demonstrations, has become a widespread paradigm for eliciting LLM capabilities on downstream tasks. Because of context length constraints, ICL cannot improve further even when more training data is available, and the general-purpose features that the LLM produces under ICL are not adapted to the specific downstream task. In this paper, we propose a feature-adaptive and data-scalable in-context learning framework (FADS-ICL), which leverages task-adaptive features to improve inference on the downstream task under the supervision of beyond-context samples. Specifically, it first extracts general features of beyond-context samples one by one via the LLM, using the ICL input format, and then introduces a task-specific modulator that, once fitted to the downstream task, performs feature refinement and prediction. We conduct extensive experiments on FADS-ICL under varying data settings (4$\sim$128 shots) and LLM scales (0.8$\sim$70B). Experimental results show that FADS-ICL consistently outperforms previous state-of-the-art methods by a significant margin under all settings, verifying its effectiveness. For example, under the 1.5B and 32-shot setting, FADS-ICL achieves a \textbf{+14.3} average accuracy gain over vanilla ICL on 10 datasets through feature adaptation, and \textbf{+6.2} average accuracy over the previous state-of-the-art method; performance improves further as training data increases. Code and data are publicly available at \url{https://github.com/jiahaozhenbang/FADS-ICL}.
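The two-stage idea described above (extract a general feature per beyond-context sample, then fit a small task-specific modulator on those features) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the modulator is taken to be a plain softmax (multinomial logistic) classifier, the LLM call is replaced by a hypothetical `synthetic_features` helper that returns noisy class-dependent vectors, and all names and hyperparameters are invented here rather than taken from the released code.

```python
import numpy as np


def synthetic_features(labels, centers, rng, noise=0.3):
    """Hypothetical stand-in for the LLM step.

    In the framework, each beyond-context sample would be appended to the
    in-context demonstrations and passed through the LLM to obtain a general
    feature vector. Here we only simulate such vectors with noisy class centers.
    """
    dim = centers.shape[1]
    return centers[labels] + noise * rng.normal(size=(len(labels), dim))


def fit_modulator(X, y, num_classes, lr=0.1, steps=300, l2=1e-3):
    """Fit a linear softmax classifier (assumed modulator) by gradient descent."""
    n, d = X.shape
    W = np.zeros((d, num_classes))
    b = np.zeros(num_classes)
    Y = np.eye(num_classes)[y]  # one-hot targets
    for _ in range(steps):
        logits = X @ W + b
        logits -= logits.max(axis=1, keepdims=True)  # numerical stability
        P = np.exp(logits)
        P /= P.sum(axis=1, keepdims=True)
        G = (P - Y) / n  # gradient of mean cross-entropy w.r.t. logits
        W -= lr * (X.T @ G + l2 * W)
        b -= lr * G.sum(axis=0)
    return W, b


def predict(X, W, b):
    """Return the predicted class index for each feature vector."""
    return np.argmax(X @ W + b, axis=1)


rng = np.random.default_rng(0)
centers = np.stack([np.ones(8), -np.ones(8)])  # two synthetic classes, 8-dim features

# 32 labeled beyond-context samples supervise the modulator.
y_train = np.array([0, 1] * 16)
X_train = synthetic_features(y_train, centers, rng)
W, b = fit_modulator(X_train, y_train, num_classes=2)

# Held-out samples go through the same (simulated) extraction, then the modulator.
y_test = np.array([0, 1] * 10)
X_test = synthetic_features(y_test, centers, rng)
acc = float((predict(X_test, W, b) == y_test).mean())
print(f"held-out accuracy on synthetic features: {acc:.2f}")
```

In this sketch the number of supervising samples is limited only by how many feature vectors are extracted, not by the context window, which is the data-scalability argument in miniature. With a real model, `synthetic_features` would be replaced by one forward pass per sample.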