Pre-trained Language Models (PLMs), as parametric eager learners, have become the de facto choice in current Natural Language Processing (NLP) paradigms. In contrast, k-Nearest-Neighbor (k-NN) classifiers, as lazy learners, tend to mitigate over-fitting and isolated noise. In this paper, we revisit k-NN classifiers for augmenting PLM-based classifiers. At the methodological level, we propose to adopt k-NN with the textual representations of PLMs in two steps: (1) utilize k-NN as prior knowledge to calibrate the training process; (2) linearly interpolate the probability distribution predicted by k-NN with that of the PLM's classifier. At the heart of our approach is k-NN-calibrated training, which treats the k-NN predictions as indicators of easy versus hard examples during training. To cover a diverse range of application scenarios, we conduct extensive experiments under both the fine-tuning and prompt-tuning paradigms, in zero-shot, few-shot, and fully-supervised settings, across eight diverse end-tasks. We hope our exploration will encourage the community to revisit the power of classical methods for efficient NLP\footnote{Code and datasets are available at https://github.com/zjunlp/Revisit-KNN.}.
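Step (2) above, the linear interpolation of the k-NN distribution with the classifier distribution, can be illustrated with a minimal sketch. This is not the released implementation: the function names `knn_probs` and `interpolate`, the squared Euclidean distance, the softmax weighting over negative distances, the `temperature` parameter, and the interpolation weight `lam` are all assumptions made for illustration, and the toy vectors stand in for PLM representations.

```python
import numpy as np

def knn_probs(query, keys, labels, num_classes, k=3, temperature=1.0):
    """Class distribution from the k nearest stored representations (illustrative)."""
    # Squared Euclidean distance from the query to every stored key (assumed metric).
    dists = np.sum((keys - query) ** 2, axis=1)
    nearest = np.argsort(dists)[:k]
    # Softmax over negative distances so that closer neighbors weigh more.
    logits = -dists[nearest] / temperature
    weights = np.exp(logits - logits.max())
    weights /= weights.sum()
    # Accumulate the neighbor weights per class label.
    probs = np.zeros(num_classes)
    np.add.at(probs, labels[nearest], weights)
    return probs

def interpolate(p_knn, p_plm, lam=0.5):
    """Linear interpolation of the k-NN and classifier distributions."""
    return lam * p_knn + (1.0 - lam) * p_plm

# Toy datastore: four 2-d "representations" with binary labels (hypothetical data).
keys = np.array([[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [0.9, 1.1]])
labels = np.array([0, 0, 1, 1])

query = np.array([0.05, 0.0])
p_plm = np.array([0.4, 0.6])  # stand-in for the classifier's softmax output
p_knn = knn_probs(query, keys, labels, num_classes=2, k=2)
p_final = interpolate(p_knn, p_plm, lam=0.5)
print(p_knn, p_final)
```

In this toy case the classifier alone slightly prefers class 1, while both nearest neighbors carry label 0, so the interpolated distribution favors class 0. How `lam` and `k` should be set in practice is not stated in the abstract.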