Learning with limited labelled data can improve performance when labels are scarce, but it is also sensitive to uncontrolled randomness introduced by so-called randomness factors (e.g., the order of the data). We propose a method to systematically investigate the effects of randomness factors while taking the interactions between them into account. To measure the true effect of an individual randomness factor, our method mitigates the effects of the other factors and observes how performance varies across multiple runs. We apply the method to multiple randomness factors across in-context learning and fine-tuning approaches on 7 representative text classification tasks, and across meta-learning on 3 tasks, and show that: 1) disregarding interactions between randomness factors in existing works led to inconsistent findings, because the effects of randomness factors were attributed incorrectly; for example, our results disprove the consistent sensitivity of in-context learning to sample order even with random sample selection; and 2) besides mutual interactions, the effects of randomness factors, especially sample order, also depend on more systematic choices unexplored in existing works, such as the number of classes, the number of samples per class, or the choice of prompt format.
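The idea of measuring one randomness factor while mitigating the others can be illustrated with a minimal sketch. It is not the paper's implementation: the function `factor_effect`, the seed dictionary, the run counts, and the deterministic stand-in `toy_run` are all hypothetical, and the use of a population standard deviation over averaged scores is an assumption about how "how the performance varies" might be summarised.

```python
import statistics
from typing import Callable, Dict, List


def factor_effect(
    run: Callable[[Dict[str, int]], float],
    factor: str,
    other_factors: List[str],
    n_investigation: int = 10,
    n_mitigation: int = 10,
) -> float:
    """Estimate how much `factor` alone moves the score (hypothetical helper).

    For each configuration (seed) of the investigated factor, the score is
    averaged over several configurations of the other factors, which damps
    their influence. The spread of these averages is attributed to `factor`.
    """
    averaged_scores = []
    for investigated_seed in range(n_investigation):
        scores = []
        for mitigation_seed in range(n_mitigation):
            # Vary every other factor while the investigated one stays fixed.
            seeds = {name: mitigation_seed for name in other_factors}
            seeds[factor] = investigated_seed
            scores.append(run(seeds))
        averaged_scores.append(statistics.mean(scores))
    # Assumption: population standard deviation as the measure of spread.
    return statistics.pstdev(averaged_scores)


def toy_run(seeds: Dict[str, int]) -> float:
    """Deterministic stand-in for training and evaluating a model.

    Assumed toy behaviour: sample selection shifts the score five times
    more than sample order does, and the two effects are purely additive.
    """
    order_shift = 0.01 * ((seeds["sample_order"] % 5) - 2)
    selection_shift = 0.05 * ((seeds["sample_selection"] % 5) - 2)
    return 0.80 + order_shift + selection_shift


if __name__ == "__main__":
    order = factor_effect(toy_run, "sample_order", ["sample_selection"])
    selection = factor_effect(toy_run, "sample_selection", ["sample_order"])
    print(f"sample order effect:     {order:.4f}")
    print(f"sample selection effect: {selection:.4f}")
```

In this toy setting the effects are additive, so averaging over the other factor removes its influence exactly. With real models the factors interact, which is why the averaging step only mitigates the other factors and why the number of mitigation runs matters.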