While learning with limited labelled data can improve performance when labels are scarce, it is also sensitive to the effects of uncontrolled randomness introduced by so-called randomness factors (e.g., varying the order of data). We propose a method to systematically investigate the effects of randomness factors while taking the interactions between them into account. To measure the true effect of an individual randomness factor, our method mitigates the effects of the other factors and observes how performance varies across multiple runs. Applying our method to multiple randomness factors across in-context learning and fine-tuning approaches on 7 representative text classification tasks and meta-learning on 3 tasks, we show that: 1) disregarding interactions between randomness factors in existing works has caused inconsistent findings due to incorrect attribution of the effects of randomness factors, such as disproving the consistent sensitivity of in-context learning to sample order even with random sample selection; and 2) besides mutual interactions, the effects of randomness factors, especially sample order, also depend on more systematic choices unexplored in existing works, such as the number of classes, the number of samples per class, or the choice of prompt format.
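The idea of isolating one randomness factor while mitigating the others can be illustrated with a minimal sketch. It is only one plausible reading of the description above, not the method as specified in the paper. It assumes that mitigation means averaging scores over several configurations of the non-investigated factor, and that the effect is reported as the standard deviation of those averages. The names `toy_run` and `factor_effect`, the seeds, and the noise magnitudes are all hypothetical.

```python
import random
import statistics


def toy_run(order_seed: int, selection_seed: int) -> float:
    """Hypothetical stand-in for one train/evaluate run; returns a score.

    The score mixes a contribution from the investigated factor (sample
    order), one from another factor (sample selection), and an interaction
    term that depends on both seeds. All magnitudes are made up.
    """
    order_part = random.Random(1_000 + order_seed).gauss(0.0, 0.02)
    selection_part = random.Random(2_000 + selection_seed).gauss(0.0, 0.05)
    interaction = random.Random(order_seed * 7919 + selection_seed).gauss(0.0, 0.01)
    return 0.80 + order_part + selection_part + interaction


def factor_effect(run, investigated_seeds, mitigation_seeds) -> float:
    """Estimate the effect of one factor while mitigating another.

    For each configuration of the investigated factor, the score is averaged
    over many configurations of the other factor (the assumed mitigation
    step). The spread of these averages is taken as the factor's effect.
    """
    means = []
    for inv_seed in investigated_seeds:
        scores = [run(inv_seed, mit_seed) for mit_seed in mitigation_seeds]
        means.append(statistics.mean(scores))
    return statistics.stdev(means)


if __name__ == "__main__":
    effect = factor_effect(toy_run, range(10), range(20))
    print(f"Estimated effect of sample order (std of means): {effect:.4f}")
```

Averaging over the inner loop is what keeps the variance of the other factor from being attributed to the investigated one. A naive comparison that changes both seeds at once would conflate the two.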