The development of generative language models that can create long, coherent textual outputs via autoregression has led to a proliferation of uses and a corresponding sweep of analyses as researchers work to determine the limitations of this new paradigm. Unlike humans, these 'Large Language Models' (LLMs) are highly sensitive to small changes in their inputs, leading to unwanted inconsistency in their behavior. One problematic inconsistency arises when LLMs are used to answer multiple-choice questions or analyze multiple inputs: order dependency. The output of an LLM can (and often does) change significantly when sub-sequences are swapped, despite both orderings being semantically identical. In this paper we present Set-Based Prompting, a technique that guarantees the output of an LLM will not have order dependence on a specified set of sub-sequences. We show that this method provably eliminates order dependency and can be applied to any transformer-based LLM to enable text generation that is unaffected by re-orderings. Examining the implications of our method, we show that, despite our inputs being out of distribution, the impact on expected accuracy is small, where the expectation is taken over uniformly random orderings of the candidate responses, and the impact in practice is usually significantly smaller. Thus, Set-Based Prompting can be used as a 'drop-in' method on fully trained models. Finally, we discuss how our method's success suggests that other strong guarantees on LLM performance can be obtained by modifying the input representations.