The growing prominence of large language models (LLMs) in everyday life is largely owed to their generative abilities, though part of that attention also stems from the risks and costs associated with their use. On one front is their tendency to hallucinate false or misleading information, which limits their reliability. On another is the increasing focus on the computational limitations of traditional self-attention-based LLMs, which has spurred alternative architectures, recurrent models in particular, designed to overcome them. Yet these two concerns are rarely considered together. Do changes in architecture exacerbate or alleviate existing concerns about hallucination? Do they change how and where hallucinations occur? Through an extensive evaluation, we study how these architectural inductive biases affect the propensity to hallucinate. While hallucination is a general phenomenon not limited to any specific architecture, the situations in which hallucinations occur and the ease with which specific types can be induced differ significantly across model architectures. These findings highlight the need to better understand these two problems in conjunction with each other, and to consider how to design more universal techniques for handling hallucinations.