Although large language models (LLMs) demonstrate superb performance on various NLP tasks, they still suffer from hallucination, which threatens their reliability. To measure how much an LLM hallucinates, previous works first categorize hallucinations by the similarity of the observed phenomena, then quantify the proportion of model outputs that contain hallucinatory content. However, such hallucination rates can easily be distorted by confounders. Moreover, they cannot reveal the reasons for hallucination, because similar hallucinatory phenomena may originate from different sources. To address these issues, we propose to combine hallucination level quantification with hallucination reason investigation through an association analysis, which relates the hallucination rate of LLMs to a set of risk factors. In this way, we can observe the hallucination level under each value of each risk factor and examine the contribution and statistical significance of each factor while excluding the confounding effect of the others. Additionally, by identifying the risk factors according to a taxonomy of model capability, we reveal a set of potential deficiencies in commonsense memorization, relational reasoning, and instruction following, which may further guide the pretraining and supervised fine-tuning of LLMs to mitigate hallucination.
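To make the confounding argument concrete, the following is a minimal sketch of one standard association analysis, not necessarily the model used in this work. It assumes a single binary risk factor and a single categorical confounder, and it swaps in Mantel-Haenszel stratification as the adjustment technique. The stratum names and all counts are hypothetical, invented only so that the raw hallucination rates look strongly associated with the risk factor while the association disappears once the confounder is held fixed.

```python
import math

# Hypothetical counts, invented for illustration only (not results from the paper).
# Each stratum is one level of an assumed confounder (here, task difficulty).
# Tuple layout: (risk factor present & hallucinated, present & not hallucinated,
#                risk factor absent & hallucinated,  absent & not hallucinated)
strata = {
    "task_easy": (10, 90, 40, 360),
    "task_hard": (200, 200, 50, 50),
}


def crude_odds_ratio(tables):
    """Odds ratio after pooling all strata, i.e. ignoring the confounder."""
    a = sum(t[0] for t in tables)
    b = sum(t[1] for t in tables)
    c = sum(t[2] for t in tables)
    d = sum(t[3] for t in tables)
    return (a * d) / (b * c)


def mantel_haenszel_odds_ratio(tables):
    """Common odds ratio across strata, adjusted for the stratifying variable."""
    num = sum(a * d / (a + b + c + d) for a, b, c, d in tables)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in tables)
    return num / den


def cmh_p_value(tables):
    """Cochran-Mantel-Haenszel test (no continuity correction), 1 degree of freedom."""
    diff = 0.0
    var = 0.0
    for a, b, c, d in tables:
        n = a + b + c + d
        # Observed minus expected count in the (present, hallucinated) cell.
        diff += a - (a + b) * (a + c) / n
        # Hypergeometric variance of that cell under no association.
        var += (a + b) * (c + d) * (a + c) * (b + d) / (n * n * (n - 1))
    stat = diff * diff / var
    # Survival function of a chi-square variable with 1 degree of freedom.
    return math.erfc(math.sqrt(stat / 2.0))


tables = list(strata.values())
print(f"crude odds ratio:    {crude_odds_ratio(tables):.2f}")
print(f"adjusted odds ratio: {mantel_haenszel_odds_ratio(tables):.2f}")
print(f"CMH p-value:         {cmh_p_value(tables):.3f}")
```

With these invented counts, the pooled odds ratio is about 3.3, while the stratum-adjusted odds ratio is 1.0 with a p-value of 1.0: the apparent effect comes entirely from the risk factor being more common in the harder task. With several risk factors at once, a regression model such as logistic regression would be the more natural choice; stratification is used here only because it fits in a few lines.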