Large Language Models (LLMs) have revolutionized Natural Language Processing (NLP). Although convenient for research and practical applications, open-source LLMs with fewer parameters often suffer from severe hallucinations compared to their larger counterparts. This paper focuses on measuring and reducing hallucinations in BLOOM 7B, a representative of such weaker open-source LLMs that are publicly available for research and commercial applications. We introduce HaloCheck, a lightweight, black-box, knowledge-free framework designed to quantify the severity of hallucinations in LLMs. Additionally, we explore techniques such as knowledge injection and teacher-student approaches to alleviate hallucinations in low-parameter LLMs. Our experiments demonstrate that these techniques reduce hallucinations in domains that are challenging for these LLMs.