Large Language Models (LLMs) have shown great potential in Natural Language Processing (NLP) tasks. However, recent literature reveals that LLMs intermittently generate nonfactual responses, which undermines their reliability and limits their further use. In this paper, we propose a novel self-detection method that identifies the questions an LLM does not know and for which it is therefore prone to generate nonfactual results. Specifically, we first diversify the textual expressions of a given question and collect the corresponding answers. We then examine the divergences between the generated answers to identify the questions for which the model may generate falsehoods. All of these steps can be accomplished by prompting the LLMs themselves, without referring to any external resources. We conduct comprehensive experiments and demonstrate the effectiveness of our method on recently released LLMs, e.g., Vicuna, ChatGPT, and GPT-4.
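The two steps described in the abstract (ask the same question in several phrasings, then measure how much the answers disagree) can be illustrated with a minimal Python sketch. It is not the authors' implementation. Several things here are assumptions made only for illustration: the `ask` callable stands in for whatever LLM is being probed, the rephrasings are supplied by the caller rather than generated by prompting the model, answers are grouped by a crude string normalization, and disagreement is scored with Shannon entropy against a hypothetical `threshold`. The abstract itself does not specify the divergence measure.

```python
import math
from collections import Counter
from typing import Callable, Iterable, Tuple


def normalize(answer: str) -> str:
    """Crude canonical form so trivially different strings compare equal."""
    return " ".join(answer.lower().strip().rstrip(".").split())


def answer_divergence(answers: Iterable[str]) -> float:
    """Shannon entropy (in bits) of the normalized answer distribution.

    0.0 means every rephrasing produced the same answer; larger values
    mean the answers disagree more. Entropy is an assumed stand-in for
    the divergence measure, which the abstract does not name.
    """
    counts = Counter(normalize(a) for a in answers)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("need at least one answer")
    return sum((c / total) * math.log2(total / c) for c in counts.values())


def self_detect(
    ask: Callable[[str], str],
    rephrasings: Iterable[str],
    threshold: float = 0.5,
) -> Tuple[float, bool]:
    """Query the model once per rephrasing and flag the question if the
    answers diverge beyond `threshold` (a hypothetical cut-off)."""
    answers = [ask(q) for q in rephrasings]
    score = answer_divergence(answers)
    return score, score > threshold


if __name__ == "__main__":
    # Stub model with canned replies, used only to make the sketch runnable.
    canned = {
        "What is the capital of France?": "Paris",
        "Which city is France's capital?": "paris.",
        "France's capital city is?": " Paris ",
    }
    score, flagged = self_detect(canned.__getitem__, canned.keys())
    print(f"divergence={score:.2f} flagged={flagged}")
```

In practice `ask` would wrap a call to the model under test, and the string normalization would likely be replaced by a semantic comparison of answers, which the abstract says can also be done by prompting the model itself.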