Large language models (LLMs) have a wealth of knowledge that allows them to excel in various Natural Language Processing (NLP) tasks. Current research focuses on enhancing their performance within their existing knowledge. Despite this vast knowledge, LLMs are still limited by the amount of information they can accommodate and comprehend. Therefore, the ability to understand their own limitations regarding the unknown, referred to as self-knowledge, is of paramount importance. This study aims to evaluate LLMs' self-knowledge by assessing their ability to identify unanswerable or unknowable questions. We introduce an automated methodology to detect uncertainty in the responses of these models, providing a novel measure of their self-knowledge. We further introduce a unique dataset, SelfAware, consisting of unanswerable questions from five diverse categories and their answerable counterparts. Our extensive analysis, involving 20 LLMs including GPT-3, InstructGPT, and LLaMA, reveals an intrinsic capacity for self-knowledge within these models. Moreover, we demonstrate that in-context learning and instruction tuning can further enhance this self-knowledge. Despite this promising insight, our findings also highlight a considerable gap between the capabilities of these models and human proficiency in recognizing the limits of their knowledge.
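To make the idea of automatically detecting uncertainty in responses more concrete, the following is a minimal sketch and not the method used in this study. It assumes a small, hypothetical list of reference phrases, a simple token-overlap (Jaccard) similarity, an arbitrary threshold of 0.5, and an F1 score that treats unanswerable questions as the positive class. All names and values below are illustrative assumptions.

```python
import re

# Hypothetical reference phrases that signal uncertainty (an assumption,
# not a list taken from the study).
UNCERTAIN_REFERENCES = [
    "the answer is unknown",
    "it is impossible to know",
    "there is no definitive answer",
    "i am not sure",
]


def tokenize(text):
    """Lowercase a string and return its set of word tokens."""
    return set(re.findall(r"[a-z']+", text.lower()))


def jaccard(a, b):
    """Token-overlap similarity between two strings, in [0, 1]."""
    ta, tb = tokenize(a), tokenize(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def expresses_uncertainty(response, threshold=0.5):
    """Return True if any sentence of the response is similar enough to
    one of the reference phrases. The threshold is an arbitrary choice."""
    sentences = [s for s in re.split(r"[.!?]+", response) if s.strip()]
    return any(
        jaccard(sentence, ref) >= threshold
        for sentence in sentences
        for ref in UNCERTAIN_REFERENCES
    )


def f1_unanswerable(predicted_uncertain, is_unanswerable):
    """F1 score with unanswerable questions as the positive class."""
    pairs = list(zip(predicted_uncertain, is_unanswerable))
    tp = sum(1 for p, y in pairs if p and y)
    fp = sum(1 for p, y in pairs if p and not y)
    fn = sum(1 for p, y in pairs if not p and y)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

In this sketch, a model's responses would be passed through `expresses_uncertainty`, and the resulting flags compared against the answerable/unanswerable labels with `f1_unanswerable`. A practical system would likely replace the token-overlap measure with a sentence-embedding similarity.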