Large language models (LLMs) possess a wealth of knowledge that allows them to excel at a variety of natural language processing (NLP) tasks. Current research focuses on enhancing their performance within their existing knowledge. Despite this vast knowledge, LLMs remain limited by the amount of information they can accommodate and comprehend. The ability to recognize the limits of what they know, referred to as self-knowledge, is therefore of paramount importance. This study evaluates LLMs' self-knowledge by assessing their ability to identify unanswerable or unknowable questions. We introduce an automated methodology for detecting uncertainty in the responses of these models, providing a novel measure of their self-knowledge. We further introduce a unique dataset, SelfAware, consisting of unanswerable questions from five diverse categories and their answerable counterparts. Our extensive analysis of 20 LLMs, including GPT-3, InstructGPT, and LLaMA, reveals an intrinsic capacity for self-knowledge within these models. Moreover, we demonstrate that in-context learning and instruction tuning can further enhance this self-knowledge. Despite this promising insight, our findings also highlight a considerable gap between the capabilities of these models and human proficiency in recognizing the limits of their knowledge.
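The abstract does not spell out how uncertainty is detected or how self-knowledge is scored, so the following is only a minimal sketch of one possible reading: a response is flagged as uncertain when any of its sentences is sufficiently similar to a reference phrase expressing uncertainty, and the flags are then scored against gold labels for unanswerable questions. The reference phrases, the word-overlap similarity (a stand-in for a real sentence-similarity model), the threshold of 0.5, the choice of F1 as the score, and all function names are assumptions made for illustration, not details taken from the study.

```python
import re

# Hypothetical reference phrases that signal uncertainty (illustrative only).
UNCERTAIN_REFERENCES = [
    "the answer is unknown",
    "it is impossible to know",
    "there is no definitive answer",
    "i am not sure",
]


def _tokens(text):
    """Lowercase word set of a piece of text."""
    return set(re.findall(r"[a-z']+", text.lower()))


def similarity(a, b):
    """Jaccard overlap of word sets; a crude stand-in for a sentence-similarity model."""
    ta, tb = _tokens(a), _tokens(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def is_uncertain(response, threshold=0.5):
    """Flag a response if any sentence is close to any reference phrase.

    The threshold is an assumed value, not one reported in the abstract.
    """
    sentences = [s for s in re.split(r"[.!?]+", response) if s.strip()]
    return any(
        similarity(sentence, reference) >= threshold
        for sentence in sentences
        for reference in UNCERTAIN_REFERENCES
    )


def f1_unanswerable(predicted, gold):
    """F1 score treating 'unanswerable' (True) as the positive class."""
    tp = sum(1 for p, g in zip(predicted, gold) if p and g)
    fp = sum(1 for p, g in zip(predicted, gold) if p and not g)
    fn = sum(1 for p, g in zip(predicted, gold) if not p and g)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    responses = [
        "Nobody can say for certain. It is impossible to know.",
        "Paris is the capital of France.",
    ]
    gold = [True, False]  # True marks an unanswerable question
    predicted = [is_uncertain(r) for r in responses]
    print(predicted, f1_unanswerable(predicted, gold))
```

Under this sketch, a model with stronger self-knowledge would express uncertainty on unanswerable questions and answer the answerable ones directly, which raises the score; a practical version would replace the word-overlap similarity with a learned sentence encoder.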