The recent performance leap of Large Language Models (LLMs) opens up new opportunities across numerous industrial applications and domains. However, erroneous generations by LLMs, such as false predictions, misinformation, and hallucinations, have raised severe concerns about their trustworthiness, especially in safety-, security-, and reliability-sensitive scenarios, potentially hindering real-world adoption. While uncertainty estimation has shown its potential for interpreting the prediction risks of general machine learning (ML) models, little is known about whether and to what extent it can help explore an LLM's capabilities and counteract its undesired behavior. To bridge this gap, we initiate an exploratory study on the risk assessment of LLMs through the lens of uncertainty. In particular, we experiment with twelve uncertainty estimation methods and four LLMs on four prominent natural language processing (NLP) tasks to investigate to what extent uncertainty estimation techniques can help characterize the prediction risks of LLMs. Our findings validate the effectiveness of uncertainty estimation for revealing LLMs' uncertain or non-factual predictions. In addition to general NLP tasks, we conduct extensive experiments with four LLMs for code generation on two datasets. We find that uncertainty estimation can potentially uncover buggy programs generated by LLMs. Insights from our study shed light on the future design and development of reliable LLMs, facilitating further research toward enhancing the trustworthiness of LLMs.