Uncertainty quantification has become increasingly critical for large language models (LLMs), particularly in high-risk applications that require reliable outputs. However, traditional uncertainty quantification methods, such as probabilistic models and ensemble techniques, struggle with the complex, high-dimensional nature of LLM-generated outputs. This study proposes a novel geometric approach to uncertainty quantification based on convex hull analysis. The method leverages the spatial properties of response embeddings to measure the dispersion and variability of model outputs. Prompts are categorized into three types, i.e., `easy', `moderate', and `confusing', and multiple responses are generated using different LLMs at varying temperature settings. The responses are transformed into high-dimensional embeddings via a BERT model and then projected into a two-dimensional space using Principal Component Analysis (PCA). The Density-Based Spatial Clustering of Applications with Noise (DBSCAN) algorithm clusters the embeddings, and a convex hull is computed for each selected cluster. The experimental results indicate that model uncertainty depends on the prompt complexity, the model, and the temperature setting.
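The embed-project-cluster-hull pipeline described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: synthetic vectors stand in for BERT response embeddings, and the function name and DBSCAN parameters are illustrative choices.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cluster import DBSCAN
from scipy.spatial import ConvexHull

def hull_area_uncertainty(embeddings, eps=0.5, min_samples=3):
    """Return the convex hull area of each DBSCAN cluster of the
    2D PCA projection of the given response embeddings."""
    # Project high-dimensional embeddings into two dimensions
    pts = PCA(n_components=2).fit_transform(embeddings)
    # Cluster the projected points; label -1 marks noise
    labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(pts)
    areas = {}
    for lbl in set(labels) - {-1}:
        cluster = pts[labels == lbl]
        if len(cluster) >= 3:  # a hull needs at least 3 points
            # For 2D input, ConvexHull.volume is the enclosed area
            areas[lbl] = ConvexHull(cluster).volume
    return areas

# Synthetic stand-in for BERT embeddings: two tight groups of responses
rng = np.random.default_rng(0)
emb = np.vstack([rng.normal(0.0, 0.05, (20, 768)),
                 rng.normal(1.0, 0.05, (20, 768))])
areas = hull_area_uncertainty(emb, eps=0.5, min_samples=3)
```

Larger hull areas indicate more widely dispersed responses, which the paper interprets as higher model uncertainty.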